mirror of
https://github.com/GSA/notifications-api.git
synced 2026-01-02 05:41:57 -05:00
A recent issue with a long-running query (#2288) highlighted the fact that even though the original HTTP connection might be closed (for example after the gorouter timeout of 15 minutes, which returns a 504 response to the client), the request worker will not be stopped. This means that the worker spends time, and potentially DB resources, generating a response that will never be delivered.

Gunicorn's timeout setting only applies to sync workers, and there doesn't seem to be an option to interrupt individual requests in gevent/eventlet deployments.

Since the most likely (and potentially most dangerous) scenario for this is a long-running DB query, we can set a statement timeout on our DB connections. This will raise a sqlalchemy.exc.OperationalError (wrapping psycopg2.extensions.QueryCanceledError), interrupting the request once the timeout has been reached. The timeout is a Postgres setting (statement_timeout) that the client requests for each connection, so the database itself cancels the running statement, and with it the current transaction, when the limit is reached.

Since this will also apply to our celery tasks (including potentially long-running nightly tasks), we set a timeout of 20 minutes to begin with. This could be split in the future to set a different value for each app, so that we can limit API requests even further.
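
As a rough illustration of the statement timeout described above, the sketch below builds the keyword arguments that could be passed to SQLAlchemy's `create_engine` so that every new psycopg2 connection starts with `statement_timeout` set. This is a minimal sketch and not necessarily how this commit wires it in: the helper name `engine_kwargs_with_statement_timeout`, the `ENGINE_KWARGS` constant and the `DATABASE_URL` variable in the usage comment are all hypothetical.

```python
from datetime import timedelta


def engine_kwargs_with_statement_timeout(timeout: timedelta) -> dict:
    """Build create_engine keyword arguments (hypothetical helper, not from the repo)."""
    # Postgres reads a bare statement_timeout value as milliseconds.
    timeout_ms = int(timeout.total_seconds() * 1000)
    # libpq's "options" connection parameter forwards "-c name=value" settings
    # to the server when the session starts; psycopg2 passes it through unchanged.
    return {"connect_args": {"options": f"-c statement_timeout={timeout_ms}"}}


# The 20 minute starting value mentioned in the text above.
ENGINE_KWARGS = engine_kwargs_with_statement_timeout(timedelta(minutes=20))

# Usage, assuming SQLAlchemy and psycopg2 are installed and DATABASE_URL
# (hypothetical) points at a Postgres instance:
#
#   from sqlalchemy import create_engine, text
#   from sqlalchemy.exc import OperationalError
#
#   engine = create_engine(DATABASE_URL, **ENGINE_KWARGS)
#   try:
#       with engine.connect() as conn:
#           conn.execute(text("SELECT pg_sleep(1500)"))  # runs past the limit
#   except OperationalError:
#       pass  # the server cancelled the statement; handle or log it here
```

Because the setting is applied per connection, a single value covers both API requests and celery tasks that share this configuration; separate values per app would need separate engine configurations.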