notifications-api/scripts/paas_app_wrapper.sh
David McDonald a237162106 Reduce concurrency and prefetch count of reporting celery app
We have seen the reporting app run out of memory multiple times when
dealing with overnight tasks. The app runs 11 worker processes and we
reduce this to 2 worker processes to put less pressure on a single
instance.

The number 2 was chosen as most of the tasks processed by the reporting
app only take a few minutes and only one or two usually take more than
an hour. This would mean with 2 processes across our current 2
instances, a long running task should hopefully only wait behind a few
short running tasks before being picked up and therefore we shouldn't
see a large increase in overall time taken to run all our overnight
reporting tasks.

On top of reducing the concurrency for the reporting app, we also set
CELERYD_PREFETCH_MULTIPLIER=1. We do this as suggested by the celery
docs because this app deals with long running tasks.
https://docs.celeryproject.org/en/3.1/userguide/optimizing.html#optimizing-prefetch-limit

The change in prefetch multiplier should again optimise the overall time
it takes to process our tasks by ensuring that tasks are given to
instances that have (or will soon have) spare workers to deal with them,
rather than committing to putting all the tasks on certain workers in
advance.

Note, another optimisation suggested by the docs is to start setting
`ACKS_LATE` on the long running tasks.
This setting would effectively change us from prefetching 1 task per
worker to prefetching 0 tasks per worker and further optimise how we
distribute our tasks across instances. However, we decided not to try
this setting as we weren't sure whether it would conflict with our
visibility_timeout. We decided not to spend the time investigating but
it may be worth revisiting in the future, as long as tasks are
idempotent.

Overall, this commit takes us from potentially having all 18 of our
reporting tasks get fetched onto a single instance to now having a
process that will ensure tasks are distributed more fairly across
instances based on when they have available workers to process the
tasks.
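
The before and after figures above can be sketched with some rough arithmetic. This is a simplified model, not Celery's actual scheduling: it assumes each instance can hold one running task per worker process plus `concurrency * prefetch multiplier` reserved tasks, and that the old setup used Celery's default prefetch multiplier of 4 (the commit does not state the previous value). The function names are hypothetical.

```shell
#!/bin/bash
# Simplified model (assumption): with early acknowledgement, an instance holds
# one running task per worker process plus concurrency * multiplier reserved tasks.
held_per_instance() {
  local concurrency=$1 multiplier=$2
  echo $(( concurrency + concurrency * multiplier ))
}

# How many of the queued tasks the first instance to connect could take.
tasks_on_first_instance() {
  local queued=$1 concurrency=$2 multiplier=$3
  local limit
  limit=$(held_per_instance "$concurrency" "$multiplier")
  echo $(( queued < limit ? queued : limit ))
}

QUEUED=18  # number of reporting tasks, taken from the commit message

# Before: concurrency 11, multiplier 4 (Celery's default, assumed to apply).
before=$(tasks_on_first_instance "$QUEUED" 11 4)
# After: concurrency 2, multiplier 1.
after=$(tasks_on_first_instance "$QUEUED" 2 1)

echo "before: first instance can take $before of $QUEUED tasks"
echo "after: first instance can take $after of $QUEUED tasks"
```

Under these assumptions the first instance could take all 18 tasks before the change and at most 4 after it, so two instances hold at most 8 tasks between them and the remaining 10 wait on the queue until a worker frees up.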
2020-04-28 10:47:46 +01:00

64 lines
2.7 KiB
Bash
Executable File

#!/bin/bash

case $NOTIFY_APP_NAME in
  api)
    unset GUNICORN_CMD_ARGS
    exec scripts/run_app_paas.sh gunicorn -c /home/vcap/app/gunicorn_config.py application
    ;;
  delivery-worker-retry-tasks)
    exec scripts/run_app_paas.sh celery -A run_celery.notify_celery worker --loglevel=INFO --concurrency=11 \
    -Q retry-tasks 2> /dev/null
    ;;
  delivery-worker-letters)
    exec scripts/run_app_paas.sh celery -A run_celery.notify_celery worker --loglevel=INFO --concurrency=11 \
    -Q create-letters-pdf-tasks,letter-tasks 2> /dev/null
    ;;
  delivery-worker-jobs)
    exec scripts/run_app_paas.sh celery -A run_celery.notify_celery worker --loglevel=INFO --concurrency=11 \
    -Q database-tasks,job-tasks 2> /dev/null
    ;;
  delivery-worker-research)
    exec scripts/run_app_paas.sh celery -A run_celery.notify_celery worker --loglevel=INFO --concurrency=5 \
    -Q research-mode-tasks 2> /dev/null
    ;;
  delivery-worker-sender)
    exec scripts/run_multi_worker_app_paas.sh celery multi start 3 -c 10 -A run_celery.notify_celery --loglevel=INFO \
    -Q send-sms-tasks,send-email-tasks
    ;;
  delivery-worker-periodic)
    exec scripts/run_app_paas.sh celery -A run_celery.notify_celery worker --loglevel=INFO --concurrency=2 \
    -Q periodic-tasks 2> /dev/null
    ;;
  delivery-worker-reporting)
    exec scripts/run_app_paas.sh celery -A run_celery.notify_celery worker --loglevel=INFO --concurrency=2 -Ofair \
    -Q reporting-tasks 2> /dev/null
    ;;
  delivery-worker-priority)
    exec scripts/run_app_paas.sh celery -A run_celery.notify_celery worker --loglevel=INFO --concurrency=5 \
    -Q priority-tasks 2> /dev/null
    ;;
  # Only consume the notify-internal-tasks queue on this app so that Notify messages are processed as a priority
  delivery-worker-internal)
    exec scripts/run_app_paas.sh celery -A run_celery.notify_celery worker --loglevel=INFO --concurrency=11 \
    -Q notify-internal-tasks 2> /dev/null
    ;;
  delivery-worker-receipts)
    exec scripts/run_app_paas.sh celery -A run_celery.notify_celery worker --loglevel=INFO --concurrency=11 \
    -Q ses-callbacks,sms-callbacks 2> /dev/null
    ;;
  delivery-worker-service-callbacks)
    exec scripts/run_app_paas.sh celery -A run_celery.notify_celery worker --loglevel=INFO --concurrency=11 \
    -Q service-callbacks 2> /dev/null
    ;;
  delivery-worker-save-api-notifications)
    exec scripts/run_app_paas.sh celery -A run_celery.notify_celery worker --loglevel=INFO --concurrency=11 \
    -Q save-api-email-tasks 2> /dev/null
    ;;
  delivery-celery-beat)
    exec scripts/run_app_paas.sh celery -A run_celery.notify_celery beat --loglevel=INFO
    ;;
  *)
    echo "Unknown notify_app_name $NOTIFY_APP_NAME"
    exit 1
    ;;
esac