Previously, it was too loose: checking `"name" in str(exc)` returns
false positives.
By changing from three if statements to a loop we can cut down on
unnecessary code (and ensure that the returned objects are consistent),
and by using the full check constraint name we can be sure that we're
only capturing exactly the right errors. Additionally, don't return
the original data in the error message - it's obvious what the name is
because it'll be populated in the form you just filled in.
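As a rough sketch of the loop described above (the constraint names, field names and error wording here are hypothetical, not the ones in the real schema), the handler could map each full constraint name to a form field and return the same shape of error for all of them:

```python
from typing import Dict, List, Optional

# Hypothetical mapping of full constraint names to the form field they guard.
# The real constraint names live in the database schema.
CONSTRAINT_FIELDS = {
    "uq_widget_name": "name",
    "uq_widget_email_from": "email_from",
    "ck_widget_colour_format": "colour",
}


def errors_for_integrity_error(exc: Exception) -> Optional[Dict[str, List[str]]]:
    """Return a consistent error payload if exc mentions a known constraint."""
    message = str(exc)
    for constraint, field in CONSTRAINT_FIELDS.items():
        # Match on the full constraint name, not a loose substring like "name".
        if constraint in message:
            # The submitted value is deliberately left out of the message.
            return {field: ["Invalid or duplicate {}".format(field)]}
    # Not one of ours: let the caller re-raise.
    return None
```

An unrelated error that merely mentions a column called "name" no longer matches, because only the full constraint name is tested.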
However, until we can create a letter without a logo, we will still default to hm-government, because the dvla_organisation is set on the service.
This does simplify the code.
Also removed the inserts to letter_branding from the data migration file, so that we can deploy this before the rest of the work is finished. The inserts will still need to be added later.
We use exec to start awslogs_agent and then a tail to print logs to
stdout. The CF docs [1] recommend using exec to start processes, which
seems to imply that as long as there are commands running the container
will remain up and running.
This commit ensures that if there are no celery tasks running we will
kill any other processes that we have started, so that the container will
no longer be considered healthy by cloudfoundry and will be replaced.
1: https://docs.cloudfoundry.org/devguide/deploy-apps/manifest.html#start-commands
In 4427827b2f and celery monitoring was
changed from using PID files to actually looking at processes.
If celery workers get OOM killed (for instance), the container init
script would not restart them. This is because `get_celery_pids` would
find no processes containing the string celery, which would cause the
pipe to fail (-o pipefail). APP_PIDS would not get updated but the
script would continue to run, so the celery processes were never
restarted.
We think the correct behaviour when celery processes are killed (i.e.
there are no more celery processes running in a container) is to kill
the container. The PaaS should then schedule new ones which may
remediate the cause of the celery processes being killed.
Upon detection of no celery processes running, some diagnostic
information from the environment is sent to the logs, e.g.:
```
CF_INSTANCE_ADDR=10.0.32.4:61012
CF_INSTANCE_INTERNAL_IP=10.255.184.9
CF_INSTANCE_GUID=81c57dbc-e706-411e-6a5f-2013
CF_INSTANCE_PORT=61012
CF_INSTANCE_IP=10.0.32.4
```
Then the script (which is the container entrypoint) exits 1.
Co-author: @servingupaces @tlwr
Make a decorator that pings cronitor before and after each task run.
Designed for use with nightly tasks, so we have visibility if they
fail. We have a bunch of cronitor monitors set up, each identified by a
5-character key. The key goes into a URL that we make a GET request to,
with a self-explanatory URL path (run/fail/complete).
The cronitor URLs are defined in the credentials repo as a dictionary
of celery task names to URL slugs. If the name passed in to the
decorator isn't in that dict, it won't run.
To use it, all you need to do is call `@cronitor(my_task_name)`
instead of `@notify_celery.task`, and make sure that the task name and
the matching slug are included in the credentials repo (or locally,
JSON dumped and stored in the CRONITOR_KEYS environment variable).
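A minimal sketch of such a decorator is below. It is not the actual implementation. It assumes the base URL shown, assumes that CRONITOR_KEYS holds a JSON object of task name to slug, and assumes that an unknown task name skips only the pings while the task itself still runs. The `http_get` parameter and the `_http_get` helper are hypothetical, added so the pings can be swapped out in tests.

```python
import functools
import json
import os
import urllib.request

# Assumed base URL; the message above only says the key goes "into a URL".
CRONITOR_BASE_URL = "https://cronitor.link"


def _http_get(url):
    """Best-effort GET: a monitoring failure should not break the task."""
    try:
        urllib.request.urlopen(url, timeout=5).close()
    except Exception:
        pass


def cronitor(task_name, http_get=_http_get):
    """Ping the monitor for task_name before and after the wrapped task."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Assumed format: JSON object mapping task names to URL slugs.
            keys = json.loads(os.environ.get("CRONITOR_KEYS", "{}"))
            slug = keys.get(task_name)

            def ping(command):
                # Unknown task names are skipped silently (an assumption).
                if slug is not None:
                    http_get("{}/{}/{}".format(CRONITOR_BASE_URL, slug, command))

            ping("run")
            try:
                result = func(*args, **kwargs)
            except Exception:
                ping("fail")
                raise
            ping("complete")
            return result
        return wrapper
    return decorator
```

The "fail" ping is sent before the exception is re-raised, so the task's own error handling and retries are unaffected.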
This addresses some problems that existed in the previous approach:
1. There was a race condition that could occur between the time we were
looking for the existence of the .pid files and actually reading them.
2. If for some reason the .pid file was left behind after a process had
died, the script would never know because we do:
kill -s ${1} ${APP_PID} || true