make a decorator that pings cronitor before and after each task run.
Designed for use with nightly tasks, so we have visibility if they
fail. We have a bunch of cronitor monitors set up - 5 character keys
that go into a URL that we then make a GET to with a self-explanatory
url path (run/fail/complete).
the cronitor URLs are defined in the credentials repo as a dictionary
of celery task names to URL slugs. If the name passed in to the
decorator isn't in that dict, it won't run.
to use it, all you need to do is call `@cronitor(my_task_name)`
instead of `@notify_celery.task`, and make sure that the task name and
the matching slug are included in the credentials repo (or locally,
json dumped and stored in the CRONITOR_KEYS environment variable)
This addresses some problems that existed in the previous approach:
1. There was a race condition that could occur between the time we were
looking for the existence of the .pid files and actually reading them.
2. If for some reason the .pid file was left behind after a process had
died, the script would never know because we do:
kill -s ${1} ${APP_PID} || true
Currently, admin app requests service statistics (with notification
counts grouped by status) and template statistics (with counts by
template) in order to display the service dashboard.
Service statistics are gathered from FactNotificationStatus table
(counts for the last 7 days) combined with Notification (counts for
today).
Template statistics are currently gathered from redis cache, which
contains a separate counter per template per day. It's hard for us
to maintain consistency between redis and DB counts. Currently it
doesn't update the count for cancelled letters, counter resets in
the middle of the day might produce a wrong result for the rest of
the week and cleared redis cache can't be repopulated for services
with low data retention periods).
Since FactNotificationStatus already contains separate counts for
each template_id we can use the existing logic with some additional
filters to get separate counts for each template and status combination,
which would allow us to populate the service dashboard page from one
query response.
If a precompiled letter can't be opened (e.g. because it isn't a valid
PDF) we were setting its billable units to 0, but not moving it to the
invalid PDF bucket. If a precompiled letter failed sanitisation, we were
moving it to the invalid PDF bucket but not setting its billable units
to 0.
This commit makes sure that we always set the billable units to 0
and move the PDF to the right bucket if it fails sanitisation or can't be
opened.