This is so that the filtering, which we do on the admin side, is applied
before pagination, so that every page returned contains only valid,
displayable jobs. Unfortunately this means another config value has to be
duplicated on the server side, but it's not the end of the world.
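A minimal sketch of the ordering this enforces, assuming a SQLAlchemy-style Job model, a hypothetical DISPLAYABLE_STATUSES filter and the duplicated PAGE_SIZE config value:

```python
from flask import current_app

# Hypothetical names: Job, DISPLAYABLE_STATUSES and PAGE_SIZE stand in for
# whatever the real model/config uses.
def get_displayable_jobs(service_id, page=1):
    query = Job.query.filter(
        Job.service_id == service_id,
        Job.job_status.in_(DISPLAYABLE_STATUSES),  # the admin-side filter
    ).order_by(Job.created_at.desc())
    # paginate only AFTER filtering, so every page holds displayable jobs
    # and the page count is accurate
    return query.paginate(page=page, per_page=current_app.config['PAGE_SIZE'])
```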
This means normal celery workers WON'T process this queue; it requires a dedicated celery worker (see the sketch below).
- note: development and test configs add the queue back in, so DEV and TEST builds require no change.
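A minimal sketch of the config split, with an assumed queue name ("notify") and celery app path:

```python
from kombu import Queue

# Queues the default workers consume; "notify" is deliberately absent, so
# normal workers never pick it up. The dedicated worker is started against
# it explicitly, e.g.:  celery -A run_celery worker -Q notify
CELERY_QUEUES = [Queue('sms'), Queue('email')]

class Development:
    # dev and test configs add the queue back in, so those builds need no
    # dedicated worker
    CELERY_QUEUES = CELERY_QUEUES + [Queue('notify')]
```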
As the pull request stood, we would have stopped reading from the old queues, so whilst the new code writes to the new "notify" queue, the old code would still write to the old plethora of queues, and readers might not pick those messages up depending on the order of the deploy.
It is safer to leave readers consuming from all queues until after the deploy, then do a second deploy to tidy up the config.
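A sketch of the transitional reader config this implies (queue names illustrative):

```python
from kombu import Queue

CELERY_QUEUES = [
    Queue('sms'),         # old queues: not-yet-deployed writers still use these
    Queue('email'),
    Queue('bulk-email'),
    Queue('bulk-sms'),
    Queue('notify'),      # new queue: newly deployed writers use this
]
# Second deploy, once every node runs the new code: drop the old entries.
```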
- The two new queues that handle delivery of notifications (db-[type] / send-[type]) are no longer processed by the default workers.
- The default workers now handle the admin-type queues for notify (CSV uploads / validation codes) etc.
- Two new workers are deployed to the AWS environments, one focused on db- tasks and one on send- tasks.
- These pick up their queues explicitly, as sketched below.
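Illustrative worker invocations (the real deploy scripts will differ; `run_celery` is an assumed app path). The point is that each worker names its queues explicitly with -Q:

```
celery -A run_celery worker -Q db-sms,db-email      # persistence worker
celery -A run_celery worker -Q send-sms,send-email  # provider-call worker
```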
Previously there were 4 queues for sending messages.
This was based on the fact that each notification has 2 actions: persist in the database and send to the provider.
Two queues supported the CSV upload path, covering the first of these tasks:
- bulk-email
- bulk-sms
And there were two more queues for the tasks that make the 3rd-party client calls:
- sms
- email
API calls just used the latter two queues for both tasks.
Added four new queues
- db-email
- db-sms
- send-sms
- send-email
So an API call puts a notification onto the db-[type] queue first; that task then puts the notification onto the send-[type] queue.
Bulk queues stay as before.
This will allow us to target the processing of these tasks with separate workers and manage each stage differently, as in the sketch below.
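A rough sketch of the two-step flow with hypothetical task and helper names (email shown; sms is symmetrical):

```python
@notify_celery.task(name='persist-email')           # notify_celery: assumed app
def persist_email(notification):
    save_notification(notification)                 # hypothetical dao call
    # stage 1 done: hand off to the send queue rather than calling the
    # provider inline, so each stage can be scaled by its own worker
    deliver_email.apply_async((notification['id'],), queue='send-email')

@notify_celery.task(name='deliver-email')
def deliver_email(notification_id):
    send_via_provider(notification_id)              # hypothetical client call

# The API view kicks the chain off with:
#   persist_email.apply_async((notification,), queue='db-email')
```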
- runs at 1 minute past the hour, every hour
- looks up all scheduled jobs with a scheduled date in the past and adds them to the normal process-job queue
- these are then processed as normal (see the sketch below)
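A minimal sketch, assuming celery beat, with hypothetical task and dao names:

```python
from datetime import datetime
from celery.schedules import crontab

CELERYBEAT_SCHEDULE = {
    'run-scheduled-jobs': {
        'task': 'run-scheduled-jobs',
        'schedule': crontab(minute=1),  # 00:01, 01:01, 02:01, ...
    },
}

@notify_celery.task(name='run-scheduled-jobs')      # notify_celery: assumed app
def run_scheduled_jobs():
    # hypothetical dao call: every job whose scheduled time is now in the past
    for job in dao_get_jobs_scheduled_before(datetime.utcnow()):
        process_job.apply_async((str(job.id),), queue='process-job')
```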
* sorted list in README and environment_test.sh
* removed some unused vars
* cleaned up some names in the README to be more accurate
* removed twilio as a dependency
This allows us to prefix metrics with the environment, so stats from staging and live can go to the same statsd instance, and allows us to filter by environment in the dashboard.
Removed all existing statsd logging and replaced with:
- statsd decorator: infers the stat name from the decorated function, delegates the statsd call to the statsd client, and calls incr and timing for each decorated method (sketched below). This is applied to all tasks and all dao methods that touch the notifications/notification_history tables.
- statsd client changed to prefix all stats with "notification.api."
- Relies on https://github.com/alphagov/notifications-utils/pull/61 for request logging. Once integrated, we pass the statsd client to the logger, allowing us to record statsd metrics for all API calls. The logger puts the start time and the method to be called (NOT the url) onto the global flask object. We then construct statsd counters and timers in the following way:
notifications.api.POST.notifications.send_notification.200
This should allow us to aggregate to the level of
- API or ADMIN
- POST or GET etc
- modules
- methods
- status codes
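A rough sketch of the decorator (client wiring and names are assumptions; the incr/timing pair per call and the inferred stat name are as described):

```python
import functools
import time

def statsd(namespace):
    def decorator(func):
        stat = '{}.{}'.format(namespace, func.__name__)  # e.g. "dao.get_notification"

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            try:
                return func(*args, **kwargs)
            finally:
                elapsed_ms = (time.monotonic() - start) * 1000
                # statsd_client is the assumed app-level client, which
                # prefixes every stat with "notification.api."
                statsd_client.incr(stat)
                statsd_client.timing(stat, elapsed_ms)
        return wrapper
    return decorator
```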
Finally, we count the callbacks received from 3rd parties, tagged with the status they map to.
Retry intervals are 10 seconds, 1 minute, 5 minutes, 1 hour and 4 hours.
The total elapsed wait is at most 5 hours, 6 minutes and 10 seconds.
Changed the SQS visibility window to 4 hours 10 seconds, longer than the maximum retry interval.
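A minimal sketch of the retry ladder (the task and provider call are hypothetical); the five countdowns sum to 18,370s = 5h 6m 10s:

```python
RETRY_INTERVALS = [10, 60, 300, 3600, 14400]  # seconds

@notify_celery.task(bind=True, max_retries=len(RETRY_INTERVALS))  # assumed app
def deliver_sms(self, notification_id):
    try:
        call_sms_provider(notification_id)  # hypothetical provider client
    except Exception as exc:
        # countdown for this attempt; the SQS visibility window (4h 10s)
        # exceeds the longest countdown (4h), so a message waiting on a
        # retry can't reappear on the queue and be processed twice
        countdown = RETRY_INTERVALS[min(self.request.retries,
                                        len(RETRY_INTERVALS) - 1)]
        raise self.retry(exc=exc, countdown=countdown)
```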