it's almost entirely duplicated so share it across.
also clean up retrying - `task.retry(...)` itself raises a
`celery.exceptions.Retry` exception, so you do not need to `raise` its
return value. additionally, cleaned up the tests around that, since raising
`Exception` and asserting that `Exception` is raised is dangerous: it could
mask actual programming errors
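
A minimal sketch of the retry point above, assuming stand-in classes so it runs without Celery installed. `FakeTask`, `ProviderError` and `deliver` are hypothetical names for illustration, not the project's code:

```python
class Retry(Exception):
    """Stand-in for celery.exceptions.Retry, so the sketch runs without Celery."""


class FakeTask:
    """Hypothetical stand-in for a bound Celery task."""

    def retry(self, queue=None, exc=None):
        # Like Celery's Task.retry with its default throw=True, this raises
        # instead of returning, so callers need no `raise task.retry(...)`.
        raise Retry(f"retry on queue {queue!r}")


class ProviderError(Exception):
    """Hypothetical narrow error type for a failed provider call."""


def deliver(task, send):
    try:
        send()
    except ProviderError as e:
        # Catch only the expected failure; a bare `except Exception` would
        # also swallow programming errors such as a TypeError in our own code.
        task.retry(queue="retry", exc=e)


def failing_send():
    raise ProviderError("provider unavailable")


try:
    deliver(FakeTask(), failing_send)
    outcome = "returned"
except Retry:
    outcome = "retried"
```

A test written this way asserts on the specific `Retry` exception, so an unrelated error in the task body fails the test instead of satisfying it.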
it now switches utils.template.Template type, since the base Template
type now no longer has a subject attribute.
updated test case to use `sample_email_template_with_placeholders`
instead of `sample_email_template`
- note this is an unexpectedly big change.
- When we create a service we pass the service id to the persist method. This means that we don't have the service available to check whether it is in research mode.
- All calling methods (except the one where we use the notify service) have the service available. So rather than reload it, I changed the method signature to pass the service, not the ID, to persist.
- Touches a few places.
Note this means that the update or create methods will fall over on a null service. But this seems correct.
Goes back to the story which we need to play to make the service available as the API user so that the need to load and pass around services is minimised.
In this PR the id for the notification is passed in and used to create the notification, which causes an integrity error.
Normally when we get a SQLAlchemy error here we send the message to the retry queue, but if the notification already exists
we just ignore it.
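
A rough sketch of that decision, assuming stand-in exception classes so it runs without SQLAlchemy. `persist_or_retry` and its three callable arguments are hypothetical, named only for illustration:

```python
class SQLAlchemyError(Exception):
    """Stand-in for sqlalchemy.exc.SQLAlchemyError so the sketch runs alone."""


class IntegrityError(SQLAlchemyError):
    """Stand-in for sqlalchemy.exc.IntegrityError (e.g. a duplicate id)."""


def persist_or_retry(save, notification_exists, send_to_retry_queue):
    """Hypothetical helper; all three arguments are caller-supplied callables."""
    try:
        save()
    except SQLAlchemyError:
        if notification_exists():
            # The row is already there, so the failure was the duplicate insert.
            return "ignored"
        send_to_retry_queue()
        return "retrying"
    return "saved"


def duplicate_insert():
    raise IntegrityError("duplicate key")


retried = []
outcome_duplicate = persist_or_retry(
    duplicate_insert, lambda: True, lambda: retried.append(1)
)
outcome_other = persist_or_retry(
    duplicate_insert, lambda: False, lambda: retried.append(1)
)
```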
1) It's `incr` not `inc` on the redis client, so renamed the calls everywhere
2) Redis returns bytes/string rather than an int, even if the value stored is an int. Cast the result to an int before use. Note you can set up the GET to do this transparently, but I've not done this as we *may* use GETs for non-int values, and the callback sets up the cast for the connection, not the call.
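
A small sketch of the per-call cast, assuming an in-memory stand-in for the client so it runs without a Redis server. `FakeRedis` and `get_count` are hypothetical names:

```python
class FakeRedis:
    """Hypothetical in-memory stand-in that returns bytes from GET."""

    def __init__(self):
        self._store = {}

    def incr(self, key):
        # The method is `incr`, not `inc`.
        value = int(self._store.get(key, b"0")) + 1
        self._store[key] = str(value).encode()
        return value

    def get(self, key):
        return self._store.get(key)  # bytes, or None if the key is missing


def get_count(client, key):
    raw = client.get(key)
    # Cast per call rather than per connection, so other GETs can still
    # return non-integer values untouched.
    return int(raw) if raw is not None else None


client = FakeRedis()
client.incr("service-count")
client.incr("service-count")
```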
After we have written to the database and placed it on a deliver queue we count it in the cache against the service.
This is the equivalent of doing it at the end of the API call.
From a support ticket:
> it's possible to add a personalisation token with trailing whitespace
> (eg. "key " rather than "key"). Can this be trimmed in the UI to guard
> against this? (one of our devs copied and pasted it from a document
> and inadvertently included the space)
> Nothing major but caused a few hours of investigations!
Rather than trim the placeholder in the template, we should treat
placeholders in API calls the same way we do with CSV files, ie we
ignore case and spacing in the name of the placeholder. So
`(( First Name))` is equivalent to `((first_name))`, and both would be
populated with a dictionary like `{'firstName': 'Chris'}`.
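
As an illustration only, the matching could look like the sketch below. The exact normalisation rule lives in notifications-utils; the rule here (lower-case, drop spaces, underscores and hyphens) is an assumption, and `normalise_key` and `lookup` are hypothetical names:

```python
def normalise_key(name):
    # Assumed rule: lower-case and drop spaces, underscores and hyphens.
    return "".join(ch.lower() for ch in name if ch not in " _-")


def lookup(personalisation, placeholder):
    # Normalise both sides so that case and spacing do not matter.
    normalised = {normalise_key(k): v for k, v in personalisation.items()}
    return normalised.get(normalise_key(placeholder))


value = lookup({"firstName": "Chris"}, " First Name")
trailing = lookup({"key ": "x"}, "key")
```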
Depends on:
- [x] https://github.com/alphagov/notifications-utils/pull/77
help prevent issues where scheduled jobs are processed twice. note this is NOT
a watertight solution - it holds no locks, and there is no guarantee that the
status won't have updated between asserting that its status is 'pending' and
updating it to be 'in progress'
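
A sketch of the check described above, using a plain dict as a hypothetical stand-in for the job row. It deliberately shows the gap: nothing stops another worker from changing the status between the check and the assignment.

```python
def start_job_if_pending(job):
    """Move a job to 'in progress' only if it is still 'pending'.

    Not atomic: no lock is held between the check and the update.
    """
    if job["job_status"] != "pending":
        return False
    job["job_status"] = "in progress"
    return True


job = {"job_status": "pending"}
first_attempt = start_job_if_pending(job)
second_attempt = start_job_if_pending(job)
```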
- It seems that when we changed the name of the job.status column that we didn't update the code to use job.job_status.
- Therefore none of the jobs since then have had the job status updated.
- Now that this is fixed we can show the job status when there is an error like "sending exceeds limits"
- This could happen if a job is scheduled to run at the top of the hour, so at the time of the job creation the limit was not exceeded, but at the time of processing the job the limit is exceeded.
Refactored the send_notifications method so that it is more readable.
Refactored test_send_notifications so that it uses a parametrized test to avoid duplication.
Previously there were 4 queues for sending messages
This was based on the fact that each notification has 2 actions - persist in the database and send to provider.
Two queues supported the CSV upload - for the first of these tasks
- bulk-email
- bulk-sms
And there were two more queues for the tasks that make the 3rd party client calls.
- sms
- email
API Calls just used the latter two queues for both tasks
Added four new queues
- db-email
- db-sms
- send-sms
- send-email
So an API call puts a notification into the db-[type] queue first, which then puts the notification into the send-[type] queue
Bulk queues stay as before.
This will allow us to target processing of these tasks with separate workers to manage these differently.
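
The API path through the new queues can be sketched as below. The queue names come from the list above; `queue_name` and `api_queues` are hypothetical helpers, not the project's code:

```python
STAGES = ("db", "send")
NOTIFICATION_TYPES = ("sms", "email")


def queue_name(stage, notification_type):
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    if notification_type not in NOTIFICATION_TYPES:
        raise ValueError(f"unknown notification type: {notification_type!r}")
    return f"{stage}-{notification_type}"


def api_queues(notification_type):
    """Queues an API-created notification passes through, in order."""
    return [queue_name(stage, notification_type) for stage in STAGES]
```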
- As before this is now driven from the notifications history table
- Removed from updates and create
- Signature changes to remove unused params hit many files
- Also potential issue around rate limiting - we used to get the number sent per day from the stats table - which was a single row lookup, now we have to count this. This applies to EVERY API CALL. Probably not a good thing and should be addressed urgently.
- "RETRY" prefixes the messages
In the event of the retry attempts being exhausted without the task completing successfully, identify the message as such
- "RETRY FAILED" prefixes the messages
Applies to the send_sms|send_email and send_sms_to_provider|send_email_to_provider tasks
These are there to try and ensure we can alert on these events so that we know if we have started retrying messages
Retry messages also contain notification ids to aid debugging.
Removed all existing statsd logging and replaced with:
- statsd decorator. Infers the stat name from the decorated function call. Delegates statsd call to statsd client. Calls incr and timing for each decorated method. This is applied to all tasks and all dao methods that touch the notifications/notification_history tables
- statsd client changed to prefix all stats with "notification.api."
- Relies on https://github.com/alphagov/notifications-utils/pull/61 for request logging. Once integrated we pass the statsd client to the logger, allowing us to statsd all API calls. This passes in the start time and the method to be called (NOT the url) onto the global flask object. We then construct statsd counters and timers in the following way
notifications.api.POST.notifications.send_notification.200
This should allow us to aggregate to the level of
- API or ADMIN
- POST or GET etc
- modules
- methods
- status codes
Finally we count the callbacks received from 3rd parties, by mapped status.
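
A minimal sketch of the decorator idea, assuming a recording stand-in for the statsd client so it runs without a statsd server. `FakeStatsdClient`, the `namespace` argument and `get_notification` are hypothetical; only the `incr`/`timing` calls and the prefix string are taken from the notes above:

```python
import functools
import time


class FakeStatsdClient:
    """Hypothetical client that records calls instead of sending packets."""

    def __init__(self, prefix):
        self.prefix = prefix
        self.calls = []

    def incr(self, stat):
        self.calls.append(("incr", self.prefix + stat))

    def timing(self, stat, elapsed_ms):
        self.calls.append(("timing", self.prefix + stat))


def statsd(client, namespace):
    def decorator(func):
        # Infer the stat name from the decorated function.
        stat = f"{namespace}.{func.__name__}"

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            result = func(*args, **kwargs)
            # Delegate both the counter and the timer to the client.
            client.incr(stat)
            client.timing(stat, (time.monotonic() - start) * 1000)
            return result

        return wrapper

    return decorator


client = FakeStatsdClient(prefix="notification.api.")


@statsd(client, namespace="dao")
def get_notification(notification_id):
    return {"id": notification_id}


result = get_notification(1)
```

In this sketch the stats are only emitted when the wrapped call returns; whether failures should also be counted is a separate design choice.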