Same as we’re doing for templates.
This means avoiding a database call, even for services that don’t hit
our API so often.
They’ll still need to go to the database for the API keys, because we’re
not comfortable putting the API key secrets in Redis.
But once a service has got its keys from the database we commit the
transaction, so the connection can be freed up until we need it again to
insert the notification.
Same as we’ve done for templates.
For high volume services this should mean avoiding calls to external
services, either the database or Redis.
TTL is set to 2 seconds, so that’s the maximum time it will take for
revoking an API key or renaming a service to propagate.
Some of the tests created services with the same service ID. This
caused intermittent failures because the cache relies on unique service
IDs (like we have in the real world) to key itself.
We think that holding open database transactions while we go and do
something else is causing us to have poor performance.
Because we’re not serialising everything as soon as we pull it out of
the database we can guarantee that we don’t need to go back to the
database again.
So let’s see if explicitly closing the transaction helps with
performance.
By serialising these straight away we can:
- not go back to the database later, potentially closing the connection
sooner
- potentially cache the serialised data, meaning we don’t touch the
database at all
For some reason our V1 get template response wraps the whole template in
a dictionary with one key, `'data'`:
0d99033889/app/template/rest.py (L166)
That means when the admin app caches the response it also caches it in
this format.
The API needs to do the same, otherwise it will be cacheing data with a
schema that the admin app isn’t expecting, and vice-versa.
We need to serialise the template to JSON to store it in Redis. Python’s
built in JSON serialiser doesn’t know what to do with a UUID object, so
we need to manually cast it to a string instead.
It’s a UUID column, but by default Marshmallow wants to select the id
from the users table, not from the templates table, because the two
are foreign-keyed.
Adding the property explicity like this forces it to select from the
`created_by_id` column, but still serialises it to the `created_by`
field to avoid any breaking change.
Content and subject are user-submitted so are effectively unbounded in
size. And we’re serialising them for every template when sending the
list of templates to the admin app.
For the service with the most templates this results in a 1.3Mb blob of
JSON going over the wire, and being cached in Redis.
And then the admin app completely ignores these fields, because it does
show template content until you’ve clicked into a single template.
This commit adds a new query parameter, `detailed`, that the admin app
can set to `False`. When it does only the fields needed to render the
`/templates` page are returned.
This is done with a new parameter so as not to break the V1 API.
Although I looked in Kibana and it doesn’t seem like anyone external is
using this endpoint we’ve come this far without breaking the API so…
This commit changes the code in post notification endpoint to handle a
serialised template (ie a `dict`) rather than a database object.
This is the first step towards being able to cache the template and not
hit the database on every request.
There should be no functional changes here, it’s just refactoring.
There are some changes to the tests where the signature of functions
has changed.
Importing of the template schema has to be done at a function level,
otherwise Marshmallow gets weird.
This commit also copies the `JSONModel` class from the admin app, which
turns serialised data (a dict made from JSON) into an object on which
certain predefined properties are allowed.
This means we can still do the caching of serialised data, without
having to change too much of the code in the app, or make it ugly by
sprinkling dict lookups everywhere.
We’re not copying all of JSONModel from the admin app, just the bits we
need. We don’t need to compare or hash these objects, they’re just used
for lookups. And redefining `__getattribute__` scares Leo.
We were checking this separately in two places in the code. Now
we will have this logic in one place, in validators.
Also pull in utils version that recognises crown depenency numbers
as international.
The constaints on notifications and notification_history have already
been dropped in production, but still exist in staging and in dev
environments.
The constraints on templates and templates_history exist in all
environments.
We did not have a JSON schema for updating a template. Since we will
remove the postage constraint from the templates table, this adds a JSON
schema for updating a template so that we can use it to check that the
postage is one of the allowed values.
see https://github.com/alphagov/notifications-utils/pull/752 for
implementation details. Cache DNS for 15 seconds. Note: This cache is
per eventlet, so concurrent requests will handle their requests
separately, but the cache does persist between sequential requests.
re-enable statsd for all environments.
In the previous PR I removed the `update_notification` method to reduce the need for another update query. However, that meant the notification was marked as delivered without an updated_at timestamp.
It is weird to set the updated_at when we create the notification. So is this a better fix? Or do I put the update back now?
I recommend we push this fix now.
We have seen bad latency across all apps in the last week. After running
a canary with statsd turned off we have seen these issues dissapear on
that canary instance. We therefore will turn off statsd for all
instances of the API. We will still need to investigate tomorrow what
exactly changed or is causing the issue with statsd as we hadn't made
any changes to it ourself.
Note, this keeps statsd on for all the other apps but this will be a
first step. We will also need to check performance of the other apps
after releasing this.