This changeset pulls in all of the notification_utils code directly into the API and removes it as an external dependency. We are doing this to cut down on operational maintenance of the project and will begin removing parts of it no longer needed for the API.
Signed-off-by: Carlo Costino <carlo.costino@gsa.gov>
This deletes a big ol' chunk of code related to letters. It's not everything (there are still a few things that might be tied to SMS/email), but it's the heart of the letters functionality. SMS and email functionality should be untouched by this.
Areas affected:
- Things obviously about letters
- PDF tasks, used for precompiling letters
- Virus scanning, used for those PDFs
- FTP, used to send letters to the printer
- Postage stuff
Just looks a bit tidier and less repetitive.
I’ve only done this for the serialised service because:
- we’re only checking this in places where we’re already using the
serialised service
- if we want to check this elsewhere, there's a good chance that the new code should be using the serialised service anyway, since that code will itself be doing some kind of performance optimisation
Add caching by using the SerialisedTemplate and SerialisedService objects
Removed the extra call to the database to fetch the notification after the commit by saving created_at and key_type to local variables. After the notification is updated to mark it as sending, the db.session is committed. Any reference to the Notification data model after that will require a query to fetch the object again, because SQLAlchemy expires the object on commit and treats it as out of date.
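Roughly, the shape of that change (a sketch only; names here are illustrative rather than the actual task code):

```python
from app import db  # assumed Flask-SQLAlchemy instance, as elsewhere in the app


def mark_as_sending(notification):
    # Capture what we need *before* the commit, so we never read from the
    # (expired) Notification instance afterwards and trigger a fresh query.
    created_at = notification.created_at
    key_type = notification.key_type

    notification.status = "sending"
    db.session.commit()

    # use the local variables, not notification.created_at / .key_type
    return created_at, key_type
```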
Added name, sms_prefix and email branding to SerialisedService.
Refactored get_html_options to work with the SerialisedService object.
Removed the need to validate and format the `to` field by using `normalised_to` instead, since `normalised_to` was already validated and formatted when the notification was persisted.
Removed the validation and formatting of `reply_to_text` for email reply-to addresses; this was already done when the email address was added via the frontend, so there's no need to validate the address every time a service sends an email.
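As a sketch of what those last two points mean in practice (hypothetical helper; the real code differs):

```python
def recipient_and_reply_to(notification):
    # `normalised_to` was already validated and formatted when the
    # notification was persisted, so use it directly...
    recipient = notification.normalised_to

    # ...and `reply_to_text` was validated when the address was added via
    # the frontend, so it isn't re-checked on every send.
    return recipient, notification.reply_to_text
```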
These tasks need to repeatedly get the same template and service from
the database. We should be able to improve their performance by getting
the template and service from the cache instead, like we do in the REST
endpoint code.
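Something like this inside the tasks, assuming the cached model constructors match the ones the REST endpoints use (names from memory, may differ slightly):

```python
from app.serialised_models import SerialisedService, SerialisedTemplate


def get_service_and_template(service_id, template_id):
    # Cache-backed lookups instead of dao_fetch_service_by_id /
    # dao_get_template_by_id_and_service_id hitting the database each time.
    service = SerialisedService.from_id(service_id)
    template = SerialisedTemplate.from_id_and_service_id(template_id, service_id)
    return service, template
```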
ALLOWED_PROPERTIES is a list containing the fields that SerialisedModel will attempt to deserialize into the returned object. If a field isn't present on the underlying data source (whether that's a dump from a marshmallow schema, a blob retrieved from redis, or anything else), then SerialisedModel will raise an exception. However, SerialisedModel isn't involved when setting the cache, so the correct flow for adding a column to a cached database model is as follows (see the sketch after the PR steps below):
PR #1
* add new column to DB
* add new column to model, and to template_schema (because that schema is used to create the dict that goes into redis). New redis keys start getting populated.
Deploy that through, and then clear redis
PR #2
* add new column to SerialisedTemplate.ALLOWED_PROPERTIES. This means that it'll start reading that value from redis
* now instances of the app will always have the new field in their
template objects, whether they came from database directly or from
the cache
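A minimal sketch of that mechanism (simplified; the real class has more to it):

```python
from abc import ABC


class SerialisedModel(ABC):
    ALLOWED_PROPERTIES = set()

    def __init__(self, fetched_attributes):
        # Every allowed property must be present in the source dict; a field
        # missing from the cached blob raises a KeyError right here.
        for property_name in self.ALLOWED_PROPERTIES:
            setattr(self, property_name, fetched_attributes[property_name])


class SerialisedTemplate(SerialisedModel):
    ALLOWED_PROPERTIES = {"id", "content", "subject", "template_type", "version"}
```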
This commit removes the field from ALLOWED_PROPERTIES. After it's
deployed we'll be able to clear redis, observe redis being populated
with the new field, and then we'll be able to re-add it to
ALLOWED_PROPERTIES, ready to use.
Had to go through the code and change a few places where we filter on template types. I specifically didn't worry about jobs or notifications.
Also, add `broadcast_data`, a JSON column that might contain arbitrary broadcast data that we'll figure out as we go. We don't know what it'll look like yet, but it should be returned by the API.
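As a rough sketch of the column itself, assuming the usual Flask-SQLAlchemy setup (the model name and other columns here are hypothetical):

```python
from sqlalchemy.dialects.postgresql import JSONB

from app import db  # assumed Flask-SQLAlchemy instance


class BroadcastMessage(db.Model):  # hypothetical model; the real table may differ
    __tablename__ = "broadcast_message"

    id = db.Column(db.Integer, primary_key=True)
    # Free-form JSON blob; the exact shape is still to be decided, but it
    # gets returned by the API as-is.
    broadcast_data = db.Column(JSONB(none_as_null=True), nullable=True)
```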
The API needs these to check whether a service can send a notification.
This commit also updates all the tests in `test_validators.py` to take
a serialised service, not a database object.
We changed auth.py to import from app.serialised_models here:
https://github.com/alphagov/notifications-api/pull/2887/files#diff-77cbb1e03185c7319f0311371c438b0cR11
`serialised_models.py` imports from `templates_dao.py`
`templates_dao.py` imports from `users_dao.py`
`users_dao.py` imports from `errors.py`
`errors.py` imports from `auth.py` … and the circle is complete 💥
For some reason this caused the Celery workers to crash on startup, but
not the app. Which I guess is why the integration tests didn’t catch
this?
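For illustration only (not necessarily the actual fix here), one standard way to break a cycle like that is to defer the import to call time:

```python
# errors.py (module and names are illustrative)
def register_errors(blueprint):
    # Deferring the import to call time, rather than module load time, is one
    # common way to stop errors.py from completing the import cycle.
    from app.authentication.auth import AuthError

    @blueprint.errorhandler(AuthError)
    def auth_error(error):
        return {"result": "error", "message": error.message}, error.code
```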
Same as we’re doing for templates.
This means avoiding a database call, even for services that don’t hit
our API so often.
They’ll still need to go to the database for the API keys, because we’re
not comfortable putting the API key secrets in Redis.
But once a service has got its keys from the database we commit the
transaction, so the connection can be freed up until we need it again to
insert the notification.
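The flow looks roughly like this (helper names are hypothetical; the point is the commit straight after reading the keys):

```python
from app import db  # assumed Flask-SQLAlchemy instance
from app.serialised_models import SerialisedService


def load_service_and_keys(service_id):
    service = SerialisedService.from_id(service_id)  # Redis-backed, no DB round-trip
    api_keys = get_api_keys_for_service(service_id)  # hypothetical DAO helper; key secrets stay in Postgres

    # Commit straight away so the connection goes back to the pool; we only
    # need the database again when we insert the notification.
    db.session.commit()

    return service, api_keys
```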
Same as we’ve done for templates.
For high volume services this should mean avoiding calls to external
services, either the database or Redis.
TTL is set to 2 seconds, so that’s the maximum time it will take for
revoking an API key or renaming a service to propagate.
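A minimal sketch of that in-process cache, assuming something like cachetools is in play (the fetch helper is hypothetical):

```python
from cachetools import TTLCache, cached

# Entries live for at most 2 seconds, so revoking an API key or renaming a
# service takes at most that long to propagate to every running instance.
local_cache = TTLCache(maxsize=1024, ttl=2)


@cached(cache=local_cache)
def get_service_dict(service_id):
    # hypothetical fetch: Redis first, falling back to the database
    return fetch_service_from_redis_or_db(service_id)
```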
Some of the tests created services with the same service ID. This
caused intermittent failures because the cache relies on unique service
IDs (like we have in the real world) to key itself.
We think that holding open database transactions while we go and do
something else is causing us to have poor performance.
Because we're now serialising everything as soon as we pull it out of the database, we can guarantee that we don't need to go back to the database again.
So let’s see if explicitly closing the transaction helps with
performance.
By serialising these straight away we can:
- not go back to the database later, potentially closing the connection
sooner
- potentially cache the serialised data, meaning we don’t touch the
database at all
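A sketch of the intended shape (function and schema names are illustrative): serialise while the row is still attached to the session, then commit so the connection can go back to the pool.

```python
from app import db  # assumed Flask-SQLAlchemy instance
from app.dao.templates_dao import dao_get_template_by_id_and_service_id
from app.schemas import template_schema


def get_serialised_template(service_id, template_id):
    template = dao_get_template_by_id_and_service_id(template_id, service_id)

    # Serialise immediately, while the object is still attached to the
    # session, so nothing needs to lazy-load from the database later on.
    serialised = template_schema.dump(template)

    # Explicitly end the transaction; from here on the serialised dict is all
    # we need (and it's also what could go into the cache).
    db.session.commit()

    return serialised
```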
For some reason our V1 get template response wraps the whole template in
a dictionary with one key, `'data'`:
0d99033889/app/template/rest.py (L166)
That means when the admin app caches the response it also caches it in
this format.
The API needs to do the same, otherwise it will be caching data with a
schema that the admin app isn’t expecting, and vice-versa.
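A sketch of what that means when the API writes to the cache (the key format and TTL are assumptions; the point is the `{'data': ...}` wrapper):

```python
import json


def cache_template_response(redis_store, service_id, template_id, version, template_dict, ttl_seconds):
    # Assumed cache key format; the important part is the {'data': ...}
    # wrapper, which matches the shape of the V1 GET template response.
    cache_key = f"service-{service_id}-template-{template_id}-version-{version}"
    redis_store.set(cache_key, json.dumps({"data": template_dict}), ex=ttl_seconds)
```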