We store our audit history in two ways:
1. A list of versions of a service
2. A list of events to do with API keys
In the future there could be auditing data which we want to display that
is stored in other formats (for example the event table).
This commit adds some objects which wrap around the different types of
auditing data, and expose a consistent interface to them. This
architecture will let us:
- write clean code in the presentation layer to display these events on
a page
- add more types of events in the future by subclassing the `Event` data
type, without having to rewrite anything in the presentation layer
If we change our mind and decide whether a service should/should not be
counted in the list of live services then we should also drop the cache
which stores the count of how many live services there are.
Returns the data calculated by the API. Stored in Redis against a
hardcoded key so that no-one hammering the home page is directly hitting
the database.
Adds a front end for:
https://github.com/alphagov/notifications-api/pull/2417
> Sometimes we have to make a few services for what really is one
> service, for example GOV.UK Pay and GOV.UK Pay Direct Debit. We also
> have our own test services which aren’t included in the count of live
> services. We currently count these as one service by not including
> them in the beta partners spreadsheet.
We get a bunch of requests to go live where people have told us they're
going to send email but there is no email reply-to address present.
These come from 2 scenarios:
1. when there are email templates, and no reply to address – but they
ignore the checklist
2. when there are no email templates (yet) but they provide anticipated
volumes for email
At the moment we only auto-check for a reply to address when they have
email templates. And because the question about anticipated volumes
follows the checklist, you'll get a checklist that passes (reply
addresses not required as no templates present) - but your future intent
that differs (reply address IS required because you have anticipated
volumes).
So let’s bring the request for anticipated volumes into the checklist,
that way we can dynamically add the requirement for a reply to address
if they say they will send email but don't have templates yet.
We should begin storing it in the database against the service to stop
people having to re-enter it each time they try to complete the go live
screens.
This also means moving the ‘consent to research question’ along with
the questions about volume, because
- we want people to answer both before going live
- we don’t want to clutter up the summary page by asking questions there
too
new code is copied stylistically from the email branding patterns.
Instead of `service.dvla_organisation`, there's now
`service.letter_branding` and `service.letter_branding_id`. However,
unlike email branding we're not currently showing a preview of the
logo. That can come later when we work out how we want to do it.
This removes some code which is duplicative and obscure (ie it’s not
very clear why we do `"a" * 73` even though there is a Very Good Reason
for doing so).
Adds caching for service data retention. This removes separate API
client methods to retrieve individual data retention records by id
or type in favor of a single method that fetches and caches all
retention settings configured for the service. This makes it much
easier to invalidate cache when settings change.
Lookup by id or type is provided by helper methods in the service
model.
The add new templates page now has option to add template folders.
Tweaked wording of other options and h1 to clarify options since it's
not all about templates any more.
Added api client and stuff for it
Added a new row to the settings table, 'Post class', which shows the
default letter class of a service and is only visible to Platform Admin.
Also added a new page to enable Platform Admin users to change the
default letter class for a service - this only has two options at the
moment, 1st class only and 2nd class only.
Allows getting notification counts for a given number of days to
support services with custom data retention periods (admin
dashboard page should still display counts for the last 7 days,
while the notifications page displays all stored notifications).
We had kept the original platform-admin page at `/platform-admin` and
created a new page, `/platform-admin-new` for the new platform admin
page. Now that the numbers on both pages look ok we no longer need both
pages, so can replace the original page.
We’ve had a user who’s said:
> Seems configured callbacks cannot be removed once they’re set as the
> fields have a presence check. Is that intentional?
This means it’s not working as they expect. Rather than have to go and
change stuff in the database for them, let’s make it work as they’d
expect.
Only lets you clear the form if you remove both the token and the URL.
Added a page which lets users with the 'manage_service' permission change the
contact link for their service. There are no links to this page yet
since only services using document download will need to set a contact
link.
we're not actually looking at the detailed service aspects - just
the stats. We're doing this in three places:
* dashboard
* notification activity page
* when checking jobs to see if we're over the daily limit
change these places to use a new api endpoint (service/id/statistics),
which hopefully be a little more performant, and will definitely be a
little more organised - moving away from generic endpoints with loads
of optional parameters.
We still need the detailed endpoints for the platform admin page tho.
Depends on https://github.com/alphagov/notifications-api/pull/1865
For both SMS senders and email reply to addresses this commit adds:
- a delete link
- a confirmation loop
It doesn’t let users delete:
- default SMS senders or reply to addresses (they always have to have
one)
- inbound numbers
It assumes that the API will allow updating of an attribute named
`active` on the respective database rows. It could work in a different
way. We can’t do complete deletion though because these will still be
keyed to notifications.
This is easier to read than having to understand the arguments 1…n of
the cache decorator are ‘magic’, and gives us more flexibility about
how the cache keys are formatted, eg being able to add words in the
middle of them.
Also changes the key format for all templates to be
`service-{service_id}-templates` instead of `templates-{service_id}`
because then it’s clearer what the ID represents.
A lot of the frequently-used pages in the admin app rely on the API to
get templates.
So this commit adds three new caches:
- a single template version (including a key without a version number,
which is the current version)
- all the templates for a service
- all versions of a template
The first will be the most crucial for performance, but there’s not much
cost to adding the other two.
`@cache.delete('user', 'user_id')` is easier to read and understand than
`@cache.delete('user', key_from_args=[1])`. This will become even more
apparent if we have to start doing stuff like `key_from_args=[1, 5]`,
which is a lot more opaque than just saying
`'service_id', 'template_id'`.
It does make the implementation a bit more complex, but I’m not too
worried about that because:
- the tests are solid
- it’s nicely encapsulated
This `*params` argument seems to be copy/pasted boilerplate. It’s not
used by any consumers of this client, and makes it harder to write a
decorator for this function.
In the same way, and for the same reasons that we’re caching the service
object.
Here’s a sample of the data returned by the API – so we should make sure
that any changes to this data invalidate the cache.
If we ever change a user’s phone number (for example) directly in the
database, then we will need to invalidate this cache manually.
```python
{
'data':{
'organisations':[
'4c707b81-4c6d-4d33-9376-17f0de6e0405'
],
'logged_in_at':'2018-04-10T11:41:03.781990Z',
'id':'2c45486e-177e-40b8-997d-5f4f81a461ca',
'email_address':'test@example.gov.uk',
'platform_admin':False,
'password_changed_at':'2018-01-01 10:10:10.100000',
'permissions':{
'42a9d4f2-1444-4e22-9133-52d9e406213f':[
'manage_api_keys',
'send_letters',
'manage_users',
'manage_templates',
'view_activity',
'send_texts',
'send_emails',
'manage_settings'
],
'a928eef8-0f25-41ca-b480-0447f29b2c20':[
'manage_users',
'manage_templates',
'manage_settings',
'send_texts',
'send_emails',
'send_letters',
'manage_api_keys',
'view_activity'
],
},
'state':'active',
'mobile_number':'07700900123',
'failed_login_count':0,
'name':'Example',
'services':[
'6078a8c0-52f5-4c4f-b724-d7d1ff2d3884',
'6afe3c1c-7fda-4d8d-aa8d-769c4bdf7803',
],
'current_session_id':'fea2ade1-db0a-4c90-93e7-c64a877ce83e',
'auth_type':'sms_auth'
}
}
```
Most of the time spent by the admin app to generate a page is spent
waiting for the API. This is slow for three reasons:
1. Talking to the API means going out to the internet, then through
nginx, the Flask app, SQLAlchemy, down to the database, and then
serialising the result to JSON and making it into a HTTP response
2. Each call to the API is synchronous, therefore if a page needs 3 API
calls to render then the second API call won’t be made until the
first has finished, and the third won’t start until the second has
finished
3. Every request for a service page in the admin app makes a minimum
of two requests to the API (`GET /service/…` and `GET /user/…`)
Hitting the database will always be the slowest part of an app like
Notify. But this slowness is exacerbated by 2. and 3. Conversely every
speedup made to 1. is multiplied by 2. and 3.
So this pull request aims to make 1. a _lot_ faster by taking nginx,
Flask, SQLAlchemy and the database out of the equation. It replaces them
with Redis, which as an in-memory key/value store is a lot faster than
Postgres. There is still the overhead of going across the network to
talk to Redis, but the net improvement is vast.
This commit only caches the `GET /service` response, but is written in
such a way that we can easily expand to caching other responses down the
line.
The tradeoff here is that our code is more complex, and we risk
introducing edge cases where a cache becomes stale. The mitigations
against this are:
- invalidating all caches after 24h so a stale cache doesn’t remain
around indefinitely
- being careful when we add new stuff to the service response
---
Some indicative numbers, based on:
- `GET http://localhost:6012/services/<service_id>/template/<template_id>`
- with the admin app running locally
- talking to Redis running locally
- also talking to the API running locally, itself talking to a local
Postgres instance
- times measured with Chrome web inspector, average of 10 requests
╲ | No cache | Cache service | Cache service and user | Cache service, user and template
-- | -- | -- | -- | --
**Request time** | 136ms | 97ms | 73ms | 37ms
**Improvement** | 0% | 41% | 88% | 265%
---
Estimates of how much storage this requires:
- Services: 1,942 on production × 2kb = 4Mb
- Users: 4,534 on production × 2kb = 9Mb
- Templates: 7,079 on production × 4kb = 28Mb