Accepting an invite changes:
- the `user_to_service` list of users returned by `GET /service/<id>`
- the `services` list return by `GET /user/<id>`
The latter change is causing the functional tests to fail.
In the same way, and for the same reasons that we’re caching the service
object.
Here’s a sample of the data returned by the API – so we should make sure
that any changes to this data invalidate the cache.
If we ever change a user’s phone number (for example) directly in the
database, then we will need to invalidate this cache manually.
```python
{
'data':{
'organisations':[
'4c707b81-4c6d-4d33-9376-17f0de6e0405'
],
'logged_in_at':'2018-04-10T11:41:03.781990Z',
'id':'2c45486e-177e-40b8-997d-5f4f81a461ca',
'email_address':'test@example.gov.uk',
'platform_admin':False,
'password_changed_at':'2018-01-01 10:10:10.100000',
'permissions':{
'42a9d4f2-1444-4e22-9133-52d9e406213f':[
'manage_api_keys',
'send_letters',
'manage_users',
'manage_templates',
'view_activity',
'send_texts',
'send_emails',
'manage_settings'
],
'a928eef8-0f25-41ca-b480-0447f29b2c20':[
'manage_users',
'manage_templates',
'manage_settings',
'send_texts',
'send_emails',
'send_letters',
'manage_api_keys',
'view_activity'
],
},
'state':'active',
'mobile_number':'07700900123',
'failed_login_count':0,
'name':'Example',
'services':[
'6078a8c0-52f5-4c4f-b724-d7d1ff2d3884',
'6afe3c1c-7fda-4d8d-aa8d-769c4bdf7803',
],
'current_session_id':'fea2ade1-db0a-4c90-93e7-c64a877ce83e',
'auth_type':'sms_auth'
}
}
```
Most of the time spent by the admin app to generate a page is spent
waiting for the API. This is slow for three reasons:
1. Talking to the API means going out to the internet, then through
nginx, the Flask app, SQLAlchemy, down to the database, and then
serialising the result to JSON and making it into a HTTP response
2. Each call to the API is synchronous, therefore if a page needs 3 API
calls to render then the second API call won’t be made until the
first has finished, and the third won’t start until the second has
finished
3. Every request for a service page in the admin app makes a minimum
of two requests to the API (`GET /service/…` and `GET /user/…`)
Hitting the database will always be the slowest part of an app like
Notify. But this slowness is exacerbated by 2. and 3. Conversely every
speedup made to 1. is multiplied by 2. and 3.
So this pull request aims to make 1. a _lot_ faster by taking nginx,
Flask, SQLAlchemy and the database out of the equation. It replaces them
with Redis, which as an in-memory key/value store is a lot faster than
Postgres. There is still the overhead of going across the network to
talk to Redis, but the net improvement is vast.
This commit only caches the `GET /service` response, but is written in
such a way that we can easily expand to caching other responses down the
line.
The tradeoff here is that our code is more complex, and we risk
introducing edge cases where a cache becomes stale. The mitigations
against this are:
- invalidating all caches after 24h so a stale cache doesn’t remain
around indefinitely
- being careful when we add new stuff to the service response
---
Some indicative numbers, based on:
- `GET http://localhost:6012/services/<service_id>/template/<template_id>`
- with the admin app running locally
- talking to Redis running locally
- also talking to the API running locally, itself talking to a local
Postgres instance
- times measured with Chrome web inspector, average of 10 requests
╲ | No cache | Cache service | Cache service and user | Cache service, user and template
-- | -- | -- | -- | --
**Request time** | 136ms | 97ms | 73ms | 37ms
**Improvement** | 0% | 41% | 88% | 265%
---
Estimates of how much storage this requires:
- Services: 1,942 on production × 2kb = 4Mb
- Users: 4,534 on production × 2kb = 9Mb
- Templates: 7,079 on production × 4kb = 28Mb
* Juror Central Summoning Bureau - Online – HM Courts & Tribunals Service
* Richmond and Wandsworth Council – Richmond and Wandsworth Council
* ESFA feedback – Department for Education
* PINS Digital Service – Planning Inspectorate
* HMPO Belfast – Home Office
* food.gov.uk – Food Standards Agency
* NCC Mobile Team – Newcastle City Council
* Cambs CC Emergency planning – Cambridgeshire County Council
* HMPO Durham – Home Office
* HMPO London – Home Office
* HMPO Glasgow – Home Office
* HMPO Peterborough – Home Office
* HMPO Southport – Home Office
* Student Loans Company. Password Reset. – Student Loans Company
* DigITS – Crown Commercial Service
* Ofsted Reports Beta – Ofsted
* Revenue Accounts – Ministry of Housing, Communities & Local Government
* Luton Council - Electoral Registration – Luton Council
* Cael eich Pensiwn y Wladwriaeth – Department for Work and Pensions
* Warwick Crown Court – Ministry of Justice
* ProLive - Housing Repairs – Pembrokeshire County Council
* Wirral Council – Wirral Council
* DfT Bus Open Data – Department for Transport
* Paris File Watcher – Pembrokeshire County Council
* WFDC Electoral Services – Wyre Forest District Council
Adding a ‘testing’ template it not enough. It needs to have some real
looking content, so that we can:
- work out what a service is doing
- assess whether that’s a reasonable (ie meeting the terms of use) thing
to be doing with Notify
At the moment we’re having to go back to services quite a lot when they
request to go live and ask them for this stuff.
The check page expects template ID to be passed through in the URL not
the session now. The send test letter page wasn’t changed.
This commit changes it, and adds a test to make sure this path is
covered.
The start job endpoint needs the template ID in order to make the API
call.
It doesn’t make sense to add it to the start job URL, because users
could potentially start a job with the wrong template by hacking the URL
(which would blow up at some point, if the template didn’t match the
columns in the file).
A of this commit’s parent we are storing `template_id` and
`original_file_name` in the URL. Getting them from the URL is better,
so the check page no longer needs to look for them in the session. This
commit removes the code that looks for these values in the session.
At the moment you can’t press refresh on the check page if there’s
errors. This is because the session gets cleared when there’s errors.
This is a bad user experience.
The data that this page is relying on (from the session) is:
- template ID
- original file name
Neither of these things need to be in the session because:
- they are not secret
- the user can modify them already (by choosing a different template or
renaming their file locally)
So this commit additionally stores them in the URL.
if a user signs in again, clear their file upload data from any
aborted journeys from before, so that their cookies don't fill up
also add some temporary logging when the session starts getting full.
Because we now[1] store info about each file upload separately in the
session the session isn’t overridden every time you upload a file. This
is good because you can do multiple file uploads idempotently.
Generally we are cleaning up after ourselves because we pop anything to
do with that upload from the session. However there is an edge case: if
you never send the file then the info about the file stays in the
session in perpetuity[2]. This is generally happening when people are
uploading files that are impossible to send, ie ones that have errors.
So this commit makes two changes:
1. remove info about a file upload from the session as soon as we know
that it contains errors
2. `POST` reuploads to the same endpoint as initial uploads because
otherwise we need to keep info about bad uploads in the session,
which would prevent us from doing 1.
1. https://github.com/alphagov/notifications-admin/pull/1968
2. or at least until the session is cleared by the user logging out