Add Redis cache between admin and API

Most of the time spent by the admin app to generate a page is spent
waiting for the API. This is slow for three reasons:

1. Talking to the API means going out to the internet, then through
   nginx, the Flask app, SQLAlchemy, down to the database, and then
   serialising the result to JSON and making it into a HTTP response
2. Each call to the API is synchronous, therefore if a page needs 3 API
   calls to render then the second API call won’t be made until the
   first has finished, and the third won’t start until the second has
   finished
3. Every request for a service page in the admin app makes a minimum
   of two requests to the API (`GET /service/…` and `GET /user/…`)

Hitting the database will always be the slowest part of an app like
Notify. But this slowness is exacerbated by 2. and 3. Conversely every
speedup made to 1. is multiplied by 2. and 3.

So this pull request aims to make 1. a _lot_ faster by taking nginx,
Flask, SQLAlchemy and the database out of the equation. It replaces them
with Redis, which as an in-memory key/value store is a lot faster than
Postgres. There is still the overhead of going across the network to
talk to Redis, but the net improvement is vast.

This commit only caches the `GET /service` response, but is written in
such a way that we can easily expand to caching other responses down the
line.

The tradeoff here is that our code is more complex, and we risk
introducing edge cases where a cache becomes stale. The mitigations
against this are:
- invalidating all caches after 24h so a stale cache doesn’t remain
  around indefinitely
- being careful when we add new stuff to the service response

---

Some indicative numbers, based on:
- `GET http://localhost:6012/services/<service_id>/template/<template_id>`
- with the admin app running locally
- talking to Redis running locally
- also talking to the API running locally, itself talking to a local
  Postgres instance
- times measured with Chrome web inspector, average of 10 requests

╲ | No cache | Cache service | Cache service and user | Cache service, user and template
-- | -- | -- | -- | --
**Request time** | 136ms | 97ms | 73ms | 37ms
**Improvement** | 0% | 41% | 88% | 265%

---

Estimates of how much storage this requires:

- Services: 1,942 on production × 2kb = 4Mb
- Users: 4,534 on production × 2kb = 9Mb
- Templates: 7,079 on production × 4kb = 28Mb
This commit is contained in:
Chris Hill-Scott
2018-04-06 13:37:49 +01:00
parent 7d6d15f07b
commit 24dbe7b7b1
11 changed files with 201 additions and 8 deletions

View File

@@ -1,6 +1,6 @@
from notifications_python_client.errors import HTTPError
from app.notify_client import NotifyAdminAPIClient
from app.notify_client import NotifyAdminAPIClient, cache
from app.notify_client.models import (
User,
roles,
@@ -147,6 +147,7 @@ class UserApiClient(NotifyAdminAPIClient):
resp = self.get(endpoint)
return [User(data) for data in resp['data']]
@cache.delete('service')
def add_user_to_service(self, service_id, user_id, permissions):
# permissions passed in are the combined admin roles, not db permissions
endpoint = '/service/{}/users/{}'.format(service_id, user_id)