We don't want pyup.io upgrading sub-dependencies listed in the
requirements.txt file, since it upgrades them whenever a new version is
available regardless of what our application dependencies require.
The list of top-level dependencies is moved to requirements-app.txt,
which is used by `make freeze-requirements` to generate the full
list of requirements in requirements.txt.
(See alphagov/notifications-api#1938 for details.)
Tests fail with `wtforms==2.2.1`. We're not sure why, but production
uses this version and local environments don't, because we only
require flask-wtforms, which doesn't pin its requirements at all. We
should probably pin all requirements from jenkins onwards to prevent
this kind of thing happening again.
If a template has a placeholder like `((email address))` then the sample
spreadsheet and CSV file have the email column twice.
Trying to upload this spreadsheet will result in a ‘duplicate column’
error.
This commit fixes it so that the column will only appear once.
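A minimal sketch of the idea, not the actual implementation: build the sample file's header row from the recipient columns plus the template placeholders, skipping any placeholder that matches a column already present. The function name and the normalisation rule (ignore case, spaces and punctuation) are assumptions made for illustration.

```python
def sample_column_headers(recipient_columns, placeholders):
    """Return the header row for a sample file, with each column once.

    Hypothetical helper: the names and the matching rule are assumptions.
    """
    def normalise(name):
        # Treat "Email Address", "email address" and "email_address" alike
        return "".join(ch for ch in name.lower() if ch.isalnum())

    seen = set()
    headers = []
    for name in list(recipient_columns) + list(placeholders):
        key = normalise(name)
        if key not in seen:
            seen.add(key)
            headers.append(name)
    return headers
```

For a template containing `((name))` and `((email address))`, calling `sample_column_headers(["email address"], ["name", "email address"])` gives a header row where the email column appears only once.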
Updated notifications-utils. This brings in
- the renamed character sanitization classes
- the change to allow unicode in letter addresses (this lets us delete
a test that is no longer relevant)
Also replaced non-ascii characters in headers. This fixes a bug where
non-ascii characters in a CSV filename were causing errors since the
filename is also used in the header.
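As an illustration of the header fix, here is one way a filename could be made ASCII-safe before it is put into a response header. The commit does not say how the replacement is done, so the function name, the accent-stripping step and the `_` replacement character are all assumptions.

```python
import unicodedata


def ascii_safe_filename(filename, replacement="_"):
    """Return a version of `filename` containing only ASCII characters.

    Hypothetical helper: accented letters lose their accents, and any
    other non-ASCII character is swapped for `replacement`.
    """
    # Decompose accented characters into base letter + combining mark
    decomposed = unicodedata.normalize("NFKD", filename)
    safe = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # drop the combining accent, keep the base letter
        safe.append(ch if ord(ch) < 128 else replacement)
    return "".join(safe)
```

The original filename can still be shown to the user in the page itself; only the value placed in the header needs to be restricted like this.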
Most of the time spent by the admin app to generate a page is spent
waiting for the API. This is slow for three reasons:
1. Talking to the API means going out to the internet, then through
nginx, the Flask app, SQLAlchemy, down to the database, and then
serialising the result to JSON and making it into an HTTP response
2. Each call to the API is synchronous, therefore if a page needs 3 API
calls to render then the second API call won’t be made until the
first has finished, and the third won’t start until the second has
finished
3. Every request for a service page in the admin app makes a minimum
of two requests to the API (`GET /service/…` and `GET /user/…`)
Hitting the database will always be the slowest part of an app like
Notify. But this slowness is exacerbated by 2. and 3. Conversely every
speedup made to 1. is multiplied by 2. and 3.
So this pull request aims to make 1. a _lot_ faster by taking nginx,
Flask, SQLAlchemy and the database out of the equation. It replaces them
with Redis, which as an in-memory key/value store is a lot faster than
Postgres. There is still the overhead of going across the network to
talk to Redis, but the net improvement is vast.
This commit only caches the `GET /service` response, but is written in
such a way that we can easily expand to caching other responses down the
line.
The tradeoff here is that our code is more complex, and we risk
introducing edge cases where a cache becomes stale. The mitigations
against this are:
- invalidating all caches after 24h so a stale cache doesn’t remain
around indefinitely
- being careful when we add new stuff to the service response
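The approach described above could be sketched roughly as follows. This is not the code in the pull request: the class names, the `service-<id>` key format and the `fetch_from_api` / `update_in_api` callables are assumptions, and a small in-memory stand-in replaces a real Redis client so the example runs on its own. The `get`, `set(..., ex=...)` and `delete` calls mirror the ones a Redis client provides.

```python
import json

CACHE_TTL_SECONDS = 24 * 60 * 60  # every cache entry expires after 24h


class InMemoryRedis:
    """Stand-in for a Redis client so the sketch runs without a server.

    It ignores expiry; a real Redis client would honour `ex`.
    """

    def __init__(self):
        self.store = {}

    def get(self, key):
        return self.store.get(key)

    def set(self, key, value, ex=None):
        self.store[key] = value

    def delete(self, key):
        self.store.pop(key, None)


class CachedServiceClient:
    """Hypothetical wrapper that caches the service response."""

    def __init__(self, redis_client, fetch_from_api):
        self.redis = redis_client
        self.fetch_from_api = fetch_from_api  # assumed: calls GET /service/...

    def _key(self, service_id):
        return "service-{}".format(service_id)

    def get_service(self, service_id):
        cached = self.redis.get(self._key(service_id))
        if cached is not None:
            return json.loads(cached)  # cache hit: no API call
        service = self.fetch_from_api(service_id)
        self.redis.set(
            self._key(service_id), json.dumps(service), ex=CACHE_TTL_SECONDS
        )
        return service

    def update_service(self, service_id, update_in_api, **changes):
        result = update_in_api(service_id, **changes)
        # Drop the cached copy so the next read fetches fresh data
        self.redis.delete(self._key(service_id))
        return result
```

Deleting the key on every write keeps the cache fresh in the common case, and the 24h expiry is the backstop for any write path that forgets to invalidate.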
---
Some indicative numbers, based on:
- `GET http://localhost:6012/services/<service_id>/template/<template_id>`
- with the admin app running locally
- talking to Redis running locally
- also talking to the API running locally, itself talking to a local
Postgres instance
- times measured with Chrome web inspector, average of 10 requests
╲ | No cache | Cache service | Cache service and user | Cache service, user and template
-- | -- | -- | -- | --
**Request time** | 136ms | 97ms | 73ms | 37ms
**Improvement** | 0% | 41% | 88% | 265%
---
Estimates of how much storage this requires:
- Services: 1,942 on production × 2 KB ≈ 4 MB
- Users: 4,534 on production × 2 KB ≈ 9 MB
- Templates: 7,079 on production × 4 KB ≈ 28 MB
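The arithmetic behind these estimates, as a quick sketch. The counts and per-item sizes are the ones listed above; treating a megabyte as 1,000 kilobytes is an assumption, and the variable names are illustrative only.

```python
# (number of items on production, approximate size of one item in kilobytes)
ESTIMATES = {
    "services": (1942, 2),
    "users": (4534, 2),
    "templates": (7079, 4),
}


def storage_in_mb(count, size_kb):
    # Assumes 1 megabyte = 1,000 kilobytes
    return count * size_kb / 1000


per_type = {
    name: storage_in_mb(count, size_kb)
    for name, (count, size_kb) in ESTIMATES.items()
}
total = sum(per_type.values())
```

Summed, the three caches come to roughly 41 megabytes.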
If someone has duplicate recipient columns in their file we don’t know
which one to use. This commit adds an error message which should help
them fix the duplication.
This commit doesn’t go to the extra effort to actually show the
correct values for duplication in the preview. Don’t think it’s worth
the effort/complexity for how infrequently we’ve seen this error.
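A rough sketch of the check, assuming column names are compared after lowercasing and stripping spaces and punctuation. The real check lives in notifications-utils; the function name and matching rule here are illustrative only.

```python
from collections import Counter


def duplicate_recipient_columns(column_headers, recipient_columns):
    """Return the normalised recipient column names that appear more
    than once in the uploaded file's header row (hypothetical helper)."""
    def normalise(name):
        return "".join(ch for ch in name.lower() if ch.isalnum())

    wanted = {normalise(name) for name in recipient_columns}
    counts = Counter(normalise(header) for header in column_headers)
    return sorted(
        key for key, count in counts.items() if count > 1 and key in wanted
    )
```

A non-empty result is the cue to show the error message. Duplicates of non-recipient columns are deliberately ignored in this sketch.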
Depends on:
- [ ] https://github.com/alphagov/notifications-utils/pull/376
When downloading a report of which messages from a job have been
delivered and which have failed we currently only include the Notify
data. This makes it hard to reconcile or do analysis on these reports,
because often the thing that people want to reconcile on is in the data
they’ve uploaded (eg a reference number).
Here’s an example of a user talking about this problem:
> It would also be helpful if the format of the delivery and failure
> reports could include the fields from the recipient's file. While I
> can, of course, cross-reference one report with the other it would be
> easier if I did not have to. We send emails to individuals within
> organisations and it is not always easy to establish the organisation
> from a recipient's email address. This is particularly important when
> emails fail to be delivered as we need to contact the organisation to
> establish a new contact.
– ticket 677
We’ve also seen it when doing research with a local council.
This commit takes the original file, the data from the API, and munges
them together.
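One way the munging could look, as a sketch under stated assumptions: each notification from the API is taken to carry a zero-based `row_number` pointing at its row in the uploaded file, plus a `status`. Those field names, the `Status` column heading and the function name are hypothetical.

```python
import csv
import io


def merge_report(original_csv, notifications):
    """Append each row's delivery status to the originally uploaded data.

    `notifications` is assumed to be a list of dicts with a zero-based
    `row_number` and a `status` key.
    """
    rows = list(csv.reader(io.StringIO(original_csv)))
    header, data = rows[0], rows[1:]
    status_by_row = {n["row_number"]: n["status"] for n in notifications}

    output = io.StringIO()
    writer = csv.writer(output, lineterminator="\n")
    writer.writerow(header + ["Status"])
    for index, row in enumerate(data):
        # Rows with no matching notification get an empty status
        writer.writerow(row + [status_by_row.get(index, "")])
    return output.getvalue()
```

Because the uploaded columns are kept as they were, a reference number in the original file ends up on the same line as the delivery status.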