If the organisation table contains an entry for `agreement_signed_by_id`
and for `agreement_signed_on_behalf_of_name` then we should show the person
who signed the MOU as the `agreement_signed_on_behalf_of_name`.
Previously we were wrongly showing the `agreement_signed_by_id` as the
person who signed the agreement.
We use the `ChangeEmailForm` when a user changes their own email
address or someone else's. The form runs several validators: it checks
that the email address is valid (using a function from utils) and that
it is not already in use (by calling the API).
If the email address is not valid, we should not call the API to see if
it's already in use, because doing so raises an exception in the API and
results in a `500` in admin. We now only call the API if there were no
other errors with the email address.
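As a rough sketch of that ordering (the field, validator, and helper names here are assumptions rather than the real admin code), the in-use check can bail out early if the field already has errors:

```python
from wtforms import Form, StringField, ValidationError
from wtforms.validators import Email


def email_already_in_use(email_address):
    """Stand-in for the API call that checks whether an address is taken."""
    raise NotImplementedError


class ChangeEmailForm(Form):
    email_address = StringField("Email address", validators=[Email()])

    def validate_email_address(self, field):
        # Inline validators run after the field's other validators, so if the
        # format check has already failed we skip the API call entirely,
        # since calling the API with an invalid address is what caused the 500
        if field.errors:
            return
        if email_already_in_use(field.data):
            raise ValidationError("This email address is already in use")
```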
(The `test_should_redirect_after_name_change` test didn't need the
`mock_email_is_not_already_in_use` fixture, so this has been removed.)
This brings a few performance improvements for RecipientCSV, which
we use to preview and process CSVs. One change also renames one of
the attributes for the class to "guestlist".
We don't currently have any radio fields or checkboxes where it's
possible to get more than one validation error. However, since we
never want to show more than one error at a time for a field, this
changes the error messages for the relevant widgets to only show the
first error if there are ever multiple.
Previously this widget joined multiple error messages together and
displayed them all. If a text input field does have more than one
validation error, we only want to show one.
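A minimal sketch of the idea (the function name and markup are illustrative, not the real widget code):

```python
from markupsafe import Markup


def error_message_html(field):
    """Render at most one error message for a field.

    Previously every message in field.errors was joined together and shown.
    """
    if not field.errors:
        return Markup("")
    return Markup('<span class="govuk-error-message">{}</span>').format(field.errors[0])
```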
This adds a link to the user profile of the person who requested to go
live for "Request to go live" Zendesk tickets. Viewing a user's profile
page helps us to check for duplicate organisations and services from
that user.
We want organisation team members to be able to see the MOU details for
their organisation. This change creates a new page, called billing, which
contains these details. For now it's only visible to platform admin users;
the plan is to add more information to this page, then make it visible
to all organisation users.
The page showing the MOU covers the cases where `agreement_signed` is
True, False, or None. The None case is very rare: it signifies that
the agreement is not signed but that we have some service-specific
agreements in place. We only have a few organisations in this state, so
it's unlikely that the content for this scenario will be seen.
When an organisation has signed the agreement we may know the full
details (signing date, version signed, the person who signed it or who it
was signed on behalf of), or we may only have the name of the person who
signed the agreement. We show the more detailed content if possible, and
a less detailed version of the content if not.
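Roughly, the page picks between the variants like this (a sketch only; `agreement_signed` and the on-behalf-of name come from the descriptions above, the other attribute names are assumptions):

```python
def agreement_content_variant(organisation):
    """Pick which version of the MOU content to show (illustrative only)."""
    if organisation.agreement_signed is None:
        # Rare: no MOU signed, but service-specific agreements are in place
        return "service-specific-agreements"
    if organisation.agreement_signed is False:
        return "not-signed"
    if organisation.agreement_signed_version and organisation.agreement_signed_on_behalf_of_name:
        # We know the full details of the signing
        return "signed-with-details"
    # We only know the name of the person who signed
    return "signed"
```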
There's a new route for downloading the agreement which is almost
identical to the existing `.service_download_agreement` route (plus the
test is almost the same), except that it takes an organisation ID
instead of a service ID.
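For illustration, the new route might look something like this (the URL pattern, blueprint import and helper are assumptions; only `.service_download_agreement` is from the existing code):

```python
from flask import send_file

from app.main import main  # assumed blueprint import


@main.route("/organisations/<uuid:org_id>/agreement.pdf")  # assumed URL pattern
def organisation_download_agreement(org_id):
    # Mirrors .service_download_agreement, but resolves the agreement from an
    # organisation ID rather than going via a service ID
    agreement_file = get_agreement_for_organisation(org_id)  # hypothetical helper
    return send_file(agreement_file, as_attachment=True)
```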
In a9617d4df6 we upgraded utils to version 49.1, which renamed `TTL`
to `DEFAULT_TTL`.
However, not only did it change the name, it also changed the type
from an `int` to a `float`:
https://github.com/alphagov/notifications-utils/pull/923/files
We thought that would be OK because, in utils, we moved the conversion
to an integer into the `set` method. But it turns out this caused an
issue in the admin app, where setting the `has_jobs...` redis keys
errors:
```
Redis error performing set on has_jobs-4bd11cb2-cc17-44e1-b241-8547990db245
...
...
redis.exceptions.ResponseError: value is not an integer or out of range
```
It looks like this is because we are passing a float instead of an
int to `ex`, the expiry argument of the redis `set` call.
See a similar post describing the importance of ints rather than
floats for other parameters:
https://developpaper.com/question/redis-err-value-is-not-an-integer-or-out-of-range/
An interesting note: our test
`test_client_creates_job_data_correctly` didn't catch this because
`float(604800) == int(604800)`.
I've gone for the quickest solution, which is to wrap `DEFAULT_TTL`
in an `int`. The longer-term, more durable fix would be to make this
change in utils, but there are several breaking changes in front of it
that would take a while to bring into the admin app first. I've checked
the admin and API apps and this is the only place we use `DEFAULT_TTL`
directly.
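A minimal sketch of the shape of the fix (the import path and surrounding call are assumptions; only `DEFAULT_TTL` and the `ex` argument come from the description above):

```python
import redis

# the import path below is an assumption about where utils exposes DEFAULT_TTL
from notifications_utils.clients.redis import DEFAULT_TTL

redis_client = redis.Redis()
service_id = "4bd11cb2-cc17-44e1-b241-8547990db245"

# DEFAULT_TTL is now a float; redis rejects a float for `ex`
# ("value is not an integer or out of range"), so coerce it to an int
redis_client.set(f"has_jobs-{service_id}", b"true", ex=int(DEFAULT_TTL))
```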
Makes the mock-up of an alert we show use an
inline SVG instead of a background image.
This means it can use the colour of the heading
text next to it, in a way that adapts when high
contrast mode is on.
https://github.com/alphagov/notifications-utils/pull/922
Making the icon an inline SVG lets it inherit
colours from the page styles. This helps in forced
colour modes, like Windows high contrast mode,
where it will match the colour of the text next to
it, whatever it is set to.
Making it inline requires some changes to the CSS
to allow its position to match that of the current
background image.
This also sets `forced-color-adjust` to `auto` on
the `<svg>` element, which tells the browser it
can control its colours in forced colour modes.
This is required because the browsers that support
forced colour mode set it to `none` for the
`<svg>` element by default.
This stat is shown on a few of our pages, such as the homepage,
the performance page and a platform admin page,
and is currently cached for the
default TTL of 1 week. I think there is no reason we can't cache this
for only an hour, giving slightly more up-to-date stats.
This mimics the approach, and the TTL choice of 1 hour, that has
been added for the performance page (although there is no
particular bug here to fix, it is just nice to have slightly more
up-to-date data).
Note, the API call only takes about 0.3 seconds at the moment
so it is not particularly intensive on the DB to run this more
regularly.
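A rough cache-aside sketch with a 1 hour TTL (the key name and wiring are illustrative, not the real caching helpers):

```python
import json

import redis

ONE_HOUR_IN_SECONDS = 60 * 60

redis_client = redis.Redis()


def get_count_of_notifications_sent(fetch_from_api):
    """Return the cached stat, refreshing it from the API at most once an hour.

    fetch_from_api stands in for the real API client call; the key name is
    illustrative.
    """
    cached = redis_client.get("total-notifications-sent")
    if cached is not None:
        return json.loads(cached)

    value = fetch_from_api()
    # Previously cached with the default TTL of 1 week; 1 hour keeps it fresher
    redis_client.set("total-notifications-sent", json.dumps(value), ex=ONE_HOUR_IN_SECONDS)
    return value
```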
This is a quick additional check to protect the user:
- From getting a CloudFront 502 error if the file takes too
long to upload. I was surprised to find it takes about 1 minute
to upload a 70Mb file to S3.*
- From getting a CloudFront 502 error when we follow the redirect
and run through the slow processing code in utils that builds a
RecipientCSV [1].
For context, a CSV with 100K rows and a few columns is around 5Mb,
so a 10Mb limit should be enough. Analysis over the past week shows
that the vast majority of CSV uploads are actually < 2.5Mb.
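A minimal sketch of such a check (the constant name and error copy are illustrative; the real form and view wiring will differ):

```python
import io

MAX_CSV_UPLOAD_SIZE = 10 * 1024 * 1024  # 10Mb


def check_csv_file_size(file_obj):
    """Reject a file over the limit before trying to upload it to S3.

    file_obj is any seekable file-like object, e.g. the uploaded file stream.
    """
    file_obj.seek(0, io.SEEK_END)  # jump to the end to measure the size
    size_in_bytes = file_obj.tell()
    file_obj.seek(0)  # rewind so the file can still be read

    if size_in_bytes > MAX_CSV_UPLOAD_SIZE:
        raise ValueError("The file must be smaller than 10Mb")
```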
I haven't added any tests for this because:
- The check isn't critical, as the worst case scenario is the user
gets a worse error than this in-app one.
- There's no easy way to mock the validation, and I didn't want to
have a test that depends on a 10Mb+ file.
*We're using "key.put" to upload the file, when we could be doing
a multipart upload [2]. However, I tried this myself with a chunk
size of 1000 bytes and found it only led to a marginal improvement.
[1]: https://github.com/alphagov/notifications-utils/pull/930
[2]: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-uploading-files.html
When we combine simplified polygons from different areas we try to use
the polygons that are defined in metres as a speed optimisation.
This doesn’t work when those polygons are in several different
projections because the distance in metres is relative to the edge of
the projection. Just naively choosing one projection means that some
of the polygons will shift position on the earth by hundreds of kilometres
once they are converted back to degrees.
To avoid this we need to:
- check for situations where we have polygons in multiple different
projections
- transform these polygons back into degrees and let `Polygons` pick a
common projection for all of them before doing any
merging/smoothing/etc. (see the sketch below)
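As a rough illustration of that second step, using pyproj and shapely directly (the real code goes through the `Polygons` class in utils):

```python
from pyproj import Transformer
from shapely.geometry import Polygon
from shapely.ops import transform


def back_to_degrees(polygon, crs):
    """Transform a polygon from a metre-based CRS back to WGS84 degrees."""
    to_wgs84 = Transformer.from_crs(crs, "EPSG:4326", always_xy=True)
    return transform(to_wgs84.transform, polygon)


# Two squares defined in metres, but in different UTM zones
polygons_in_metres = [
    (Polygon([(0, 0), (0, 1000), (1000, 1000), (1000, 0)]), "EPSG:32630"),
    (Polygon([(5000, 0), (5000, 1000), (6000, 1000), (6000, 0)]), "EPSG:32631"),
]

# If the polygons are not all in the same projection, convert them back to
# degrees and let `Polygons` pick one common projection before merging
if len({crs for _, crs in polygons_in_metres}) > 1:
    polygons_in_degrees = [back_to_degrees(p, crs) for p, crs in polygons_in_metres]
```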
I came across a bug where the performance page might be visited
on say the 22nd and we expect it to show data for yesterday and
the past 7 days (so the 21st and back 7 days) but instead it
only shows it for the 20th and then back 6 days.
The cause is that we have nightly tasks that run to:
- calculate the number of notifications sent per service (the
ft_notification_status table)
- calculate the number of notifications sent within 10 seconds
(the ft_processing_time table)
Both of these tables get filled between the hours of midnight and
2:30am. The first seems to be filled by about 12:30 every night
whereas the processing stats seems to be filled about 2am every
night.
The admin app currently caches the results of the call to the API
(https://github.com/alphagov/notifications-api/blob/master/app/performance_dashboard/rest.py#L24)
to get this data with the key
`performance-stats-{start_date}-to-{end_date}`.
Now the issue here is that if the performance page is visited at
00:01 on the 23rd, it will call the API for data on the 22nd and
backwards 7 days. The data it gets at 00:01 will not yet include the
22nd; it will only go as far as the 21st, because
the overnight jobs to put data in the tables have not run yet. The
admin app then caches this data, meaning that for the rest of the
day the page will be missing data for the 22nd, even though it
becomes available after approx 2am. We have seen it a few times
in production where the performance page has indeed been visited
before say 2 am, leaving the page with missing data.
In terms of solutions, there are several potential options.
1. Remove the caching that we do in redis. The downside of this is
that we will hit the API more (currently once a day for the first
visitor to the performance page). We have had 255 requests
to the performance page in the last week and when there is no value
in redis, the call to the API takes approximately 1-2 seconds.
Removing the caching would go against the reason it was originally
added, which was to act as basic protection against malicious
DoS attempts.
2. Make the admin app only set the cache value if it is after
2:30am. This one feels a bit gross and snowflakey.
3. The API flushes the redis cache after it has finished its nightly
tasks. We don't have that much precedent of the API interacting
with redis, it is mostly the admin app that does but this is still
a valid option.
4. Cache in redis the data for 2 days ago to 7 days ago and then
get the data for yesterday freshly every time. This would mean
some things are still cached but the smallest bit that we can't rely
on can be gathered fresh.
5. Remove the caching we do in redis but then try and optimise the
API endpoint to return results even faster and with less
interaction with the DB. My hypothesis is that the slowest part
is where the API has to calculate how many notifications Notify
has ever sent
(https://github.com/alphagov/notifications-api/blob/master/app/performance_dashboard/rest.py#L36)
and so we could potentially store these totals in the DB, updating
them nightly, rather than having to query all the data in the table
to figure them out every time.
6. Reduce the expiry time of the cache key so although the bug will
still exist, it will only exist for a much shorter time.
In the end, we've gone for option 6. The user impact of
this bug sticking around for an hour is minimal, particularly
because that would be in the early hours of the morning. The solution
is also quick to implement and keeps the protection against a DoS attack.
The value of 1 hour for expiry was a fairly arbitrary choice; it could
have been as little as 1 minute or as much as 6 hours.
This has several advantages:
- It gives us more room to explain the error and actions. This will
be useful for upcoming work we want to do, which will add yet more
validations for CSV uploads.
- We already use a flash to show certain kinds of errors on these
pages (just above). This is more consistent.
- It's potentially more accessible. Previously the error and the
button text were read out as a single sentence. Now the page
reloads and the flash error is read out on its own.
In theory we should show an error in both places, but this can be
confusing on pages where there's only a single form control, and
especially if the error is long.
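For illustration, showing such an error as a flash message might look something like this (the endpoint name, flash category and message copy are assumptions):

```python
from flask import flash, redirect, url_for


def reject_csv_upload(service_id, message):
    # Use the same flash banner these pages already show for other errors,
    # then reload the page so the message is read out on its own
    flash(message, "error")
    return redirect(url_for(".upload_csv", service_id=service_id))  # hypothetical endpoint
```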
These are no longer used anywhere. Note that pyexcel-xlsx is still
a necessary dependency: it acts as an implicit plugin to pyexcel
that, surprisingly, allows pyexcel to...read excel files [1].
[1]: https://github.com/pyexcel/pyexcel-xlsx#as-a-pyexcel-plugin
The original idea behind this was to always ask users to re-enter their
password any time:
- we want them to be sure that they want to do what they’re about to do
- we want to be sure it’s really the user trying to do the thing (and
not someone malicious)
In reality we:
- removed this from the initial place it was added (a descendant of the
‘suspend service’ feature)
- only ever added it to the ‘rename service’ feature
So in reality it’s not a pattern we have persisted with. Arguably there
are several things you can now do in the admin app without re-entering
your password which are much more high consequence than changing the
service name.
Also, with browser autofill there’s a lot less chance that forcing
someone to re-enter a password really gives much defence against an
unattended laptop, for example.
I also wonder whether we might get people to give better service names
if we make the process of renaming the service less intimidating.
So this commit removes the need to re-enter your password when renaming
a service.
Note that renaming an organisation still has the same check, but I
haven’t removed that too, for the sake of keeping the scope of this PR small.
Looking up the coordinate reference system for a given polygon’s bounds
is surprisingly expensive.
We can make things run faster by precomputing it at the time of
generating the simplified polygons, then storing it in the SQLite
database alongside the polygon data.
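A rough sketch of the idea (the database filename, table and column names are hypothetical; the CRS lookup itself uses pyproj's standard UTM query, which is the expensive part we want to precompute):

```python
import sqlite3

from pyproj.aoi import AreaOfInterest
from pyproj.database import query_utm_crs_info


def utm_crs_for(bounds):
    """Find a suitable UTM CRS for a bounding box in degrees (the expensive lookup)."""
    min_x, min_y, max_x, max_y = bounds
    crs_info = query_utm_crs_info(
        datum_name="WGS 84",
        area_of_interest=AreaOfInterest(
            west_lon_degree=min_x,
            south_lat_degree=min_y,
            east_lon_degree=max_x,
            north_lat_degree=max_y,
        ),
    )[0]
    return f"EPSG:{crs_info.code}"


# Hypothetical schema: store the precomputed CRS next to each area's polygons
# so it never has to be looked up from the bounds at runtime
connection = sqlite3.connect("broadcast_areas.sqlite3")
connection.execute(
    "UPDATE broadcast_areas SET utm_crs = ? WHERE id = ?",
    (utm_crs_for((-3.3, 55.8, -3.0, 56.0)), "some-area-id"),
)
connection.commit()
```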
Transforming full resolution polygons to linear coordinates is somewhat
expensive. We can make things run faster by:
- looking at `simple_polygons` which have fewer points and are therefore
faster to transform.
- reusing polygons that have already been transformed where possible
Brings in https://github.com/alphagov/notifications-utils/pull/889/files
At the moment, we are not doing any transformation of features before
applying geometric algorithms to them. This is, in effect, assuming that
the earth is flat.
This new version of utils implements the transformation of our polygons
to a Cartesian plane. In other words, it converts them from being
defined in spherical degrees to metres.
For the admin app this means we need to convert places where the code
expects things to be measured in degrees to work in metres instead.