If an organisation doesn’t have any live services then there’s no data
to download. To make things less confusing we should hide the link in
this case.
This commit also modifies the existing test so that the assertions are
consistent.
If the organisation table contains an entry for `agreement_signed_by_id`
and for `agreement_signed_on_behalf_of_name` then we should show the
person who signed the MOU as the `agreement_signed_on_behalf_of_name`.
This was wrongly showing the `agreement_signed_by_id` as the person who
signed the agreement.
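A minimal sketch of the intended display logic, assuming illustrative
attribute names on the organisation model (only the two columns named
above are real; the rest is hypothetical):

```
def name_of_person_who_signed_the_agreement(organisation):
    # prefer the "on behalf of" name when we have it, rather than the
    # user record pointed at by agreement_signed_by_id
    return (
        organisation.agreement_signed_on_behalf_of_name
        or organisation.agreement_signed_by_name  # hypothetical fallback
    )
```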
We use the `ChangeEmailForm` when someone wants to change their own
email address or someone else's. This has various validators which get
run: we check if the email address is valid (using a function from
utils) and if the email address is already in use (by calling the API).
If the email address is not valid, we should not call the API to see if
it's already in use, because this causes an exception in the API,
leading to a `500` in admin. We now only call the API if there were no
other errors with the email address.
(The `test_should_redirect_after_name_change` test didn't need the
`mock_email_is_not_already_in_use` fixture, so this has been removed.)
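A rough sketch of the validator ordering, assuming a WTForms-style form
(the format validator and the "already in use" helper here are
stand-ins, not the real admin code):

```
from wtforms import Form, StringField, ValidationError
from wtforms.validators import Email


def email_address_is_already_in_use(email_address):
    return False  # stand-in for the real API call


class ChangeEmailForm(Form):
    # the real form uses a format check from utils; Email() is a stand-in
    email_address = StringField("Email address", validators=[Email()])

    def validate_email_address(self, field):
        # field.errors already holds any failures from the format check,
        # so only call the API when the address is otherwise valid
        if field.errors:
            return
        if email_address_is_already_in_use(field.data):
            raise ValidationError("This email address is already in use")
```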
If there were multiple errors, this widget was joining the messages
together and displaying all of them. If a text input field has more
than one validation error, we only want to show one.
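As a sketch (illustrative only, not the real widget code), the widget
now surfaces at most one message instead of joining them:

```
def error_message_for(field):
    # a text input only ever shows a single validation error, rather
    # than ", ".join(field.errors)
    return field.errors[0] if field.errors else None
```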
This adds a link to the user profile of the person who requested to go
live for "Request to go live" Zendesk tickets. Viewing a user's profile
page helps us to check for duplicate organisations and services from
that user.
We want organisation team members to be able to see the MOU details for
their organisation. This change creates a new page called billing, which
contains these details. It's only visible to platform admin users for
now - the plan is to add more information to this page, then to make it
visible to all organisation users.
The page showing the MOU covers the case when agreement_signed is
True, when agreement_signed is False, and when agreement_signed is
None. The case when agreement_signed is None is very rare - it
signifies that the agreement is not signed but that we have some
service-specific agreements in place. We only have a few organisations
in this state, so it's unlikely that the content for this scenario will
be seen.
When an organisation has signed the agreement we may know the full
details (signing date, version signed, the person who signed it or who it
was signed on behalf of), or we may only have the name of the person who
signed the agreement. We show the more detailed content if possible, and
a less detailed version of the content if not.
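Roughly, the branching looks like this (apart from `agreement_signed`
and `agreement_signed_on_behalf_of_name`, the attribute names and
content strings are illustrative placeholders, not the real templates):

```
def agreement_details(organisation):
    if organisation.agreement_signed is None:
        # rare: not signed, but some service-specific agreements exist
        return "Your organisation has agreements in place for some services"
    if organisation.agreement_signed is False:
        return "Your organisation has not accepted the agreement yet"
    if organisation.agreement_signed_version and organisation.agreement_signed_at:
        # we know the full details of the signing
        return (
            f"Version {organisation.agreement_signed_version} accepted on "
            f"{organisation.agreement_signed_at} by "
            f"{organisation.agreement_signed_on_behalf_of_name or organisation.agreement_signed_by_name}"
        )
    # we only know who signed, not when or which version
    return f"Accepted by {organisation.agreement_signed_by_name}"
```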
There's a new route for downloading the agreement which is almost
identical to the existing `.service_download_agreement` route (plus the
test is almost the same), except that it takes an organisation ID
instead of a service ID.
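A minimal sketch of the shape of the new route (the blueprint and the
agreement-lookup helper are assumptions, mirroring the description of
`.service_download_agreement` rather than quoting the real code):

```
from flask import Blueprint, send_file

main = Blueprint("main", __name__)


def agreement_path_for_organisation(org_id):
    return "/path/to/agreement.pdf"  # hypothetical lookup


@main.route("/organisations/<uuid:org_id>/agreement.pdf")
def organisation_download_agreement(org_id):
    # same as the service-level route, but keyed on an organisation ID
    return send_file(agreement_path_for_organisation(org_id))
```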
This stat is shown on a few of our pages, such as our homepage, the
performance page and a platform admin page, and is currently cached for
the default TTL of 1 week. There is no reason we can't cache it for
only an hour instead, giving slightly more up to date stats which will
update more regularly.
This mimics the approach, and the TTL choice of 1 hour, that has been
added for the performance page (although there is no particular bug
here to fix, it is just nice to have slightly more up to date data).
Note, the API call only takes about 0.3 seconds at the moment
so it is not particularly intensive on the DB to run this more
regularly.
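A minimal sketch of what the shorter TTL amounts to, assuming a plain
redis client and an illustrative cache key and API call (not the real
helper names in the admin app):

```
import redis

redis_client = redis.Redis()
ONE_HOUR_IN_SECONDS = 60 * 60


def notifications_sent_count(api_client):
    cached = redis_client.get("total-notifications-sent")
    if cached is not None:
        return int(cached)
    count = api_client.get_notifications_sent_count()  # the ~0.3s call
    # previously cached with the default 1 week TTL; now only for an hour
    redis_client.set("total-notifications-sent", count, ex=ONE_HOUR_IN_SECONDS)
    return count
```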
This is a quick additional check to protect the user:
- From getting a CloudFront 502 error if the file takes too
long to upload. I was surprised to find it takes about 1 minute
to upload a 70Mb file to S3.*
- From getting a CloudFront 502 error when we follow the redirect
and run through the slow processing code in utils that builds a
RecipientCSV [1].
For context, a CSV with 100K rows and a few columns is around 5Mb,
so a 10Mb limit should be enough. Analysis over the past week shows
that the vast majority of CSV uploads are actually < 2.5Mb.
I haven't added any tests for this because:
- The check isn't critical, as the worst case scenario is the user
gets a worse error than this in-app one.
- There's no easy way to mock the validation, and I didn't want to
have a test that depends on a 10Mb+ file.
*We're using "key.put" to upload the file, when we could be doing
a multipart upload [2]. However, I tried this myself with a chunk
size of 1000 bytes and found it only led to a marginal improvement.
[1]: https://github.com/alphagov/notifications-utils/pull/930
[2]: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-uploading-files.html
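A hedged sketch of the check itself (the limit comes from the reasoning
above; the helper name is illustrative):

```
MAX_CSV_UPLOAD_SIZE = 10 * 1024 * 1024  # 10Mb


def csv_file_is_too_big(file_data):
    # find the size without reading the whole stream into memory
    file_data.seek(0, 2)  # seek to the end
    size = file_data.tell()
    file_data.seek(0)  # rewind so later code can still read the file
    return size > MAX_CSV_UPLOAD_SIZE
```

The view can then show an error and return early, so we never start the
slow S3 upload or the RecipientCSV processing that leads to the 502s.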
When we combine simplified polygons from different areas we try to use
the polygons that are defined in metres as a speed optimisation.
This doesn’t work when those polygons are in several different
projections because the distance in metres is relative to the edge of
the projection. Just naively choosing one projection means that some
of the polygons will shift position on the earth by hundreds of kilometres
once they are converted back to degrees.
To avoid this we need to:
- check for situations where we have polygons in multiple different
projections
- transform these polygons back into degrees and let `Polygons` pick a
common projection for all of them before doing any
merging/smoothing/etc.
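Sketched out, the approach looks something like this (the `crs` and
`as_degrees` attributes are made up to show the shape of the check, and
`Polygons` here stands for the class in utils, not its real API):

```
def combine_simplified_polygons(list_of_polygons):
    projections = {polygons.crs for polygons in list_of_polygons}

    if len(projections) > 1:
        # coordinates in metres from different projections aren't
        # comparable, so transform back to degrees and let `Polygons`
        # pick one common projection before merging/smoothing
        list_of_polygons = [polygons.as_degrees for polygons in list_of_polygons]

    return Polygons([
        polygon for polygons in list_of_polygons for polygon in polygons
    ])
```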
I came across a bug where the performance page might be visited on,
say, the 22nd. We expect it to show data for yesterday and the past 7
days (so the 21st and back 7 days) but instead it only shows data for
the 20th and back 6 days.
The cause is that we have nightly tasks that run to:
- calculate the number of notifications sent per service (the
  ft_notification_status table)
- calculate the number of notifications sent within 10 seconds (the
  ft_processing_time table)
Both of these tables get filled between the hours of midnight and
2:30am. The first seems to be filled by about 12:30 every night whereas
the processing stats seem to be filled by about 2am every night.
The admin app currently caches the results of the call to the API
(https://github.com/alphagov/notifications-api/blob/master/app/performance_dashboard/rest.py#L24)
to get this data with the key
performance-stats-{start_date}-to-{end_date}.
Now the issue here is that if the performance page is visited at
00:01 on the 23rd, it will call the API for data on the 22nd and
backwards 7 days. The data it gets at 00:01 will not yet have the
data for the 22nd, it will only go as far as the 21st, because
the overnight jobs to put data in the tables have not run yet. The
admin app then caches this data, meaning that for the rest of the
day the page will be missing data for the 22nd, even though it
becomes available after approx 2am. We have seen this a few times in
production, where the performance page has indeed been visited before,
say, 2am, leaving the page with missing data.
In terms of solutions, there are several potential options.
1. Remove the caching that we do in redis. The downside of this is
that we will hit the API more (currently once a day for the first
visitor to the performance page). We have had 255 requests
to the performance page in the last week and when there is no value
in redis, the call to the API takes approximately 1-2 seconds.
Removing the caching would go against the reason it was originally
added, which was to act as a basic protection against malicious
DOS attempts.
2. Make the admin app only set the cache value if it is after
2:30am. This one feels a bit gross and snowflakey.
3. The API flushes the redis cache after it has finished its nightly
tasks. We don't have much precedent of the API interacting with redis
(it is mostly the admin app that does), but this is still a valid
option.
4. Cache in redis the data for 2 days ago to 7 days ago and then
get the data for yesterday freshly every time. This would mean
some things are still cached but the smallest bit that we can't rely
on can be gathered fresh.
5. Remove the caching we do in redis but then try and optimise the
API endpoint to return results even faster and with less
interaction with the DB. My hypothesis is that the slowest part
is where the API has to calculate how many notifications Notify
has ever sent
(https://github.com/alphagov/notifications-api/blob/master/app/performance_dashboard/rest.py#L36)
and so we could potentially store these totals in the DB and update
them nightly, rather than having to query all the data in the table to
figure them out every time.
6. Reduce the expiry time of the cache key so although the bug will
still exist, it will only exist for a much shorter time.
In the end, we've gone for option 6. The user impact of
this bug sticking around for an hour is minimal, in particular
because that would be in the early hours of the morning. The solution
is also quick to implement and keeps the protection against a DOS attack.
The value of 1 hour for expiry was a fairly arbitrary choice, it could
have been as little as 1 minute or as much as 6 hours.
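In code terms, option 6 is roughly this (the cache key format is from
the description above; the client names and call are illustrative):

```
import json

ONE_HOUR_IN_SECONDS = 60 * 60


def get_performance_stats(redis_client, api_client, start_date, end_date):
    cache_key = f"performance-stats-{start_date}-to-{end_date}"
    cached = redis_client.get(cache_key)
    if cached is not None:
        return json.loads(cached)
    stats = api_client.get_performance_dashboard_stats(start_date, end_date)
    # previously cached with the default 1 week TTL, so a visit before the
    # overnight tasks finished left stale data on the page all day; now
    # the stale window is at most an hour
    redis_client.set(cache_key, json.dumps(stats), ex=ONE_HOUR_IN_SECONDS)
    return stats
```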
This has several advantages:
- It gives us more room to explain the error and actions. This will
be useful for upcoming work we want to do, which will add yet more
validations for CSV uploads.
- We already use a flash to show certain kinds of errors on these
pages (just above). This is more consistent.
- It's potentially more accessible. Previously the error and the
button text used to be read out as a single sentence. Now the page
reloads and reads the flash error alone.
In theory we should show an error in both places, but this can be
confusing on pages where there's only a single form control, and
especially if the error is long.
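As a sketch of the pattern (the view and helper names are illustrative,
not the real upload views):

```
from flask import flash, redirect, request, url_for


def upload_csv(service_id):
    error = validate_csv_upload(request.files.get("file"))  # hypothetical helper
    if error:
        # show the problem as a flash message at the top of the reloaded
        # page, instead of attaching it to the upload button's form field
        flash(error)
        return redirect(url_for(".upload_csv", service_id=service_id))
    return continue_with_upload(service_id)  # hypothetical helper
```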
The original idea behind this was to always ask users to re-enter their
password any time:
- we want them to be sure that they want to do what they’re about to do
- we want to be sure it’s really the user trying to do the thing (and
not someone malicious)
In reality we:
- removed this from the initial place it was added (a descendant of the
‘suspend service’ feature)
- only ever added it to the ‘rename service’ feature
So in reality it’s not a pattern we have persisted with. Arguably there
are several things you can now do in the admin app without re-entering
your password which are much more high consequence than changing the
service name.
Also, with browser autofill there’s a lot less chance that forcing
someone to re-enter a password really gives much defence against an
unattended laptop, for example.
I also wonder whether we might get people to give better service names
if we make the process of renaming the service less intimidating.
So this commit removes the need to re-enter your password when renaming
a service.
Note that re-naming an organisation still has the same check, but I
haven’t removed that too, for the sake of keeping the scope of this PR
small.
Transforming full resolution polygons to linear coordinates is somewhat
expensive. We can make things run faster by:
- looking at `simple_polygons` which have fewer points and are therefore
faster to transform.
- reusing polygons that have already been transformed where possible
Brings in https://github.com/alphagov/notifications-utils/pull/889/files
At the moment, we are not doing any transformation of features before
applying geometric algorithms to them. This is, in effect, assuming that
the earth is flat.
This new version of utils implements the transformation of our polygons
to a Cartesian plane. In other words, it converts them from being
defined in spherical degrees to metres.
For the admin app this means we need to convert places where the code
expects things to be measured in degrees to work in metres instead.
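As an illustrative example of the kind of change needed (the constant
names are made up): distances that used to be expressed in degrees now
need to be expressed in metres.

```
# before: buffer an area by 0.045 degrees (~5km, but only near the
# equator, because a degree of longitude shrinks towards the poles)
BLEED_IN_DEGREES = 0.045

# after: buffer an area by 5,000 metres, which means the same thing
# everywhere on Earth
BLEED_IN_METRES = 5_000
```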
We think that the API is returning incorrect data for this column.
It’s going to take a while to figure out what’s going on with the
queries in the API, so this pull request temporarily removes the column
so we’re not giving people incorrect data.
We are starting to see lots of 100.0% values in the current table, and
we think this looks suspiciously good, so it is beneficial to change
the display to 2dp so that we get a few more non-100.0% values.
This will put all the values at 2dp. However, it will also require a
change to the API at
https://github.com/alphagov/notifications-api/blob/master/app/performance_dashboard/rest.py#L81
which currently loses granularity by rounding down to a single decimal
place (meaning that if we were really at 99.894% then that would be
shown on the page as 99.90% rather than 99.89%). However, I don't think
getting that sorted is a blocker for merging this.
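As a worked example of why the API change matters:

```
value = 99.894

# today: the API rounds to 1dp before the admin app formats to 2dp
print(f"{round(value, 1):.2f}%")  # "99.90%" - the granularity is gone

# with the API change: format the unrounded value to 2dp
print(f"{value:.2f}%")            # "99.89%"
```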
This means that the data in the report will match what’s on the page,
where the values are rounded to the nearest penny.
This uses the same string formatting that the `big_number` component
uses to round the numbers, so it should round them in the same way.
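A minimal sketch, assuming `big_number` formats currency with a
standard 2dp format string (worth checking against the component):

```
def format_pounds(value):
    # the same rounding the page applies, so report and page stay in sync
    return f"{value:,.2f}"


format_pounds(1234.567)  # "1,234.57"
```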
This is so org level users can use this data more easily for things
like determining spending per service.
We do not include an sms fragments sent column, and remove other sms
columns for consistency.
Do not add an sms fragments sent column for now, until we agree on an
unambiguous name for it. The data in this column is sms billing units
multiplied by international sms weighting. My favourite for a clear
name would be 'text message credits used', but we need a naming
strategy for this.
This makes it clearer that this model collection isn’t the organisations
for a user or a service or some other entity, like most model
collections are.
It will also let us make a separate Organisations model, without the
name conflicting.