Usage for all services is a platform admin report that groups letters by
postage. We want it to show `europe` and `rest-of-world` letters under a
single category of `international`, so this updates the query to do
that and to order appropriately.
task takes a brodcast_message_id, and makes a post to the cbc-proxy
for now, hardcode the url to the notify stub. the stub requires template
as the admin/api get it, so use the marshmallow schema to json dump it.
Note - this also required us to tweak the BroadcastMessage.serialize
function so that it converts uuids in to ids - flask's jsonify function
does that for free but requests.post doesn't sadly.
if the request fails (either 4xx or 5xx) just raise an exception and let
it bubble up for now - in the future we'll add retry logic
new blueprint `/service/<id>/broadcast-message` with the following
endpoints:
* GET / - get all broadcast messages for a service
* GET /<id> - get a single broadcast message
* POST / - create a new broadcast message
* POST /<id> - update an existing broadcast message's data
* POST /<id>/status - move a broadcast message to a new status
I've kept the regular data update (eg personalisation, start and end
times) separate from the status update, just to keep separation of
concerns a bit more rigid, especially around who can update.
I've included schemas for the three POSTs, they're pretty
straightforward.
This done so that we do not use statsd on our http endpoint.
We decided we do not need metrics that this gave us. If we
change our minds, we will add Prometheus-friendly decorators
instead in the future.
had to go through the code and change a few places where we filter on
template types. i specifically didn't worry about jobs or notifications.
Also, add braodcast_data - a json column that might contain arbitrary
broadcast data that we'll figure out as we go. We don't know what it'll
look like, but it should be returned by the API
This commit turns off StatsD metrics for the following
- the `dao_create_notification` function
- the `user-agent` metric
- the response times and response codes per flask endpoint
This has been done with the purpose of not having the creation of text
messages or emails call out to StatsD during the request process. These
are the three current metrics that are currently called during the
processing of one of those requests and so have been removed from the
API.
The reason for removing the calls out to StatsD when processing a
request to send a notification is that we have seen two incidents
recently affected by DNS resolution for StatsD (either by a slow down in
resolution time or a failure to resolve). These POST requests are our
most critical code path and we don't want them to be affected by any
potential unforeseen trouble with StatsD DNS resolution.
We are not going to miss the removal of these metrics.
- the `dao_create_notification` metric is rarely/never looked at
- the `user-agent` metric is rarely/never looked at and can be got from
our logs if we want it
- the response times and response codes per flask endpoint are already
exposed using the gds metrics python library
I did not remove the statsd metrics from any other parts of the API
because
- As the POST notification endpoints are the main source of web traffic,
we should have already removed most calls to StatsD which should
greatly reduce the chance of their being any further issues with
DNS resolution
- Some of the other metrics still provide value so no point deleting
them if we don't need to
- The metrics on celery tasks will not affect any HTTP requests from
users as they are async and also we do not currently have the
infrastructure set up to replace them with a prometheus HTTP endpoint that
can be scraped so this would require more work
Years ago we started to implement a way to schedule a notification. We hit a problem but we never came up with a good solution and the feature never made it back to the top of the priority list.
This PR removes the code for scheduled_for. There will be another PR to drop the scheduled_notifications table and remove the schedule_notifications service permission
Unfortunately, I don't think we can remove the `scheduled_for` attribute from the notification.serialized method because out clients might fail if something is missing. For now I have left it in but defaulted the value to None.
This copies what we do to a user's email address when archiving the user
by prefixing it with `_archived_{date}`. We already prefixed the
service name and email_reply_to with `_archived`, but this didn't allow
a service with the same name to be archived more than once.
When running the purge command I found about 4 users who could not be
deleted because their user id was still referenced in the services table
as they had created the service yet they were not a member of that
service anymore.
I have fixed this by checking that if they are not a member but created
the service then we also delete the service for them.
Note, I've followed the previous convention of no tests for this
function. I've run it locally and executed the code path so there should
be no major flaws in the code. There is a small chance I wasn't able to
exactly replicate the state that existed in preview on my local but
hopefully it was close enough to be accurate.
Rather than showing all jobs that have been ‘copied’ from a contact list
I think it makes more sense to group them under the contact list. This
way it’s easier to see what messages have been sent to a given group of
contacts over time.
Part of this work means the API needs to return only jobs that have been
created from a given contact list, when asked.
We want to add another argument here, and doing so would make the line
length too long with all the arguments on one line.
Also uses the * operator to enforce keyword-only arguments.
Because we won’t be showing uploaded letters individually on the uploads
page any more we need a way of listing them. This should be by printing
day, to match how we’re grouping them on the uploads page.
The response matches a normal `get_notifications` response so we can
reuse the same code in the admin app.
Some teams have started uploading quite a lot of letters (in the
hundreds per week). They’re also uploading CSVs of emails. This means
the uploads page ends up quite jumbled.
This is because:
- there’s just a lot of items to scan through
- conceptually it’s a bit odd to have batches of things displayed
alongside individual things on the same page
So instead this commit starts grouping together uploaded letters. It
does this by the date on which we ‘start’ printing them, or in other
words the time at which they can no longer be cancelled.
This feels like a natural grouping, and it matches what we know about
people’s mental models of ‘batches’ and ‘runs’ when talking about
printing.
The code for this is a bit gnarly because:
- timezones
- the print cutoff doesn’t align with the end of a day
- we have to do this in SQL because it wouldn’t be efficient to query
thousands of letters and then do the timezone calculations on them in
Python
The standard way that we indicate that there are more results than can
be returned is by paginating. So even though we don’t intend to paginate
the search results in the admin app, it can still use the presence or
absence of a ‘next’ link to determine whether or not to show a message
about only showing the first 50 results.
The '/service/monthly-data-by-service` endpoint which is used for the
'Monthly notification statuses for live services' Platform Admin report
did not including `pending` notifications. This updates the DAO function
that the endpoint calls to group `pending` and `sending` notifications together.
We were doing this temporarily while the `normalised_to` field was not
populated for letters. Once we have a week of data in the
`normalised_to` field we can stop looking in the `to` field. This should
improve performance because:
- it’s one `WHERE` clause not two
- we had to do `ilike` on the `to` field, because we don’t lowercase its
contents – even if the two where clauses are somehow paralleized it’s
is slower than a `like` comparison, which is case-sensitive
Depends on:
- [ ] Tuesday 5 May (7 full days after deploying https://github.com/alphagov/notifications-api/pull/2814)
By not having a catch-all else, it makes it clearer what we’re
expecting. And then we think it’s worth adding a comment explaining why
we normalise as we do for letters and the `None` case.
Like we have search by email address or phone number, finding an
individual letter is a common task. At the moment users are having to
click through pages and pages of letters to find the one they’re looking
for.
We have to search in the `to` and `normalised_to` fields for now because
we’re not populating the `normalised_to` column for letters at the
moment.
"if you want something done right, then do it yourself"
The ORM was building a weird and inefficient query to get all the users
for a given organisation_id:
old query:
```sql
SELECT users.*
FROM users
WHERE (
EXISTS (
SELECT 1
FROM user_to_organisation, organisation
WHERE users.id = user_to_organisation.user_id AND
organisation.id = user_to_organisation.organisation_id AND
organisation.id = '63b9557c-22ea-42ac-bcba-edaa50e3ae51'
)
) AND
users.state = 'active'
ORDER BY users.created_at
```
Note the cross join on users_to_org and org, there are a lot of users in
organisations on preview, as one is added every functional test run.
Even though they're all filtered out and the query returns the correct
data, it still does the nested loop under the hood.
new query:
```sql
SELECT users.*
FROM users
JOIN user_to_organisation AS user_to_organisation_1 ON users.id = user_to_organisation_1.user_id
JOIN organisation ON organisation.id = user_to_organisation_1.organisation_id
WHERE organisation.id = '63b9557c-22ea-42ac-bcba-edaa50e3ae51' AND users.state = 'active' ORDER BY users.created_at;
```
Much more straightforward.
* it doesn't delete service email reply to or letter contacts, or
contact lists. I don't think the contact lists will ever be an issue
but it doesn't hurt to add it to the list of things to remove.
* it doesn't remove users from organisations before deleting the users
there may be more tables that link to Service that should be deleted,
but for now just add these ones that I could spot.
So we keep a record of who first uploaded a list it’s better to archive
a list than completely delete it.
The list in the database doesn’t contain any recipient info so this
isn’t a change to what data we’re retaining.
This means updating the endpoints that get contact lists to exclude ones
that are archived.
This was one of things we de-scoped when we first shipped this feature.
In order to safely delete a list, we first need to make sure any jobs
aren’t referencing it.
- fix test name
- make query filter consistent with each other
- add comment for clarity
- add inner loop to continue to insert and delete notifications while the delete count is greater than 0