Commit Graph

4168 Commits

Author SHA1 Message Date
Leo Hemsted
e497cbbec6 Merge pull request #3474 from alphagov/fix-returned-letters
split returned letters tasks into a max count of returned letters
2022-03-02 11:49:12 +00:00
Leo Hemsted
b1636b7a1a split returned letters tasks into a max count of returned letters
if we have too many returned letters, we'll exceed SQS's max task size
of 256kb. Cap it to 5000 - this is probably a bit conservative but
follows the initial values we used when implementing this for the
collate-letters-task[^1]. Also follow the pattern of compressing the
sqs payload just to reduce it a little more.

[^1]: https://github.com/alphagov/notifications-api/pull/1536
2022-03-02 10:51:08 +00:00
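The batching described in commit b1636b7a1a can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the function name `chunk_references` and the constant `MAX_REFERENCES_PER_TASK` are hypothetical, and the payload compression mentioned in the message is not shown.

```python
from typing import Iterator, List, Sequence

# Hypothetical constant mirroring the cap of 5000 from the commit message.
MAX_REFERENCES_PER_TASK = 5000


def chunk_references(
    references: Sequence[str], size: int = MAX_REFERENCES_PER_TASK
) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` references."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(references), size):
        yield list(references[start:start + size])


if __name__ == "__main__":
    refs = [f"ref-{i}" for i in range(12001)]
    # Each batch would become the payload of one task.
    print([len(batch) for batch in chunk_references(refs)])
```

Each batch stays under the cap, so no single task payload grows with the total number of returned letters.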
Katie Smith
67d1b3719e Stop blank strings being inserted as inbound numbers
We had an inbound number in the database with a value of ''. This
could happen if there are blank lines in the inbound numbers file
we use for the `insert-inbound-numbers` command. To avoid this
happening again, the command now calls `.strip()` on each line of the
file and only inserts a row if the result is truthy (i.e. not '').
2022-03-01 15:31:54 +00:00
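The filtering described in commit 67d1b3719e amounts to something like the sketch below. The function name `parse_inbound_numbers` is hypothetical; only the `.strip()` call and the truthiness check come from the commit message.

```python
from typing import Iterable, List


def parse_inbound_numbers(lines: Iterable[str]) -> List[str]:
    """Return stripped, non-empty lines; blank lines are skipped."""
    numbers = []
    for line in lines:
        number = line.strip()
        if number:  # '' is falsy, so blank lines never become rows
            numbers.append(number)
    return numbers
```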
Katie Smith
5ef9198869 Merge pull request #3469 from alphagov/free-allowance-0
Allow SMS fragment limit to be 0
2022-02-28 14:10:44 +00:00
Katie Smith
bff4e3a709 Fix test which was flakey due to order of items returned from db
The test was querying `FactNotificationStatus` and ordering the results
by bst_date and notification_type then checking the rows. However, the
bst_date and notification_type for each row is the same, so this test
could fail based on the order that the results came back in. By ordering
on the notification_status instead, we can be sure of the order of the
results.
2022-02-28 13:03:55 +00:00
Katie Smith
c9dafa0372 Remove unused test code
Notifications don't have a `scheduled_for` attribute, and the test
function deleted isn't used anywhere.
2022-02-28 12:45:25 +00:00
Chris Hill-Scott
9c2f0ce9db Clear cache when cancelling broadcast via the API
Before we implemented ‘cancel’ any updates to a broadcast went through
the admin app. This meant the admin app could deal with clearing the
cache any time a broadcast was updated by a user performing an action.

Now that a broadcast can be updated without the admin app being involved
we have another place we need to clear the cache from.

If we don’t do this then the broadcast can look like it’s still going
even though it’s successfully been cancelled.
2022-02-22 16:26:05 +00:00
Chris Hill-Scott
4b4122a773 Merge pull request #3461 from alphagov/be-more-robust-around-references-to-cancel
Be more robust in handling ambiguous references to cancel an alert
2022-02-21 10:39:48 +00:00
Chris Hill-Scott
cc207ac11f Raise error if multiple broadcasts found for reference
Because the `<reference>` field of a `cancel` message can contain an
arbitrary number of items it’s possible for it to reference more than
one current alert.

In this case it is ambiguous which alert should be cancelled, so we
should raise a custom error.

This will help people know that they have to manually go into Notify and
figure out which alert(s) to cancel there.
2022-02-17 15:23:13 +00:00
Chris Hill-Scott
f691bc2a92 Only lookup broadcasts which can be cancelled
It is possible that, among the references Environment Agency give us for
which broadcast to cancel, there could be references for older, already
expired broadcasts.

This would be the case if someone cancelled a broadcast in Notify, then
issued and tried to cancel another broadcast to the same area. The
Flood Warning Service has no way of knowing that the first broadcast has
been cancelled in Notify already, so it would add the reference to the
list of things to be cancelled.

We can prevent this by filtering out already-cancelled and
expired broadcasts before looking up which one should be cancelled.
2022-02-17 15:23:13 +00:00
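Taken together, commits cc207ac11f and f691bc2a92 describe a lookup that first discards broadcasts which can no longer be cancelled and then refuses to guess when more than one remains. A rough sketch under stated assumptions: the `Broadcast` class, the status names in `CANCELLABLE_STATUSES`, and `AmbiguousReferenceError` are all hypothetical stand-ins, not the names used in the repository.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

# Hypothetical status names; the real model may use different values.
CANCELLABLE_STATUSES = {"pending-approval", "broadcasting"}


class AmbiguousReferenceError(Exception):
    """Raised when references match more than one cancellable broadcast."""


@dataclass
class Broadcast:
    reference: str
    status: str


def find_broadcast_to_cancel(
    broadcasts: List[Broadcast], references: Set[str]
) -> Optional[Broadcast]:
    # Ignore broadcasts that are already cancelled or expired.
    matches = [
        b for b in broadcasts
        if b.reference in references and b.status in CANCELLABLE_STATUSES
    ]
    if len(matches) > 1:
        raise AmbiguousReferenceError(
            f"{len(matches)} broadcasts match; cancel manually in Notify"
        )
    return matches[0] if matches else None
```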
Ben Thorner
8fac5c72db Merge pull request #3454 from alphagov/upsert-status-180693991
Rewrite status aggregation to be a bulk upsert
2022-02-17 13:21:50 +00:00
Chris Hill-Scott
d73131bbec Allow cancel of alert via API with no description
The XML for an alert requires a `<description>` field. The XML for
a `<cancel>` may have a `<description>` field populated (although we
ignore the contents) but it may also be empty.

This commit updates the schema to leave all the validation to the
view layer, which can decide when or when not to validate the content of
the `<description>` field.
2022-02-16 15:31:50 +00:00
Chris Hill-Scott
288f5d11cd Test that alert with no references is OK
References are optional, and we fixed errors when they are not provided
in bbc444699a

This commit amends the tests so that they cover the `Alert` message type
as well as the `Cancel` (which is already covered).
2022-02-16 15:31:47 +00:00
Ben Thorner
574b1d3c63 Fix not deleting dead rows in status table
To address: https://github.com/alphagov/notifications-api/pull/3454#pullrequestreview-880302729

Since we now delete all rows before inserting fresh ones, we no
longer need to worry about conflicts. I've also extended the old
test to check all three kinds of overwrite: new, changed, gone.
2022-02-16 13:40:08 +00:00
Ben Thorner
a69d1635a1 Update FactStatus table in bulk for each service
Previously we were looping over data from the Notifications/History
table and then shovelling it into the status table, one row at a time
- plus an extra delete to clean up any existing data.

This replaces that with a batch insertion, similar to how we archive
notifications [1], but using a simple subquery (via "from_select" [2])
instead of a temporary table.

To make the select compatible with the insert, I've used "literal"
to inject the constant pieces of data, so each row has everything it
needs to go into the status table.

[1]: 9ce6d2fe92/app/dao/notifications_dao.py (L295)
[2]: https://docs.sqlalchemy.org/en/14/core/dml.html#sqlalchemy.sql.expression.Insert.from_select
2022-02-16 13:40:05 +00:00
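The idea in commits a69d1635a1 and 574b1d3c63 (delete the old rows, then insert the aggregated rows in one statement, with the constant values injected into the select) can be illustrated with plain SQL. This sketch uses the standard library's `sqlite3` and bound parameters rather than SQLAlchemy's `from_select` and `literal`, and the table and column names are simplified, hypothetical stand-ins.

```python
import sqlite3


def make_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical, heavily simplified tables for illustration only.
    conn.executescript(
        """
        CREATE TABLE notifications (
            service_id TEXT, notification_type TEXT,
            status TEXT, created_day TEXT
        );
        CREATE TABLE status_facts (
            day TEXT, service_id TEXT, notification_type TEXT,
            status TEXT, notification_count INTEGER
        );
        """
    )


def aggregate_status(
    conn: sqlite3.Connection, day: str, service_id: str, notification_type: str
) -> None:
    key = (day, service_id, notification_type)
    with conn:  # one transaction: delete stale rows, then bulk insert
        conn.execute(
            "DELETE FROM status_facts "
            "WHERE day = ? AND service_id = ? AND notification_type = ?",
            key,
        )
        # The first three parameters are constants injected into every
        # selected row, so each row is complete when it is inserted.
        conn.execute(
            """
            INSERT INTO status_facts
                (day, service_id, notification_type, status, notification_count)
            SELECT ?, ?, ?, status, COUNT(*)
            FROM notifications
            WHERE created_day = ? AND service_id = ? AND notification_type = ?
            GROUP BY status
            """,
            key + key,
        )
```

Because the delete runs first, a status that no longer appears in the source data also disappears from the aggregate, which corresponds to the "gone" case mentioned in 574b1d3c63.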
Chris Hill-Scott
bbc444699a Return 400 if references missing from cancel broadcast
If someone tries to cancel a broadcast but the references don’t match
an existing broadcast we correctly return a 404.

If they don’t provide any references then we get an exception. This
commit catches the missing references and returns a 400. I think this
is more appropriate because it’s a malformed request, rather than a
well-formed request that doesn’t match our data. It also lets us write a
more specific and helpful error message.
2022-02-14 12:34:09 +00:00
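The distinction drawn in commit bbc444699a between a malformed request and a well-formed request that matches nothing can be sketched as below. The function name is hypothetical, and the success code of 200 is an assumption made only so the sketch is complete.

```python
from typing import Iterable, Set


def cancel_status_code(references: Iterable[str], known_references: Set[str]) -> int:
    """Pick an HTTP status code for a cancel request."""
    refs = [r for r in references if r]
    if not refs:
        return 400  # malformed: no references were provided
    if not any(r in known_references for r in refs):
        return 404  # well-formed, but matches no existing broadcast
    return 200  # assumed success code, for illustration only
```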
Ben Thorner
5a87d8c7d7 Merge pull request #3437 from alphagov/retry-parallel-status-180693991
Attempt 2 of parallelising status aggregation more
2022-02-14 10:50:26 +00:00
Pea Tyczynska
3dc1907321 Audit api key id when cancelling broadcast via api
2022-02-11 12:01:56 +00:00
Ben Thorner
966c4db8c6 Fix getting service IDs for status aggregation
Addresses [1].

Previously the query would always use UTC midnight, even after we
had switched to BST (+1h). We store timestamps as naive UTC in our
DB - without a timezone - but we want the query to work in terms
of GMT / BST so we adjust for that - BST midnight is 11PM in UTC.

[1]: https://github.com/alphagov/notifications-api/pull/3437#discussion_r791998690
2022-02-10 10:51:45 +00:00
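The timezone adjustment described in commit 966c4db8c6 can be illustrated with the standard library. This is a sketch, assuming the `Europe/London` zone and a hypothetical helper name; it is not the helper the repository uses.

```python
from datetime import date, datetime, time, timezone
from zoneinfo import ZoneInfo

LONDON = ZoneInfo("Europe/London")


def local_midnight_in_naive_utc(day: date) -> datetime:
    """Return London midnight for `day` as a naive UTC datetime."""
    local_midnight = datetime.combine(day, time.min, tzinfo=LONDON)
    # Convert to UTC, then drop the tzinfo to match naive UTC storage.
    return local_midnight.astimezone(timezone.utc).replace(tzinfo=None)


if __name__ == "__main__":
    print(local_midnight_in_naive_utc(date(2022, 1, 15)))  # GMT: no offset
    print(local_midnight_in_naive_utc(date(2022, 6, 1)))   # BST: 11PM the day before
```

Querying from one such boundary to the next gives a midnight-to-midnight window in local time even though the stored timestamps are naive UTC.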
Ben Thorner
6e8f121548 Standardise how we query midnight-to-midnight
Partially addresses [1] (lots more detail to read in the comment).
I've also added some tests for the status DAO function to confirm
it behaves as expected across timezones.

[1]: https://github.com/alphagov/notifications-api/pull/3437#discussion_r802634913
2022-02-10 10:51:27 +00:00
Ben Thorner
1213463b8e Only aggregate status when necessary for a service
This takes a similar approach to the nightly deletion task so that
we only create sub-tasks when there are actually notifications to
aggregate for a given type and day [1].

We're making this change to stop the duplication errors we're getting
at the moment and ensure the task can scale to more messages and more
services. There are two parts to this:

- Each subtask should now run within the 5 minute visibility timeout.
However, they may still be duplicated if the parent task overruns [2].

- The parent task creates a minimal number of subtasks, and the query
to determine this is very fast for a normal process day (milliseconds).

Since all tasks will run quickly, there should be no more duplication.

In order to test this more nuanced task, I rewrote the tests:

- One test checks the subtask is called correctly.
- One test checks we create all the right subtasks.

[1]: https://github.com/alphagov/notifications-api/pull/3381
[2]: https://docs.google.com/document/d/1MaP6Nyy3nJKkuh_4lP1wuDm19X8LZITOLRd9n3Ax-xg/edit#heading=h.q3intzwqhfzl
2022-02-09 17:39:07 +00:00
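The parent task's job in commit 1213463b8e, creating subtasks only where there is something to aggregate, can be sketched as below. The tuple layout and the function name are hypothetical, and the real task works from a database query rather than an in-memory list.

```python
from datetime import date
from typing import Iterable, List, Tuple

# (service_id, notification_type, day the notification was created)
NotificationRow = Tuple[str, str, date]


def subtasks_for_day(
    notifications: Iterable[NotificationRow], process_day: date
) -> List[Tuple[str, str]]:
    """Return the distinct (service_id, notification_type) pairs that
    have at least one notification on `process_day`."""
    pairs = {
        (service_id, notification_type)
        for service_id, notification_type, day in notifications
        if day == process_day
    }
    return sorted(pairs)
```

Services or notification types with nothing to aggregate for the day produce no subtask at all.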
Ben Thorner
018a253b6f Revert "Revert running status aggregation in parallel"
This reverts commit 0f6dea0deb.
2022-02-09 17:39:00 +00:00
Pea Tyczynska
a780933893 Revert "Audit api key id when cancelling broadcast via api"
2022-02-09 11:01:39 +00:00
Pea Tyczynska
d05bff9efc Merge pull request #3440 from alphagov/audit-api-key-id-when-cancelling-broadcast-via-api
Audit api key id when cancelling broadcast via api
2022-02-09 10:15:03 +00:00
Chris Hill-Scott
0614a70764 Merge pull request #3445 from alphagov/bump-utils-53.0.0
Bump utils to 53.0.0
2022-02-08 09:56:18 +00:00
Chris Hill-Scott
7f72d3a60f Bump utils to 53.0.0
Changes:

53.0.0
---

* `notifications_utils.columns.Columns` has moved to
  `notifications_utils.insensitive_dict.InsensitiveDict`
* `notifications_utils.columns.Rows` has moved to
  `notifications_utils.recipients.Rows`
* `notifications_utils.columns.Cell` has moved to
  `notifications_utils.recipients.Cell`

52.0.0
---

* Deprecate the following unused `redis_client` functions:
  - `redis_client.increment_hash_value`
  - `redis_client.decrement_hash_value`
  - `redis_client.get_all_from_hash`
  - `redis_client.set_hash_and_expire`
  - `redis_client.expire`

51.3.1
---

* Bump govuk-bank-holidays to cache holidays for next year.
2022-02-08 09:45:10 +00:00
David McDonald
1d8fafcdf4 Remove unused functions
Can't see these being used anywhere so let's get rid of them
2022-02-07 15:58:04 +00:00
Chris Hill-Scott
07f584e1d5 Allow admin app to specify domain for password reset
This follows the pattern for invite emails where the admin app tells the
API which domain to use when generating the link.

This will start working once this admin change is merged:
- [ ] https://github.com/alphagov/notifications-admin/pull/4150/files

It won’t break anything if it’s merged before the admin change.
2022-02-02 17:15:09 +00:00
Pea Tyczynska
ac5967bc5a Merge pull request #3430 from alphagov/rename_billing_report_column_sms_fragments
Rename sms_fragments to sms_chargeable_units
2022-02-02 10:00:30 +00:00
Rebecca Law
09c8fbe982 Merge pull request #3418 from alphagov/letters-too-long
Mark letters as validation-failed if the templated letter is too long.
2022-02-02 08:30:50 +00:00
Chris Hill-Scott
df28f253fa Merge pull request #3427 from alphagov/remove-name-uniqueness-endpoints
Remove endpoints for checking name uniqueness
2022-01-27 13:17:41 +00:00
Pea Tyczynska
82f08f230c Save api key id when cancelling broadcast by API call
This is so that we can audit who cancelled the broadcast if
there are any issues.
2022-01-26 17:26:58 +00:00
Rebecca Law
c01c81326c Update log message to something a little easier to read and query for.
2022-01-24 12:25:53 +00:00
Pea Tyczynska
4a90cde701 Merge pull request #3429 from alphagov/cancel_alert_via_api
Cancel broadcast via API
2022-01-21 14:04:35 +00:00
Pea Tyczynska
b6dd189462 Test cancel request via API returns 404 if service id does not match
2022-01-20 18:28:10 +00:00
Pea Tyczynska
52dbdb7518 Move validate_and_update_broadcast_message_status to a utils file
This is because that function is used both when broadcast status
is updated via API and via admin, so it's a shared resource.

Also move and update tests for updating broadcast message status
so things are tested at source and repetition is avoided.
2022-01-20 18:14:41 +00:00
Pea Tyczynska
c9afb2f038 Remove unnecessary error handling
The context here should be enough for the users, custom error
message is not needed.
2022-01-20 18:14:40 +00:00
Ben Thorner
0f6dea0deb Revert running status aggregation in parallel
The top-level task didn't run successfully after this was deployed
because the worker was killed due to heavy disk usage. While the
more parallel version does log much more, it doesn't totally explain
the disk behaviour. Nonetheless, reverting it is sensible to give us
the time we need to investigate more.
2022-01-20 12:22:33 +00:00
Pea Tyczynska
a4c20e8ba6 Return 404 if reference from cancel message does not match
If the reference from cancel CAP XML we received via API does not
match with any existing broadcast, return 404.

Do the same if service id doesn't match.

Also refactor code to cancel broadcast out into separate function

It should be a separate function that is only called by create_broadcast
function. This will prevent create_broadcast from becoming too
big and complex and doing too many things.
2022-01-19 15:42:27 +00:00
Pea Tyczynska
3b4a9d8942 Cancel broadcast via API
When a service sends us an XML CAP broadcast message with Cancel
status, and that broadcast is in broadcasting state, we cancel it.
2022-01-19 15:42:26 +00:00
Pea Tyczynska
940126abfb Reject unapproved broadcast upon cancel API request
When a service sends us a cancel broadcast XML via API, if that
broadcast was not approved yet, reject it.
2022-01-19 15:41:38 +00:00
Ben Thorner
6be489daa7 Merge pull request #3425 from alphagov/parallelise-ft-status-180693991
Parallelise status aggregation by service and day
2022-01-19 13:49:28 +00:00
Rebecca Law
6cd7a23d3c An invalid letter may not yet have been updated to validation-failed when the collate-letter-pdfs-to-be-sent task starts, because the update-validation-failed-for-templated-letter task has not been picked up off the letter-tasks queue. If that happens:
1. The number of letters that we send to DVLA will not be correct (see 20ead82463/app/celery/letters_pdf_tasks.py (L136))
This may raise an alert with DVLA when they find we have sent them fewer letters than we have reported.
2. When we get the PDF from S3 we will get a file not found error (see 20ead82463/app/celery/letters_pdf_tasks.py (L244))
The error will not prevent the collate task from completing, but we will see an alert email for the exception, which will raise questions.

Although this situation is very unlikely, because we have a 15 minute window between the last letter deadline date and the time we kick off the collate task, we should still mitigate these issues. I updated the queries to only return letters with billable_units > 0; all valid letters should have at least 1 billable unit.
2022-01-19 08:31:19 +00:00
Rebecca Law
841a4fc22f Mark letters as validation-failed if the templated letter is too long.
It is possible that the personalisation for a templated letter can make the letter exceed 10 pages or 5 sheets. We are not validating the letters posted via the API for this validation error. It is only possible to validate the letter once we create the PDF in notifications-template-preview. This means that the letter can only get a validation-failed status after the client has received a 201 from the POST to /v2/notifications.
NOTE: we only validate the preview row of a CSV for this validation error, so this change means that a letter can be marked as validation-failed after a successful file upload.

A new task to update the notification to `validation-failed` has been added to the API. If we find that the letter is too long once we have created the PDF we call the `update-validation-failed-for-templated-letter` task rather than `update-billable-units-for-letter` task.

New work flow for a letter in brief:
API - receives POST /v2/notifications
:: save to db
:: put CREATE_LETTERS_PDF task on queue for template preview to consume
TEMPLATE-PREVIEW - consumes task CREATE_LETTERS_PDF
:: create PDF
:: count pages of PDF
:: IF page count exceeds 10 pages
     put in the letters-invalid-pdf S3 bucket with metadata (similar to the precompiled letters)
     put `update-validation-failed-for-templated-letter` task on the queue for the API to consume
   ELSE
     put PDF in the `letters-pdf` bucket
     put `update-billable-units-for-letter` task on the queue
API - consumes `update-billable-units-for-letter` OR `update-validation-failed-for-templated-letter` task
:: IF `update-billable-units-for-letter` task:
     update billable units for notification as usual
:: ELSE `update-validation-failed-for-templated-letter`:
     update notification_status = `validation-failed`
ADMIN - view notification page for letter
:: show validation letter for templated letter

There will be 3 PRs in order to make this change, one for the API, template-preview and the admin app.

Deployment plan

Deploy Admin first
Deploy API
Deploy template-preview

Related PRs:
alphagov/notifications-template-preview#619
alphagov/notifications-admin#4107

https://www.pivotaltracker.com/story/show/169209742
2022-01-19 08:29:48 +00:00
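The template-preview branch of the workflow in commit 841a4fc22f reduces to a single decision on page count. A minimal sketch: the bucket and task names are taken from the commit message, while the function name and return shape are hypothetical.

```python
from typing import Tuple

# From the commit message: letters may not exceed 10 pages (5 sheets).
MAX_PAGES = 10


def route_letter_pdf(page_count: int) -> Tuple[str, str]:
    """Return (S3 bucket, follow-up task name) for a rendered letter."""
    if page_count > MAX_PAGES:
        return (
            "letters-invalid-pdf",
            "update-validation-failed-for-templated-letter",
        )
    return ("letters-pdf", "update-billable-units-for-letter")
```

A letter of exactly 10 pages is still valid under this reading; only an 11th page sends it down the validation-failed path.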
Pea Tyczynska
d94517d379 Rename sms_fragments to sms_chargeable_units
This field caused some confusion and lots of unnecessary work
for our colleague because of its unclear name.

The field was named sms_fragments, when in fact the value of
the field is: those sms fragments that go above the free allowance
multiplied by the rate multiplier.

The new name was chosen through consultation with colleagues
who use the billing report the most.
2022-01-18 18:03:16 +00:00
Chris Hill-Scott
a4347a5165 Remove endpoints for checking name uniqueness
The code which called these endpoints was removed:
- for services in https://github.com/alphagov/notifications-admin/pull/4084/files
- for organisations in https://github.com/alphagov/notifications-admin/pull/4128/files

Therefore these endpoints are no longer needed.
2022-01-13 17:12:40 +00:00
Ben Thorner
9182ebf4e5 Parallelise status aggregation by service and day
This follows a similar approach as [1]. Recently we've seen lots
of errors from this task, which we think are a consequence of it
doing too much work and tripping Celery's visibility timeout.

While we can optimise the query [2], it's likely the errors will
return as the number of live services grows. Parallelising the
aggregation now will make it more futureproof.

[1]: https://github.com/alphagov/notifications-api/pull/3397
[2]: https://github.com/alphagov/notifications-api/pull/3417
2022-01-12 15:47:59 +00:00
Ben Thorner
9fc8b904c6 DRY up status aggregation tests (move DAO tests up)
The previous DAO tests were also confusing because they were testing
two functions at the same time, so moving the tests up to the task
level seems very reasonable, and will make it easier to change how
this code works in the next commits.
2022-01-11 16:11:36 +00:00
Katie Smith
ed725c1513 Add endpoint to allow org team members to be removed
This is similar to the corresponding endpoint for services. However,
it is a little simpler since we don't need to worry about always having
at least one team member for an organisation.

The new dao function added, `dao_remove_user_from_organisation`, is also
simpler than `dao_remove_user_from_service` since we don't have any
organisation permissions to deal with.
2022-01-11 15:20:48 +00:00
Ben Thorner
081e0cab88 Merge pull request #3417 from alphagov/optimise-status-query-180693991
Optimise query to populate notification statuses
2022-01-11 14:18:36 +00:00