Commit Graph

807 Commits

Author SHA1 Message Date
Leo Hemsted
4b61060d32 stream notifications when collating zip files
we had issues where we had 150k 2nd class notifications, and the collate
task never ran properly, presumably because the volume of data being
returned was too big.

to try and help with this, we can switch to streaming rather than using
`.all` and building up lists of data. This should help, though the
initial query may be a problem still.
2020-10-23 12:20:26 +01:00
David McDonald
3dcb97c45a Remove 'INSOLVENCY' from zip files for insolvency letters
This is at request of DVLA. They would prefer to have zip files with the
same number of arguments in the name. After being offered a few
different options, such as including an org and service id for all zips,
they chose to just remove the 'INSOLVENCY' tag.

For more context see PR that added the tag
https://github.com/alphagov/notifications-api/pull/3006
2020-10-23 09:58:28 +01:00
David McDonald
890a66de46 Merge pull request #3003 from alphagov/staging-preview-cbc
Non-production environments invoke CBC Proxy during broadcast event creation
2020-10-22 16:21:11 +01:00
Pea Tyczynska
deb360846d Mark letters from insolvency service
So that DVLA knows to process them separately to avoid issues.
2020-10-21 16:56:18 +01:00
Pea Tyczynska
d745ba310e Divide letters by service when putting in ZIPs
When letters are sent to DVLA, we will now put them in a separate
ZIP file for each service, so that if there are printing issues
due to bad files from one service, other services will hopefully
not be affected by that.
2020-10-21 15:19:06 +01:00
Toby Lorne
2cc0a65851 celery: broadcast msg create invokes cbc proxy
When we create a broadcast message, we should invoke the cbc proxy to
send a cap message

Either a function will be invoked within AWS, or a noop function call
is made, depending on the environment

We have only implemented CB message creation in the CBC Proxy, without
polygons, therefore we:
* only invoke the CBC Proxy during message creation
* only send description, identifier, and hard-coded headline

Signed-off-by: Toby Lorne <toby.lornewelch-richards@digital.cabinet-office.gov.uk>
Co-authored-by: Pea <pea.tyczynska@digital.cabinet-office.gov.uk>
Co-authored-by: Katie <katie.smith@digital.cabinet-office.gov.uk>
2020-10-20 11:57:26 +01:00
Chris Hill-Scott
f2314333b5 Merge pull request #2982 from alphagov/improve-efficiency-of-process-missing-rows
Improve efficiency of process missing rows task
2020-10-02 11:23:35 +01:00
Leo Hemsted
59839d21b8 Merge pull request #2974 from alphagov/make-letter-folder-name-code-readable
Make letter folder name code readable
2020-09-28 13:58:08 +01:00
Rebecca Law
8974ec2550 Merge pull request #2985 from alphagov/reduce-logs
Reduce logs
2020-09-28 12:05:17 +01:00
Rebecca Law
d1f641633d Reduce logs
The log message is not necessary, let's get rid.
2020-09-28 11:53:54 +01:00
Chris Hill-Scott
1d50bfaafc Remove unused column from query 2020-09-26 12:11:15 +01:00
Chris Hill-Scott
9ad33045dd Improve efficiency of process missing rows task
For every missing row this was:
- downloading the CSV file from S3
- looping through every row in it until it found the one matching the
  index of the missing row

`RecipientCSV` implements `__getitem__`[1] (which maybe it didn’t
before) so we can create it once, then index the relevant row directly.

***

1. 5ae0572d41/notifications_utils/recipients.py (L78-L79)
2020-09-26 12:06:48 +01:00
Rebecca Law
d7e53cdf50 Adding reference to message for "precompiled letters have been pending-virus-check for over 90 minutes"
This saves having to go to the db to get it.
2020-09-24 14:16:16 +01:00
Leo Hemsted
1e928a926a rename sending_date to created_at
we don't name letters based on the day we send them on, rather, the day
we create them on. If we process a letter for a second time for whatever
reason, even if it's a couple of days later, it'll still go in a folder
based on the created_at timestamp. There's still a slight confusion,
however - if the timestamp is after 5:30pm, the folder will be for the
day after. However, still the day after creation, so I think created_at
still makes the most sense.

Remove the term `sending_date` to try and make this relationship more
apparent.
2020-09-21 14:40:22 +01:00
Leo Hemsted
bb33927b3d rename letter get_folder_name args
`_now`? why would we ever use a different _now? instead say created_at,
because that's what it'll always be set to, even if we're replaying old
letters. We always set the folder name to when the letter was
created_at, or we might not know where to look to find it.

`dont_use_sending_date` doesn't really tell us what might happen if we
don't use it - the answer is we return an empty string. we ignore the
folder entirely. so lets call it that.

Also, remove use of freeze_gun in the tests, to prove that we don't use
the current time in any calculations. Also add an assert to a mock in
the get_pdf_for_templated_letter test, because we were mocking but not
asserting before, so the tests didn't fail when the function signature
changed.
2020-09-21 14:32:57 +01:00
Katie Smith
aee7887c14 Fix the filenames for international precompiled letters
We were determing the filename for precompiled letters before we had
checked if the letters were international. This meant that a letter
could have a filename indicating it was 2nd class, but once we had
sanitised the letter and checked the address we were setting the
notification to international.

This stopped these letters from being picked up to be sent to the DVLA,
since the filename and postage of the letter did not match.

We now regenerate the filename after the letter has been sanitised (and when
we know the postage) and use the updated filename when moving the letter
into the live PDF letters bucket.
2020-09-15 16:17:33 +01:00
David McDonald
5b2dee5ddb Bump utils to 42.0.0
Requires unit test updating as we now expect broadcast event areas to
be a dict containing a list of areas and simple polygons.
2020-09-14 15:21:55 +01:00
David McDonald
d352c99142 Remove unused send_broadcast_message task
We only call send_broadcast_event now
2020-09-14 15:16:59 +01:00
Leo Hemsted
bdf2253298 send broadcast events rather than messages
use the new endpoint from cbc proxy. create a new task that just
serializes the event and sends it across rather than sending a template
and the broadcast message.

some changes to serialize to make it json friendly etc. it also expects
sent_at and transmitted_finishes_at to always be set (we set them in the
code but don't enforce it n the DB right now), as they're required by
utils template. not sure whether we'll update db constraints to be more
strict or utils template to be more permissive just yet, wait until we
find out more about the requirements of the CBCs we integrate with.
2020-08-14 17:41:44 +01:00
David McDonald
36614e5492 Log warning for SES send rate throttling rather than exception
We have hit throttling limits from SES approximately once a week during
a spike of traffic from GOV.UK. The rate limiting usually only lasts a
couple of minutes but generates enough exceptions to cause a p1 but with
no potential action for the responder.

Therefore we downgrade the warning for this case to a warning and assume
traffic will level back out such that the problem resolves itself.

Note, we will still get exceptions if we go over our daily limit, rather
than our per minute sending limit, which does require immediate action
by someone responding.

If we were to continually go over our per second sending rate for a long
continous period of time, then there is a chance we may not be aware but
given the risk of this happening is low I think it's an acceptable risk
for the moment.
2020-08-13 17:51:09 +01:00
Rebecca Law
89a8d8912a Set postage and international for letters uploaded with a CSV
If the letter is outside of the United Kingdom, set the postage and international flag.
2020-08-10 09:58:49 +01:00
Rebecca Law
4a9f9e4b17 Remove the template_postage parameter for persist_notification
It was confusing to have 2 differnt postage parameters.
2020-08-06 07:35:13 +01:00
Rebecca Law
10fe7d9fe8 Add postage for send-one-off letters.
The postage is set to europe or rest-of-world for international letters, otherwise the template postage is used.

Also set international for letters.
2020-08-03 14:01:59 +01:00
Rebecca Law
dd126df122 We have a scheduled task to check that all the jobs have completed, this will catch if an app is shut down and the job is complete yet, we only wait 10 seconds before forcing the app to shut down.
The task was raising a JobIncompleteError, yet it's not an error the task is performing it's task correctly and calling the appropriate task to restart the job.
Also used apply_sync to create the task instead of send_task.
2020-07-22 17:00:20 +01:00
David McDonald
ef551247b5 Add better logging to starting collate-letter-pdfs-to-be-sent
This will help us better understand how far through the task has got if
it gets interrupted halfway (as was the case this morning and we
struggled to understand).
2020-07-10 11:32:03 +01:00
Leo Hemsted
8b4a954f2b move schema import in to function level
solves `AttributeError: 'DummySession' object has no attribute 'query'`

if you don't do this you get really hard to diagnose errors in unrelated
tests, due to strange import order problems or something
2020-07-09 20:10:34 +01:00
Leo Hemsted
eca37d0853 add send_broadcast_message task
task takes a brodcast_message_id, and makes a post to the cbc-proxy
for now, hardcode the url to the notify stub. the stub requires template
as the admin/api get it, so use the marshmallow schema to json dump it.
Note - this also required us to tweak the BroadcastMessage.serialize
function so that it converts uuids in to ids - flask's jsonify function
does that for free but requests.post doesn't sadly.

if the request fails (either 4xx or 5xx) just raise an exception and let
it bubble up for now - in the future we'll add retry logic
2020-07-09 18:23:30 +01:00
Pea M. Tyczynska
9186083ea7 Merge pull request #2796 from alphagov/split-letters-into-zips-based-on-postage
Split letters into zips based on postage
2020-07-08 11:49:21 +01:00
Pea Tyczynska
61de908c5d Simplify putting letters in right postage folders 2020-07-02 16:26:44 +01:00
Leo Hemsted
6318cd2a84 remove whitespace from log line
multi line strings don't handle indentation
2020-06-25 14:25:04 +01:00
Pea Tyczynska
f3c7c098d6 Include postage in zip folder filenames when sending letters
Also split letters into zips in 4 categories based on their postage.

Also have a separate count for zipfiles in each postage category.
2020-06-24 14:59:10 +01:00
Pea Tyczynska
9db6fc83f9 Split letters into zip files based on postage class
We will split them into three categories:
- first class
- second class
- international - this zip file will have letters for both europe
and rest-of-world postage classes
2020-06-22 19:40:37 +01:00
Leo Hemsted
c4dc0f64c5 dd sqlalchemy connection metrics for celery tasks
grab the worker app name and task name rather than the web host and
endpoint. also add a fallback for if we're not in a web request or a
celery task. I think that'll probably happen when we use alembic, or if
we do things from within flask shell
2020-06-12 14:52:22 +01:00
Leo Hemsted
6e32ca5996 add prometheus metrics for connection pools (both web and sql)
add the following prometheus metrics to keep track of general app
instance health.

 # sqs_apply_async_duration

how long does the actual SQS call (a standard web request to AWS) take.
a histogram with default bucket sizes, split up per task that was
created.

 # concurrent_web_request_count

how many web requests is this app currently serving. this is split up
per process, so we'd expect multiple responses per app instance

 # db_connection_total_connected

how many connections does this app (process) have open to the database.
They might be idle.

 # db_connection_total_checked_out

how many connections does this app (process) have open that are
currently in use by a web worker

 # db_connection_open_duration_seconds

a histogram per endpoint of how long the db connection was taken from
the pool for. won't have any data if a connection was never opened.
2020-06-12 14:52:22 +01:00
David McDonald
7bc02ac26e Add better logging to understand callback delivery cases
In particular, we want to know how often callbacks arrive before the
notification being persisted
2020-06-04 16:00:43 +01:00
Pea Tyczynska
b81c7dd5ee Put status codes in logs to see if lack of detailed status is
us not recognising a code or provider not having sent the detailed
status.

It seems like Firetext is sometimes sending us permanent-failure
without detailed status. It could be due to:
- them really not sending any detailed status
- them sending a status code we don't recognise
- them sending 000 code that means 'no errors', which we ignore

To see which one it is, and to debug such issues quicker in the
future, this PR adds status and detailed status codes to the logs.
2020-06-02 13:26:41 +01:00
Pea Tyczynska
c96142ba5e Change function and variable names for readability and consistency 2020-06-01 12:44:49 +01:00
Pea Tyczynska
a4b942cf6c Log detailed sms delivery status for mmg from process_sms_client_response task.
Also log detailed delivery status for firetext in the same place in addition
to it being logged from notifications_dao.

Logging detailed delivery statuses will help us see why messages
fail to deliver. In the future we could persist detailed delivery
status in the database.
2020-06-01 12:44:49 +01:00
Pea Tyczynska
5d6f2da155 Rename task from create_letters_pdf to get_pdf_for_templated_letter
In a separate PR we will have to delete vestigial create_letters_pdf
tasks that now only redirects to get_pdf_for_templated_letter.
2020-05-11 13:33:05 +01:00
Pea Tyczynska
879d15b736 Test logging and error message 2020-05-11 13:33:04 +01:00
Pea Tyczynska
3a00c19390 Polish and test the small task that updates billable units for letter 2020-05-11 13:32:09 +01:00
Pea Tyczynska
24a89c1c19 Modify tasks for getting letter pdf and updating billable units
So that they talk with new template preview task for pdf creation
2020-05-11 13:30:59 +01:00
Chris Hill-Scott
85fc601886 Tell template preview to allow international letters
If a service has permission to send international letters then it should
tell template preview, so that template preview knows what rule to
apply when it’s validating the address of the letter.

Depends on:
- [ ] https://github.com/alphagov/notifications-template-preview/pull/445
2020-05-01 14:37:24 +01:00
Chris Hill-Scott
e366ad29ae Bump utils to 38.0.0
Brings in breaking change to how the `RecipientCSV` class should be
instantiated.
2020-05-01 14:37:23 +01:00
David McDonald
7a9c756117 Add in lots of logging to our reporting tasks
We've seen only some of these reporting tasks happen but with no log
messages to indicate what happened and no app crashes. This hopefully
will give us a better picture of a timeline.

Note, I've tried to make our message format very consistent and good for
searching for in kibana so I've changed that across this whole file for
consistency.
2020-04-28 10:38:53 +01:00
Chris Hill-Scott
26793899d4 Merge pull request #2814 from alphagov/search-letters
Let users search for letters
2020-04-27 15:06:32 +01:00
Leo Hemsted
bd345d0390 Merge pull request #2818 from alphagov/restart-periodic-worker
restart reporting worker after each task
2020-04-27 11:03:07 +01:00
Leo Hemsted
ad419f7592 restart reporting worker after each task
The reporting worker tasks fetch large amounts of data from the db, do
some processing then store back in the database. As the reporting worker
only processes the create nightly billing/stats table tasks, which
aren't high performance or high volume, we're fine with the performance
hit from restarting the worker between every task (which based on
limited local testing takes about a second or so).

This causes some real funky shit with the app_context (used for
accessing current_app.logger). To access flask's global state we use the
standard way of importing `from flask import current_app`. However,
processes after the first one don't have the current_app available on
shut down (they're fine during the actual task running), and are unable
to call `with current_app.app_context()` to create it. They _are_ able
to call `with app.app_context()` to create it, where `app` is the
initial app that we pass in to `NotifyCelery.init_app`.

NotifyCelery.init_app is only called once, in the master process - I
think the application state is then stored and passed to the celery
workers. But then it looks like the teardown might clear it, but it
never gets set up again for the new workers? Unsure.

To fix this, store a copy of the initial flask app on the NotifyCelery
object and then use that from within the shutdown signal logging
function.

Nothing's ever easy ¯\_(ツ)_/¯
2020-04-24 12:28:25 +01:00
Chris Hill-Scott
11008af96a Remove outdated comment 2020-04-23 15:35:25 +01:00
Leo Hemsted
d59b2f4cc9 make create_letters_pdf always use right queue
create-letters-pdf-tasks is for contacting template preview.
letters-tasks is for doing other things around letters (including putting things on template preview queue)
2020-04-22 16:07:51 +01:00