Commit Graph

787 Commits

Author SHA1 Message Date
Rebecca Law
89a8d8912a Set postage and international for letters uploaded with a CSV
If the letter is outside of the United Kingdom, set the postage and international flag.
2020-08-10 09:58:49 +01:00
Rebecca Law
4a9f9e4b17 Remove the template_postage parameter for persist_notification
It was confusing to have 2 differnt postage parameters.
2020-08-06 07:35:13 +01:00
Rebecca Law
10fe7d9fe8 Add postage for send-one-off letters.
The postage is set to europe or rest-of-world for international letters, otherwise the template postage is used.

Also set international for letters.
2020-08-03 14:01:59 +01:00
Rebecca Law
dd126df122 We have a scheduled task to check that all the jobs have completed, this will catch if an app is shut down and the job is complete yet, we only wait 10 seconds before forcing the app to shut down.
The task was raising a JobIncompleteError, yet it's not an error the task is performing it's task correctly and calling the appropriate task to restart the job.
Also used apply_sync to create the task instead of send_task.
2020-07-22 17:00:20 +01:00
David McDonald
ef551247b5 Add better logging to starting collate-letter-pdfs-to-be-sent
This will help us better understand how far through the task has got if
it gets interrupted halfway (as was the case this morning and we
struggled to understand).
2020-07-10 11:32:03 +01:00
Leo Hemsted
8b4a954f2b move schema import in to function level
solves `AttributeError: 'DummySession' object has no attribute 'query'`

if you don't do this you get really hard to diagnose errors in unrelated
tests, due to strange import order problems or something
2020-07-09 20:10:34 +01:00
Leo Hemsted
eca37d0853 add send_broadcast_message task
task takes a brodcast_message_id, and makes a post to the cbc-proxy
for now, hardcode the url to the notify stub. the stub requires template
as the admin/api get it, so use the marshmallow schema to json dump it.
Note - this also required us to tweak the BroadcastMessage.serialize
function so that it converts uuids in to ids - flask's jsonify function
does that for free but requests.post doesn't sadly.

if the request fails (either 4xx or 5xx) just raise an exception and let
it bubble up for now - in the future we'll add retry logic
2020-07-09 18:23:30 +01:00
Pea M. Tyczynska
9186083ea7 Merge pull request #2796 from alphagov/split-letters-into-zips-based-on-postage
Split letters into zips based on postage
2020-07-08 11:49:21 +01:00
Pea Tyczynska
61de908c5d Simplify putting letters in right postage folders 2020-07-02 16:26:44 +01:00
Leo Hemsted
6318cd2a84 remove whitespace from log line
multi line strings don't handle indentation
2020-06-25 14:25:04 +01:00
Pea Tyczynska
f3c7c098d6 Include postage in zip folder filenames when sending letters
Also split letters into zips in 4 categories based on their postage.

Also have a separate count for zipfiles in each postage category.
2020-06-24 14:59:10 +01:00
Pea Tyczynska
9db6fc83f9 Split letters into zip files based on postage class
We will split them into three categories:
- first class
- second class
- international - this zip file will have letters for both europe
and rest-of-world postage classes
2020-06-22 19:40:37 +01:00
Leo Hemsted
c4dc0f64c5 dd sqlalchemy connection metrics for celery tasks
grab the worker app name and task name rather than the web host and
endpoint. also add a fallback for if we're not in a web request or a
celery task. I think that'll probably happen when we use alembic, or if
we do things from within flask shell
2020-06-12 14:52:22 +01:00
Leo Hemsted
6e32ca5996 add prometheus metrics for connection pools (both web and sql)
add the following prometheus metrics to keep track of general app
instance health.

 # sqs_apply_async_duration

how long does the actual SQS call (a standard web request to AWS) take.
a histogram with default bucket sizes, split up per task that was
created.

 # concurrent_web_request_count

how many web requests is this app currently serving. this is split up
per process, so we'd expect multiple responses per app instance

 # db_connection_total_connected

how many connections does this app (process) have open to the database.
They might be idle.

 # db_connection_total_checked_out

how many connections does this app (process) have open that are
currently in use by a web worker

 # db_connection_open_duration_seconds

a histogram per endpoint of how long the db connection was taken from
the pool for. won't have any data if a connection was never opened.
2020-06-12 14:52:22 +01:00
David McDonald
7bc02ac26e Add better logging to understand callback delivery cases
In particular, we want to know how often callbacks arrive before the
notification being persisted
2020-06-04 16:00:43 +01:00
Pea Tyczynska
b81c7dd5ee Put status codes in logs to see if lack of detailed status is
us not recognising a code or provider not having sent the detailed
status.

It seems like Firetext is sometimes sending us permanent-failure
without detailed status. It could be due to:
- them really not sending any detailed status
- them sending a status code we don't recognise
- them sending 000 code that means 'no errors', which we ignore

To see which one it is, and to debug such issues quicker in the
future, this PR adds status and detailed status codes to the logs.
2020-06-02 13:26:41 +01:00
Pea Tyczynska
c96142ba5e Change function and variable names for readability and consistency 2020-06-01 12:44:49 +01:00
Pea Tyczynska
a4b942cf6c Log detailed sms delivery status for mmg from process_sms_client_response task.
Also log detailed delivery status for firetext in the same place in addition
to it being logged from notifications_dao.

Logging detailed delivery statuses will help us see why messages
fail to deliver. In the future we could persist detailed delivery
status in the database.
2020-06-01 12:44:49 +01:00
Pea Tyczynska
5d6f2da155 Rename task from create_letters_pdf to get_pdf_for_templated_letter
In a separate PR we will have to delete vestigial create_letters_pdf
tasks that now only redirects to get_pdf_for_templated_letter.
2020-05-11 13:33:05 +01:00
Pea Tyczynska
879d15b736 Test logging and error message 2020-05-11 13:33:04 +01:00
Pea Tyczynska
3a00c19390 Polish and test the small task that updates billable units for letter 2020-05-11 13:32:09 +01:00
Pea Tyczynska
24a89c1c19 Modify tasks for getting letter pdf and updating billable units
So that they talk with new template preview task for pdf creation
2020-05-11 13:30:59 +01:00
Chris Hill-Scott
85fc601886 Tell template preview to allow international letters
If a service has permission to send international letters then it should
tell template preview, so that template preview knows what rule to
apply when it’s validating the address of the letter.

Depends on:
- [ ] https://github.com/alphagov/notifications-template-preview/pull/445
2020-05-01 14:37:24 +01:00
Chris Hill-Scott
e366ad29ae Bump utils to 38.0.0
Brings in breaking change to how the `RecipientCSV` class should be
instantiated.
2020-05-01 14:37:23 +01:00
David McDonald
7a9c756117 Add in lots of logging to our reporting tasks
We've seen only some of these reporting tasks happen but with no log
messages to indicate what happened and no app crashes. This hopefully
will give us a better picture of a timeline.

Note, I've tried to make our message format very consistent and good for
searching for in kibana so I've changed that across this whole file for
consistency.
2020-04-28 10:38:53 +01:00
Chris Hill-Scott
26793899d4 Merge pull request #2814 from alphagov/search-letters
Let users search for letters
2020-04-27 15:06:32 +01:00
Leo Hemsted
bd345d0390 Merge pull request #2818 from alphagov/restart-periodic-worker
restart reporting worker after each task
2020-04-27 11:03:07 +01:00
Leo Hemsted
ad419f7592 restart reporting worker after each task
The reporting worker tasks fetch large amounts of data from the db, do
some processing then store back in the database. As the reporting worker
only processes the create nightly billing/stats table tasks, which
aren't high performance or high volume, we're fine with the performance
hit from restarting the worker between every task (which based on
limited local testing takes about a second or so).

This causes some real funky shit with the app_context (used for
accessing current_app.logger). To access flask's global state we use the
standard way of importing `from flask import current_app`. However,
processes after the first one don't have the current_app available on
shut down (they're fine during the actual task running), and are unable
to call `with current_app.app_context()` to create it. They _are_ able
to call `with app.app_context()` to create it, where `app` is the
initial app that we pass in to `NotifyCelery.init_app`.

NotifyCelery.init_app is only called once, in the master process - I
think the application state is then stored and passed to the celery
workers. But then it looks like the teardown might clear it, but it
never gets set up again for the new workers? Unsure.

To fix this, store a copy of the initial flask app on the NotifyCelery
object and then use that from within the shutdown signal logging
function.

Nothing's ever easy ¯\_(ツ)_/¯
2020-04-24 12:28:25 +01:00
Chris Hill-Scott
11008af96a Remove outdated comment 2020-04-23 15:35:25 +01:00
Leo Hemsted
d59b2f4cc9 make create_letters_pdf always use right queue
create-letters-pdf-tasks is for contacting template preview.
letters-tasks is for doing other things around letters (including putting things on template preview queue)
2020-04-22 16:07:51 +01:00
Chris Hill-Scott
9047b273da Save whole letter address into the to field
At the moment we’re not consistent:

Precompiled (API and one-off):
`to` has the whole address
`normalised_to` has nothing

Templated (API, CSV and one off):
`to` has the first line of the address
`normalised_to` has nothing

This commit makes us consistently store the whole address in the `to`
field. We think that people might want to search by postcode, not just
first line of the address.

This commit also starts to populate the normalised_to field with the
address lowercased and with all spaces removed, to make it easier to
search on.
2020-04-22 10:06:25 +01:00
Pea M. Tyczynska
850a56ab04 Merge pull request #2803 from alphagov/firetext-response-codes
Use firetext response code to see if temporary or permanent failure if available
2020-04-21 10:50:46 +01:00
Pea Tyczynska
7a89b1b835 Fix bug where preview for templated letters would not show 2020-04-20 15:35:37 +01:00
Pea Tyczynska
a4507dbb3f Use firetext response code to see if temporary or permanent failure if available 2020-04-17 10:58:25 +01:00
Pea Tyczynska
91fe68eed4 WIP read firetext response codes 2020-04-17 10:58:25 +01:00
David McDonald
9b1b9ea75b Merge pull request #2798 from alphagov/delete-notifications-older-than-retention
Delete notifications older than retention
2020-04-15 12:14:07 +01:00
Chris Hill-Scott
36e61272c5 Save the first non-empty line as recipient
Since we now allow the address to be populated from any three lines, we
can’t guarantee that the recipient will be in the `addressline1` field.
2020-04-09 18:19:53 +01:00
Chris Hill-Scott
264fbed04e Refactor postcode validation to use utils
We don’t need to reformat the postcode here once template preview takes
care of it when rendering the PDF.

It’s better (and less code) to store what people give us, so we give
them back the same thing.
2020-04-09 18:19:53 +01:00
David McDonald
085e3a8435 Make deleting of notifications to be sequential
Based on this https://github.com/alphagov/notifications-api/pull/2788
where some concerns were raised. This should be a quicker fix to get our
the deletions to run sequentially for all the notification types. Note,
email is first as most important as makes up the larger numbers (we
wouldn't want it to start with SMS, fail half way for a reason that only
affects SMS and for that to affect the email deletion).

We hope that by running sequentially we will reduce conflicts writing to
the same index and this will speed up the total time it takes to finish
deleting all notification types older than their retention time. There
is a risk that whilst quicker per job, as they now run sequentially
rather than potentially overlapping, they will take longer overall. We
will need to monitor to see.
2020-04-07 17:03:17 +01:00
Chris Hill-Scott
8c8c8b6328 Look in all parts of a letter template to find placeholders
Text messages have placeholders in their body.

Emails have them in their subject line too.

Letters have them in their body, subject line and contact block.

We were only looking in the the body and subject when processing a job,
therefore the thing assembling the letter was not looking in all the
CSV columns it needed to, because it hadn’t been told about any
placeholders in the contact block.

Fixing this means always making sure we use the correct `Template`
instance for the type of template we’re dealing with. Which we were
already doing in a different part of the codebase. So it makes sense to
reuse that.

Turns out we fixed the same bug for email subjects over 3 years ago:
3ed97151ee
2020-04-07 10:41:16 +01:00
Katie Smith
e386d2ac38 Merge pull request #2792 from alphagov/delete-old-letter-task
Delete old 'process-virus-scan-passed-task'
2020-04-03 09:35:35 +01:00
Katie Smith
6ac89c9a2f Delete old 'process-virus-scan-passed-task'
This has been replaced by a new task, `sanitise-letter`, to this deletes
all the code in the old task and ensures that when antivirus is not
enabled locally we are calling the new task.
2020-04-02 14:52:15 +01:00
Katie Smith
62b11bc61e Delete delete_dvla_response_files_older_than_seven_days task
This was not being used.
2020-04-02 14:49:47 +01:00
Katie Smith
bfd40a843c Delete send-scheduled-notifications task
This was added a few years ago, but never used.
2020-04-02 14:45:18 +01:00
David McDonald
98d8388805 Add runbook link to ticket 2020-04-02 09:42:46 +01:00
David McDonald
b28fffc330 Refactor raise_alert_if_letter_notifications_still_sending
Move finding of letter logic into a separate method so it can be unit
tested rather than having to test it by checking the contents of a
zendesk ticket api call.

This will enable us to change the zendesk api ticket call message
without needing to edit lots of tests.
2020-04-02 09:36:56 +01:00
Katie Smith
8b8e736e49 Add retries to process_sanitised_letter task
This task didn't have retries before, based on the assumption that if
the task failed it was likely to be due to a Boto error, so retrying
would not help because a file was probably not in the expected bucket.
During an incident with the database, we had some letters that were
stuck in the `pending-virus-check` state because this task failed.

This change adds retries to the task if there was an Exception that was
not a `BotoClientError`.
2020-03-30 17:28:01 +01:00
David McDonald
c1ba3fb1be Merge pull request #2777 from alphagov/add-runbook-link
Add runbook link to resolve letters pending virus scan
2020-03-30 13:50:30 +01:00
David McDonald
87cb02fa1d Add runbook link to resolve letters pending virus scan 2020-03-30 12:07:05 +01:00
Rebecca Law
5d0830d7f7 Rename function for clarity 2020-03-27 15:48:54 +00:00