Commit Graph

873 Commits

Author SHA1 Message Date
Rebecca Law
1a203b5c04 Add more logging for process_job 2019-10-01 13:05:05 +01:00
Rebecca Law
44b7b36acd Added a command to process a row from a job. 2019-09-26 14:19:09 +01:00
Rebecca Law
a1863fa419 Update all calls to get_folder_name to include the parameter name.
Use created_at date of the notification for precompiled letters.
2019-09-25 14:40:09 +01:00
Pea Tyczynska
1279a46b8b Don't log address redaction failure when letter sent with test key 2019-09-17 15:55:26 +01:00
Leo Hemsted
99eb17fc29 Merge pull request #2610 from alphagov/get-pdf-contents-via-api
add api endpoint to get pdf for letter
2019-09-17 14:55:34 +01:00
Katie Smith
081543a2a9 Refactor out function to get page count
This has been moved to the letters utils file since it will be used in
more than one place. The notification parameter has been removed so that
the function can be used when we don't have a notification id.
2019-09-12 14:58:51 +01:00
Leo Hemsted
52f7620772 create pdfs for test templated letters
previously, we didn't create templated letters, and just marked them as
delivered straight away. However, we may need to return PDFs for these
letters, so we should create them the same as live letters. Then update
the functions so that they know where to look for these letters.
2019-09-11 15:02:12 +01:00
Pea Tyczynska
fecd7b5728 Copy original file to redaction_failure folder when redaction fails 2019-09-10 15:10:18 +01:00
Pea Tyczynska
8460147dfa Handle both new and old response type from template preview's
sanitise endpoint

Fix tests so they accept new response handling
2019-09-06 13:18:21 +01:00
Leo Hemsted
8f13697cf1 Revert "trigger nightly delete tasks from the create notification status task"
This reverts commit 58f24a0a83.
2019-08-19 16:06:25 +01:00
Leo Hemsted
36dd750637 split up reporting tasks in to separate tasks per day
to try and speed up overall time by parallelising
2019-08-19 16:06:25 +01:00
Leo Hemsted
92d78956be Merge pull request #2592 from alphagov/reporting-worker
Add reporting worker
2019-08-15 17:22:27 +01:00
Leo Hemsted
e5c76ffda7 reduce days to process from 10 to 4
to try and speed it up temporarily.
2019-08-15 17:06:38 +01:00
Leo Hemsted
58f24a0a83 trigger nightly delete tasks from the create notification status task
the nightly tasks need to run after the create nightly notification
status task - so that test notifications are still there to record
stats for, and to stop the risk of deleting notifications part-way
through recording stats for them.
2019-08-14 18:04:45 +01:00
Rebecca Law
ae1bc54f9e Update NotificationTechnicalFailureException
- Change the NotificationTechnicalFailureException so that it only inherits from Exception.
- The notify_celery task should create the logging message on failure.
- Fix unit tests
- Remove named parameter when raising exception.
2019-08-12 16:51:39 +01:00
Katie Smith
355fb07eb2 Revert "Change email status to permanent-failure if SES raises InvalidParameterValue"
This reverts commit 51716fbaf8.

Instead of relying on catching SES errors we will convert all emails to
punycode before sending instead.
2019-08-12 13:51:24 +01:00
Katie Smith
51716fbaf8 Change email status to permanent-failure if SES raises InvalidParameterValue
If SES raised an `InvalidParameterValue` error (because an email address
was wrong) we were logging an exception and setting the email status to
`technical-failure`. We now set it to `permanent-failure` instead and
change the log level to `info` - setting it to `permanent-failure` means
that people will know not to retry the message.
2019-08-12 10:24:59 +01:00
Katie Smith
e449e234db Retry deliver_sms task immediately if sending fails
If the `deliver_sms` task catches an exception when trying to send an SMS, we
want the first retry to happen immediately (because we will have
switched providers), then every retry after that to happen at the
standard intervals.
2019-08-08 09:34:38 +01:00
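The retry schedule described in this commit can be sketched roughly as below. This is an illustration only, not the project's code: the function name `retry_countdown` and the 300-second standard delay are assumptions, since the commit does not state the actual interval.

```python
# Hypothetical standard delay; the real interval is not given in the commit.
STANDARD_RETRY_DELAY_SECONDS = 300


def retry_countdown(retries_so_far: int) -> int:
    """Return how many seconds to wait before the next retry.

    The first retry (no retries made yet) happens immediately, because the
    provider will already have been switched. Every later retry waits for
    the standard delay.
    """
    if retries_so_far == 0:
        return 0
    return STANDARD_RETRY_DELAY_SECONDS


print(retry_countdown(0))
print(retry_countdown(1))
```

In a Celery task, a value like this could be passed as the `countdown` argument to `self.retry`.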
Rebecca Law
996dcdd88c Increase the number of days we rebuild the tables for 2019-07-18 16:45:27 +01:00
Pea Tyczynska
e033f3300b Degrade MaxRetriesExceededError to warning status in logger
This is because that error is caused by our providers and we
cannot do anything about it but it can make our logs hard to read
and actionable errors harder to spot
2019-06-27 14:55:10 +01:00
Katie Smith
a790acc091 Create a Zendesk ticket for letters in the wrong state
This creates a Zendesk ticket if either the
`check_precompiled_letter_state` or `check_templated_letter_state` tasks
fail.
2019-06-18 10:58:58 +01:00
Katie Smith
c518f6ca76 Add scheduled task to find old letters which still have 'created' status
Added a scheduled task to run once a day and check if there were any
letters from before 17.30 that still have a status of 'created'. This
logs an exception instead of trying to fix the error because the fix
will be different depending on which bucket the letter is in.
2019-06-18 10:58:58 +01:00
Katie Smith
a2f324ad7e Add scheduled task to find precompiled letters in wrong state
Added a task which runs twice a day on weekdays and checks for letters that have
been in the state of `pending-virus-check` for over 90 minutes. This is
just logging an exception for now, not trying to fix things, since we
will need to manually check where the issue was.
2019-06-18 10:58:58 +01:00
Katie Smith
3d01276ce2 Log exception and set precompiled letter to tech-failure if S3 errors
The `process_virus_scan_passed` task now catches S3 errors - if these
occur, it logs an exception and puts the letter in a `technical-failure`
state. We don't retry the task, because the most common reason for
failure would be the letter not being in the expected S3 bucket, in
which case retrying would make no difference.
2019-06-18 10:58:58 +01:00
Leo Hemsted
5045590d75 allow you to pass in date to send perf stats
make it easier to replay sending data for a day if it failed the first
time round
2019-06-11 13:57:17 +01:00
Leo Hemsted
09888f7479 ensure cronitor decorator is inside the notify_task wrapper
the celery decorator should always be on the outside so that all other
decorators will be captured within the celery task. We had problems
with cronitor not reporting, and only for this task.
2019-06-03 11:46:07 +01:00
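Why decorator order matters here can be shown with a small sketch. `fake_task` and `fake_monitor` are stand-ins invented for this example (they are not Celery's or Cronitor's decorators); the only behaviour assumed is that the task decorator registers whatever callable it is handed.

```python
import functools

registry = {}  # name -> callable that the "worker" will run
events = []    # check-ins recorded by the monitor


def fake_task(func):
    """Stand-in for a task decorator: registers the callable it receives."""
    registry[func.__name__] = func
    return func


def fake_monitor(func):
    """Stand-in for a monitoring decorator: records a check-in around the call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        events.append(f"run:{func.__name__}")
        result = func(*args, **kwargs)
        events.append(f"complete:{func.__name__}")
        return result
    return wrapper


@fake_task      # outermost: the registered callable includes the monitor
@fake_monitor
def monitored_job():
    return "done"


@fake_monitor   # outermost: the callable registered below is the bare function
@fake_task
def unmonitored_job():
    return "done"


# The worker only ever runs what was registered.
registry["monitored_job"]()
registry["unmonitored_job"]()
print(events)
```

Only `monitored_job` records check-ins, because only there does the registered callable include the monitoring wrapper.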
Rebecca Law
3374e03ce9 Prepare to stop inserting NotificationHistory at the time of inserting a notification.
Need to remove foreign key to complaints.
Make sure if getting Notification.id we look to both tables.
2019-05-21 16:08:18 +01:00
Rebecca Law
e3ee99e70d Reduce the number of days to recalculate billing. It's not necessary to calculate longer than 4 days. 2019-05-15 14:40:53 +01:00
Katie Smith
c02b7edb92 Bump utils to bring in changes to RecipientCSV rows
Bumped utils to version 31.2.5, which changes when the rows of a
RecipientCSV get created. Switched to using `.get_rows()` from
RecipientCSV (a generator) instead of the `.rows` property (which builds
a list of the rows in memory).
2019-04-25 10:58:19 +01:00
Rebecca Law
1c68e0f565 Remove unused method.
last_n_days was only being used in a test.
2019-04-12 10:26:46 +01:00
Rebecca Law
4ce2b9eaba The rstrip was not working for all file names so this changes it to a replace. 2019-04-08 12:04:14 +01:00
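One plausible reason `rstrip` fails for some file names is that it strips a set of characters rather than a suffix. The commit does not say which suffix or file names were involved; the `.pdf` suffix and the names below are assumed for illustration.

```python
# str.rstrip treats its argument as a set of characters, not as a suffix.
suffix = ".pdf"

ok_name = "letter.pdf"
bad_name = "app.pdf"

stripped_ok = ok_name.rstrip(suffix)      # "letter": happens to work
stripped_bad = bad_name.rstrip(suffix)    # "a": the trailing "pp" is stripped too
replaced_bad = bad_name.replace(suffix, "")  # "app"

print(stripped_ok, stripped_bad, replaced_bad)
```

Note that `replace` removes every occurrence of the suffix, not just a trailing one, so it is only safe when the suffix cannot appear elsewhere in the name.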
Toby Lorne
0022923bd0 Add Cronitor decorator collate-letter-pdfs-for-day
This celery task was not decorated with the cronitor decorator so never
checked in with cronitor.

Adding the decorator will ensure this task is monitored.

The requisite cronitor key is in the credentials repository already.

Signed-off-by: Toby Lorne <toby.lornewelch-richards@digital.cabinet-office.gov.uk>
2019-04-05 10:26:18 +01:00
Rebecca Law
dc8159104e Update letter_raise_alert_if_no_ack_file_for_zip for new DVLA file format
When we send a zip file of letters to DVLA we expect them to send back an acknowledgement of those files.
Previously they named the files like NOTIFY.20180202091254.ACK.TXT and the contents would contain the name of the zip file we sent with a date of when they got it.
They have updated this format to mirror the format of the zip file because there was an instance where they sent 2 files of the same name so the later overwrote the first.
Since the name matches our name, there is no need to get the file from S3 but just compare file names.
2019-04-03 11:03:42 +01:00
Leo Hemsted
1dc084be54 fix nightly ft stats tables task to respect BST
the create_nightly_notification_status task runs at 00:30 UK time,
however this means that in summer datetime.today() will return the
wrong date as the server (which runs on UTC) will run the task at
23:30 (populating the wrong row in the table).

fix this to use nice tz aware functions
2019-04-02 15:15:07 +01:00
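The date problem can be reproduced with a short sketch. It uses the standard-library `zoneinfo` module (Python 3.9+), which is an assumption made for this example; the commit only says it switched to timezone-aware functions, and `london_date` is a hypothetical helper name.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

LONDON = ZoneInfo("Europe/London")


def london_date(utc_now: datetime):
    """Return the UK calendar date for a timezone-aware UTC datetime."""
    return utc_now.astimezone(LONDON).date()


# During BST, 00:30 UK time on 2 April 2019 is 23:30 UTC on 1 April 2019.
run_time = datetime(2019, 4, 1, 23, 30, tzinfo=timezone.utc)

print(run_time.date())        # date as seen by a server running on UTC
print(london_date(run_time))  # date as seen in UK local time
```

The two printed dates differ by one day in summer, which is how a task scheduled in UK time can populate the wrong row when it reads the date from a UTC clock.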
Leo Hemsted
3739d9055d clean up usage of dates/datetimes in performance platform tasks
* call variables unambiguous things like `start_time` or `bst_date` to
  reduce risk of passing in the wrong thing
* simplify the count_dict object - remove nested dict and start_date
  fields as superfluous
* use static datetime objects in tests rather than calculating them
  each time
2019-04-02 11:49:20 +01:00
Rebecca Law
1456aa7789 Fix for performance platform updates.
Changed the query to get the performance platform stats from ft_notification_status. But the date used for the query needed to be a date, not datetime so the equality worked.
2019-04-01 12:03:57 +01:00
Rebecca Law
4105f6638e Split the update letter statuses from counting the daily sorted/unsorted numbers.
We need to back fill the daily_sorted_count tables, so we need to iterate through all the files. No need to update the notification status. So this task has been separated out.
2019-03-25 15:30:48 +00:00
Leo Hemsted
9da9968028 downgrade error to info 2019-03-22 14:06:45 +00:00
Leo Hemsted
6fa7f0290d ignore case in the cost_threshold in dvla response files
we failed when we received UNSORTED instead of Unsorted
2019-03-22 12:07:08 +00:00
Leo Hemsted
b288031adb add a hash of letter filenames to the dvla zip file name
if we partially retry a day, we would create new zip files, containing
different letters (if some were processed successfully). We need these
files to have different filenames to earlier zip files so that we can
avoid overwriting log data in zips_sent.

Hashing the filename means that we'll only overwrite if it was the same
file containing the same content.
2019-03-21 15:40:24 +00:00
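A minimal sketch of the idea follows. The hash algorithm (SHA-256), the 8-character truncation, the batch number padding and the exact layout of the name are all assumptions; the source only establishes that names start with `NOTIFY.`, end with `.ZIP`, and include a hash of the letter filenames.

```python
import hashlib


def zip_file_name(date_str: str, batch_number: int, letter_filenames: list) -> str:
    """Build a zip name that changes whenever the set of letters changes."""
    digest = hashlib.sha256()
    # Sort so the same set of letters gives the same name in any order.
    for name in sorted(letter_filenames):
        digest.update(name.encode("utf-8"))
        digest.update(b"\n")  # separator so ["ab", "c"] differs from ["a", "bc"]
    short_hash = digest.hexdigest()[:8].upper()
    return f"NOTIFY.{date_str}.{batch_number:03d}.{short_hash}.ZIP"


first_run = zip_file_name("2019-03-21", 1, ["a.pdf", "b.pdf", "c.pdf"])
partial_retry = zip_file_name("2019-03-21", 1, ["b.pdf", "c.pdf"])
print(first_run)
print(partial_retry)
```

A partial retry that contains fewer letters gets a different name, so it no longer overwrites the record for the earlier zip.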
Leo Hemsted
334eb473ed separate batch num from date
DVLA don't care about the naming conventions of zip files, other than
it must start with `NOTIFY.` and end with `.ZIP`. So let's format the
date in a more readable way, and separate it from the batch number
2019-03-20 12:15:25 +00:00
Leo Hemsted
1a4baf4283 pass upload filename to notify-ftp
previously ftp would name the files itself by giving them a timestamp
when uploading. we ran into issues with tasks being picked up multiple
times and as such, uploading duplicate files. By naming the file before
creating the task, we can avoid this issue.

Files are now named `NOTIFY.YYYYMMDD######.ZIP` where the number is a
counter that increments with each task we've issued in that run of
collate-letter-pdfs-for-day
2019-03-19 13:48:17 +00:00
Leo Hemsted
2f94e1d9bc raise provider switch threshold from 20% to 30%
make it less likely to switch on slow messages to allow more manual
control of provider balance
2019-03-14 16:11:59 +00:00
Rebecca Law
1625371106 Merge pull request #2381 from alphagov/inbound-sms-retention
Inbound sms now deletes according to data retention
2019-03-08 10:58:01 +00:00
Alexey Bezhan
6f5822ae5b Downgrade log level for missing notifications in SES receipt
The timestamps available in the SES receipt don't always correspond
to the time the notification has been sent. We've seen callbacks with
a current timestamp in both 'mail' and 'bounce' objects that referenced
a notification sent a week ago, which means we can't rely on it to skip
archived notifications.

One possible approach would be to look up the notification reference in
the notification_history table, but this goes against our plans to stop
relying on it in the future.

This changes the SES receipts logic to retry missing notifications once
(if the callback timestamp is within the last 5 minutes the task will
retry after a 5 minute delay) to capture callbacks arriving before the
notification reference has been persisted to the DB. Otherwise, we log
the missing notification as a warning instead of error.
2019-03-06 11:35:32 +00:00
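The decision described in the last paragraph of this commit can be sketched as a small pure function. The name `missing_notification_action` and the return values are hypothetical; only the five-minute window, the five-minute delay and the single retry come from the commit message.

```python
from datetime import datetime, timedelta

RECENT_WINDOW = timedelta(minutes=5)
RETRY_DELAY_SECONDS = 5 * 60


def missing_notification_action(callback_time, now, already_retried):
    """Decide what to do when a callback references an unknown notification.

    Returns ("retry", delay_in_seconds) or ("warn", None).
    """
    is_recent = now - callback_time <= RECENT_WINDOW
    if is_recent and not already_retried:
        # The reference may not have been saved to the DB yet: try once more.
        return ("retry", RETRY_DELAY_SECONDS)
    # Old callback, or we have already retried: log a warning and move on.
    return ("warn", None)


now = datetime(2019, 3, 6, 11, 0)
print(missing_notification_action(now - timedelta(minutes=2), now, False))
print(missing_notification_action(now - timedelta(minutes=2), now, True))
```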
Leo Hemsted
653f1ab6b9 stub out antivirus in dev
antivirus is sometimes tough to get running locally - now in dev
antivirus is skipped unless `ANTIVIRUS_ENABLED=1` is set on the command
line. on all other environments it is always enabled.
2019-02-27 10:59:31 +00:00
Leo Hemsted
38f0ea6cca remove functions to not talk about 7 days
remind us that data retention is flexible
2019-02-26 17:57:35 +00:00
Leo Hemsted
f00bfdfe85 move slow sms provider threshold from 10% to 20%
provider switching is a process that can happen as often as we like
without disrupting the flow of the system - however, there are some
reasons why we might not want to switch. One problem we've seen is
when a provider is having an issue, we might switch away from them
manually only for the app to automatically switch back to them again
and again.

Long term we'd like to have a system better suited for sharing the load
equally between our two sms providers, but short term, by increasing
the threshold for switching from 10% (of messages sent are slow) to
20%, we hope to make switching happen less often.

A notification is considered slow if it was sent in the last ten
minutes, on the current provider, and is either

* still in sending or pending after 4 minutes
* in delivered, but took at least 4 minutes to send
2019-02-25 14:29:39 +00:00
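The definition of a slow notification above can be sketched as follows. The `Notification` class, its field names and the helper functions are hypothetical stand-ins, and treating the threshold as "greater than or equal to" is an assumption; the time limits and status names come from the commit message.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

WINDOW = timedelta(minutes=10)     # only recently sent notifications count
SLOW_AFTER = timedelta(minutes=4)  # delivery time that counts as slow


@dataclass
class Notification:
    """Hypothetical stand-in for a notification record."""
    provider: str
    status: str
    sent_at: datetime
    delivered_at: Optional[datetime] = None


def in_scope(n, provider, now):
    """Sent on the given provider within the last ten minutes."""
    return n.provider == provider and now - n.sent_at <= WINDOW


def is_slow(n, provider, now):
    if not in_scope(n, provider, now):
        return False
    if n.status in ("sending", "pending"):
        return now - n.sent_at > SLOW_AFTER
    if n.status == "delivered" and n.delivered_at is not None:
        return n.delivered_at - n.sent_at >= SLOW_AFTER
    return False


def should_switch(notifications, provider, now, threshold=0.20):
    """True if the share of slow notifications reaches the threshold."""
    recent = [n for n in notifications if in_scope(n, provider, now)]
    if not recent:
        return False
    slow = sum(is_slow(n, provider, now) for n in recent)
    return slow / len(recent) >= threshold
```

Raising `threshold` makes automatic switching rarer, which is the effect the commit is after.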
Alexey Bezhan
c2e15d4ee2 Allow retry exception to propagate from ses callback task
Celery `self.retry` raises an exception to communicate that the task
needs to be retried. Since our ses task is wrapped in a catch-all
except block it logs that exception as an error before retrying.

Handling Retry class separately allows us to raise it without logging
the traceback.
2019-02-25 13:25:50 +00:00
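The pattern of handling the retry signal separately from real errors might look like the sketch below. The `Retry` class here is a local stand-in for `celery.exceptions.Retry` so the example runs without Celery, and `process_callback` is a hypothetical wrapper; what the real task does after logging an error is not stated in the commit, so swallowing the error here is an assumption.

```python
import logging

logger = logging.getLogger(__name__)


class Retry(Exception):
    """Stand-in for celery.exceptions.Retry."""


def process_callback(handler):
    """Run a callback handler inside a catch-all, letting Retry through."""
    try:
        return handler()
    except Retry:
        # Retry is a control-flow signal, not a failure: re-raise it
        # without logging a traceback.
        raise
    except Exception:
        # Anything else is a genuine error and is logged with its traceback.
        logger.exception("Error processing SES callback")
        return None
```

Because `except Retry` comes before `except Exception`, the retry signal never reaches the catch-all that logs.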
Alexey Bezhan
2932b44eb8 Add retries for SES callbacks for recent notifications
We've seen errors caused by what we suspect is a race condition when
SES callback processing tries to look up the notification before the
sender worker has saved notification reference from the SES POST
response to the database.

This adds a retry for SES callback task if the notification was not
found and the message is less than 10 minutes old and removes the
error log message for notifications older than 3 days (since they
might no longer exist in the notifications table and would've been
marked as failure by then either way).

In order to be able to call retry and silence the error log based on
notification time this change inlines `process_ses_response` and
`update_notification_by_reference` functions into the celery task.
It also removes a lot of defensive error-handling that doesn't appear
to have been triggered in the last few months (for things like missing
keys in SES callback data).
2019-02-25 10:36:37 +00:00