Commit Graph

61 Commits

Author SHA1 Message Date
stvnrlly
b50cb4712f tz utility swap and many test updates 2022-11-10 12:33:25 -05:00
Ben Thorner
8219b3c032 Remove non/crown indicator in letter filenames
This is not required by DVLA and since [1] we no longer care about
the end of letter filenames when collating them, so removing it is
safe to do. Note that the name of the ZIP files of collated letters
is based on a hash of the filenames, which needed updating in tests.

Before merging this we need to do a test run in Staging, so DVLA can
check that a mixture of the old / new filenames won't cause issues.

[1]: https://github.com/alphagov/notifications-api/pull/3172
2021-03-18 13:05:12 +00:00
Ben Thorner
d17a6c8dbe DRY-up pre-existing method to get PDF body/meta
This is possible now that the "find_..." method returns an S3 object.
2021-03-16 12:53:14 +00:00
Ben Thorner
c76e789f1e Reduce extra S3 ops when working with letter PDFs
Previously we did some unnecessary work:

- Collate task. This had one S3 request to get a summary of the object,
which was then used in another request to get the full object. We only
need the size of the object, which is included in the summary [1].

- Archive task. This had one S3 request to get a summary of the object,
which was then used to make another request to delete it. We still need
both requests, but we can remove the S3.Object in the middle.

[1]: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#objectsummary
2021-03-16 12:53:13 +00:00
Leo Hemsted
6784ae62a6 Raise Exception if letter PDF not in S3
Previously, the function would just return a presumed filename. Now that
it actually checks s3, if the file doesn't exist it'll raise an
exception. By default that's a StopIteration at the end of the bucket
iterator, which isn't ideal as this will get supressed if the function
is called within a generator loop further up or anything.

There are a couple of places where we expect the file may not exist, so
we define a custom exception to rescue specifically here. I did consider
subclassing boto's ClientError, but this wasn't straightforward as the
constructor expects to know the operation that failed, which for me is a
signal that it's not an appropriate (re-)use of the class.
2021-03-15 17:18:11 +00:00
Ben Thorner
b43a367d5f Relax lookup of letter PDFs in S3 buckets
Previously we generated the filename we expected a letter PDF to be
stored at in S3, and used that to retrieve it. However, the generated
filename can change over the course of a notification's lifetime e.g.
if the service changes from crown ('.C.') to non-crown ('.N.').

The prefix of the filename is stable: it's based on properties of the
notification - reference and creation - that don't change. This commit
changes the way we interact with letter PDFs in S3:

- Uploading uses the original method to generate the full file name.
The method is renamed to 'generate_' to distinguish it from the new one.

- Downloading uses a new 'find_' method to get the filename using just
its prefix, which makes it agnostic to changes in the filename suffix.

Making this change helps to decouple our code from the requirements DVLA
have on the filenames. While it means more traffic to S3, we rely on S3
in any case to download the files. From experience, we know S3 is highly
reliable and performant, so don't anticipate any issues.

In the tests we favour using moto to mock S3, so that the behaviour is
realistic. There are a couple of places where we just mock the method,
since what it returns isn't important for the test.

Note that, since the new method requires a notification object, we need
to change a query in one place, the columns of which were only selected
to appease the original method to generate a filename.
2021-03-15 13:55:44 +00:00
Ben Thorner
a91fde2fda Run auto-correct on app/ and tests/ 2021-03-12 11:45:45 +00:00
Leo Hemsted
1e928a926a rename sending_date to created_at
we don't name letters based on the day we send them on, rather, the day
we create them on. If we process a letter for a second time for whatever
reason, even if it's a couple of days later, it'll still go in a folder
based on the created_at timestamp. There's still a slight confusion,
however - if the timestamp is after 5:30pm, the folder will be for the
day after. However, still the day after creation, so I think created_at
still makes the most sense.

Remove the term `sending_date` to try and make this relationship more
apparent.
2020-09-21 14:40:22 +01:00
Leo Hemsted
4f1e5ee5a3 remove ignore_folder from get_folder_name
it doesn't make sense to get the folder name if you know you aren't
going to use it.
2020-09-21 14:33:38 +01:00
Leo Hemsted
bb33927b3d rename letter get_folder_name args
`_now`? why would we ever use a different _now? instead say created_at,
because that's what it'll always be set to, even if we're replaying old
letters. We always set the folder name to when the letter was
created_at, or we might not know where to look to find it.

`dont_use_sending_date` doesn't really tell us what might happen if we
don't use it - the answer is we return an empty string. we ignore the
folder entirely. so lets call it that.

Also, remove use of freeze_gun in the tests, to prove that we don't use
the current time in any calculations. Also add an assert to a mock in
the get_pdf_for_templated_letter test, because we were mocking but not
asserting before, so the tests didn't fail when the function signature
changed.
2020-09-21 14:32:57 +01:00
Katie Smith
aee7887c14 Fix the filenames for international precompiled letters
We were determing the filename for precompiled letters before we had
checked if the letters were international. This meant that a letter
could have a filename indicating it was 2nd class, but once we had
sanitised the letter and checked the address we were setting the
notification to international.

This stopped these letters from being picked up to be sent to the DVLA,
since the filename and postage of the letter did not match.

We now regenerate the filename after the letter has been sanitised (and when
we know the postage) and use the updated filename when moving the letter
into the live PDF letters bucket.
2020-09-15 16:17:33 +01:00
Katie Smith
6ac89c9a2f Delete old 'process-virus-scan-passed-task'
This has been replaced by a new task, `sanitise-letter`, to this deletes
all the code in the old task and ensures that when antivirus is not
enabled locally we are calling the new task.
2020-04-02 14:52:15 +01:00
Katie Smith
cc2191c19f Add new tasks for sanitising precompiled letters
Added a task, `sanitise-letter`, that will be called from antivirus when
a letter has passed virus scan. This calls a new task in
template-preview which will sanitise the PDF.

A second new task, `process-sanitised-letter`, will be called from the
template preview task and deals with updating the notification and
moving it to the relevant bucket.
2019-12-16 11:55:09 +00:00
Pea Tyczynska
c2825e10b1 Grab metadata when getting pdf letter preview from S3
Also use this metadata to decide whether preview pages need
overlay or not. So far we have always added overlay when validation
has failed. Now we will only show it when validation failed due to
content being outside of printable area.
2019-10-29 16:19:50 +00:00
Pea Tyczynska
6ee7ac6cac Refactor and harmonise metadata for invalid letters with those sent from admin app 2019-10-16 14:11:50 +01:00
Pea Tyczynska
0b65e75fe9 Format metadata correctly and use MetadataDirective to put new metadata in S3 object 2019-10-14 11:11:32 +01:00
Pea Tyczynska
0a617379c4 Put pdf letter validation failed message in S3 metadata
So it can be used to tell the user why the letter has failed validation
2019-10-11 17:24:48 +01:00
Rebecca Law
a1863fa419 Update all calls to get_folder_name to include the parameter name.
Use created_at date of the notification for precompiled letters.
2019-09-25 14:40:09 +01:00
Rebecca Law
f234e94572 Change the variable name to make a little more sense. 2019-09-25 13:56:10 +01:00
Rebecca Law
702d8fa85f Refactor the code that figures out what folder and filename to use for the letter pdf files.
Now we consistently use the created_at date, so we can always get the right file location and name.

The previous updates to this code were trying to solve the problem if a pdf being created at 17:29, but not ready to upload until 17:31 after the antivirus and validation check.
But in those cases we would have trouble finding the file.
2019-09-25 13:56:10 +01:00
Leo Hemsted
d080106cbe make sure test notifications don't get date subfolders
they just go in the test bucket's root
2019-09-18 10:24:47 +01:00
Leo Hemsted
99eb17fc29 Merge pull request #2610 from alphagov/get-pdf-contents-via-api
add api endpoint to get pdf for letter
2019-09-17 14:55:34 +01:00
Katie Smith
190119f4fe Change get_bucket_name_and_prefix_for_notification to use created_at
The `get_bucket_name_and_prefix_for_notification` function was looking
at the `sent_at` or `updated_at` at time of a notification to see which
bucket it was in. Precompiled letters sent through the admin app don't
have either of these times - they only have a `created_at` time, so this
lets the function check `created_at` time too.
2019-09-12 14:58:51 +01:00
Katie Smith
eb73ddb8b4 Add function to send PDF letter
This function checks various permissions, downloads the PDF from the
transient bucket, creates the notification then moves the letter to the
'normal' bucket.
2019-09-12 14:58:51 +01:00
Katie Smith
081543a2a9 Refactor out function to get page count
This has been moved to the letters utils file since it will be used in
more than one place. The notification parameter has been removed so that
the function can be used when we don't have a notification id.
2019-09-12 14:58:51 +01:00
Leo Hemsted
52f7620772 create pdfs for test templated letters
previously, we didn't create templated letters, and just marked them as
delivered straight away. However, we may need to return PDFs for these
letters, so we should create them the same as live letters. Then update
the functions so that they know where to look for these letters.
2019-09-11 15:02:12 +01:00
Pea Tyczynska
fecd7b5728 Copy original file tp redaction_failure folder when redaction fails 2019-09-10 15:10:18 +01:00
Rebecca Law
a772fb8acf Fix the folder returned for a pdf letter.
Using the created at date for the folder is not always going to work because the pdf created_at date could be just before the cut off date but virus scan and validation has yet to happen. By the time the letters is in the created state, the letter goes into the next days bucket.
It can also happen if the letters is stuck in `pending-virus-scan` and we need to restart the task, and the letters is in a different folder.
2019-09-05 11:47:31 +01:00
Pea Tyczynska
63fe210202 Also show overlay on document download of failed letter
And of course test this new flex
2019-04-11 14:27:12 +01:00
Katie Smith
1d67b55b16 Add endpoint for cancelling letters 2018-12-03 17:51:08 +00:00
Leo Hemsted
3033015754 Merge pull request #2238 from alphagov/letter-500
look in correct bucket for letters sent after 5:30pm
2018-11-27 13:42:43 +00:00
Leo Hemsted
ba9a774e74 look in correct bucket for letters sent after 5:30pm
use the same function that we use when uploading, to ensure
consistency (that fn is also well tested already)
2018-11-27 12:21:41 +00:00
Katie Smith
ff06d120e8 Bump notifications-utils to 3.7.0
Bumped notifications-utils to 3.7.0. Version 3.7.0 includes the
`convert_utc_to_bst` and `convert_bst_to_utc` functions and the
`LETTER_PROCESSING_DEADLINE` constant, so these have been removed from
this repo and anywhere using these has now been updated to get these
from `notifications-utils`.

Also bumped pytest by a patch version to bring in a bug fix.
2018-11-26 12:53:39 +00:00
Leo Hemsted
c63b147a2d make sure validation_failed notifications still return pdf
it's useful to see them, and previously it was crashing.
2018-10-17 16:09:30 +01:00
Leo Hemsted
7bf68e3664 fix failed sanitise flow
the move from virus scan to validation failed function was called with
the wrong variables, and had some internal logic that was slightly
wrong.

Also, Don't use `update_notification_by_id` for notifications if they
are not in `created`, `sending`, `pending`, or `sent`. It silently
doesn't update them. I didn't want to do a deeper investigation into
the reasons behind this terrifying state machine as part of this commit
so I just changed the functions to call `dao_update_notification`
manually
2018-10-16 17:30:39 +01:00
Pea Tyczynska
e22e7245fe Use sanitised pdfs for sending and handle invalid pdfs, details below:
- pass new, sanitised pdf for sending
- move invalid pdfs to a newly created bucket
- set status fro notifications that failed pdf validation to a new status validation-failed
- adjust existing tests
2018-10-16 17:30:35 +01:00
Rebecca Law
a54645c5a3 Set the postage in the filename based on the postage set on the notification. 2018-09-25 11:04:58 +01:00
Leo Hemsted
40c537a9f5 add tests 2018-08-24 15:12:02 +01:00
Rebecca Law
85a0fd3b46 Merge pull request #2031 from alphagov/fix_data_retention
Use days of retention when deleting notifications in the nightly job.
2018-08-10 15:35:42 +01:00
Leo Hemsted
1ff729e75f read first file returned from s3
rather than looping through, just use `next`. We only expect to return
one file, so this won't actually affect the code flow. If, for whatever
reason, the file isn't on s3, whereas previously it would error when
trying to return an uninstantiated variable, now it will raise a
StopIteration
2018-08-09 16:47:35 +01:00
Rebecca Law
d0e9ab4972 If the notifications that are being deleted are letters then we need to delete the letter from s3 as well. 2018-08-08 16:20:25 +01:00
Leo Hemsted
cb7f5a7166 fix letters being put in the wrong bucket when near midnight cut-off
another day, another timezone bug
2018-07-02 14:16:40 +01:00
Rebecca Law
5850663ca6 Added a method to reply letters that are in the error folder on S3.
You can replay one at a time by sending in the file name. Or all the messages in the error folder.
2018-06-27 16:40:30 +01:00
Rebecca Law
41f3293cd5 This fixes a bug where the folder name was not correctly returning the right date because date used to compare was in UTC but the datetime used for the job start time is in BST.
Units tests have been added for the affected method.
2018-05-30 10:18:37 +01:00
Rebecca Law
b73bf4220f Refactor process_letter_notification to make the code easier to read.
- Separated the logic of precompiled and template letters.
- Remove the check for research mode, research mode is not relevant to api calls. The test key is used for testing.
Refactor upload_pdf_letter to accept a precompile boolean to save a query to template.
2018-04-09 13:56:44 +01:00
Richard Chapman
023862dfdc Refactored code to be more Pythonic and make the code more readable 2018-03-27 13:32:46 +01:00
Richard Chapman
3299055a09 Refactored the shared code between
move_scanned_pdf_to_test_or_live_pdf_bucket and
move_failed_pdf to consolidate some code so it is easier to maintain in
future as so that _move_s3_object can be used for any new methods.
2018-03-27 10:32:40 +01:00
Richard Chapman
8b6d28d3b0 Added a new task to handle any error cases with the anti-virus
application. If the Anti-virus app fails due to s3 errors or ClamAV
so does not scan (even after retries) the file at all an error needs
to be raised and the notification set to technical-failure.

Files should be moved to a 'folder' a separate one for ERROR and FAILURE.

* Added new letter task to process the error
* Added a new method to letter utils.py to move a file into an error or
failure folder based on the input
* Added tests to test the task and the utils.py method
2018-03-26 14:18:44 +01:00
Ken Tsang
1a9bc2a5cf Move test letters to test letters bucket without date folder name 2018-03-23 14:59:48 +00:00
Ken Tsang
0ee5c33084 Add antivirus check on precompiled letters sent with test key
- precompiled PDFs sent by test key uploaded to scan bucket
- set status to VIRUS-SCAN-FAILED for pdfs failing virus scan rather than PERMANENT-FAILURE
- Make call to AV app for precompiled letters sent via a test key, and set notification status to PENDING-VIRUS-SCAN
2018-03-23 12:04:37 +00:00