Relax lookup of letter PDFs in S3 buckets

Previously we generated the filename we expected a letter PDF to be
stored at in S3, and used that to retrieve it. However, the generated
filename can change over the course of a notification's lifetime, e.g.
if the service changes from crown ('.C.') to non-crown ('.N.').

The prefix of the filename is stable: it's based on properties of the
notification (reference and creation time) that don't change. This commit
changes the way we interact with letter PDFs in S3:

- Uploading uses the original method to generate the full file name.
The method is renamed to 'generate_' to distinguish it from the new one.

- Downloading uses a new 'find_' method to get the filename using just
its prefix, which makes it agnostic to changes in the filename suffix.
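
As a rough illustration of why prefix lookup is agnostic to the suffix,
here is a minimal sketch that uses a plain list of keys instead of S3.
The helper name 'find_key_by_prefix' and the example keys are made up for
this sketch; they are not the real code or the real naming convention.

```python
def find_key_by_prefix(keys, prefix):
    """Return the first key that starts with prefix (hypothetical helper)."""
    matches = (key for key in keys if key.startswith(prefix))
    try:
        return next(matches)
    except StopIteration:
        # A bare next() on an empty generator raises StopIteration;
        # this sketch turns that into a more descriptive error.
        raise LookupError("no key with prefix {!r}".format(prefix)) from None


# Made-up keys: the same letter stored before and after the service
# changed from crown ('.C.') to non-crown ('.N.').
prefix = "2021-03-08/NOTIFY.ABC123"
stored_as_crown = ["2021-03-08/NOTIFY.ABC123.D.2.C.C.20210308152337.PDF"]
stored_as_non_crown = ["2021-03-08/NOTIFY.ABC123.D.2.C.N.20210308152337.PDF"]

# The same prefix finds the file in both cases.
print(find_key_by_prefix(stored_as_crown, prefix))
print(find_key_by_prefix(stored_as_non_crown, prefix))
```

The stable part of the key is enough to find the file, whichever suffix
it was uploaded with.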

Making this change helps to decouple our code from the requirements DVLA
have on the filenames. While it means more traffic to S3, we rely on S3
in any case to download the files. From experience, we know S3 is highly
reliable and performant, so we don't anticipate any issues.

In the tests we favour using moto to mock S3, so that the behaviour is
realistic. There are a couple of places where we just mock the method,
since what it returns isn't important for the test.

Note that, since the new method requires a notification object, we need
to change a query in one place, the columns of which were only selected
to appease the original method to generate a filename.
Ben Thorner
2021-03-08 15:23:37 +00:00
parent 15b9cbf7ae
commit b43a367d5f
7 changed files with 147 additions and 106 deletions

@@ -37,7 +37,27 @@ def get_folder_name(created_at):
     return '{}/'.format(print_datetime.date())
 
 
-def get_letter_pdf_filename(reference, crown, created_at, ignore_folder=False, postage=SECOND_CLASS):
+def find_letter_pdf_filename(notification):
+    """
+    Retrieve the filename of a letter from s3 by searching for it based on a prefix.
+
+    Use this when retrieving existing pdfs, so that we can be more resilient if the naming convention changes.
+    """
+    bucket_name, prefix = get_bucket_name_and_prefix_for_notification(notification)
+
+    s3 = boto3.resource('s3')
+    bucket = s3.Bucket(bucket_name)
+    item = next(x for x in bucket.objects.filter(Prefix=prefix))
+    return item.key
+
+
+def generate_letter_pdf_filename(reference, crown, created_at, ignore_folder=False, postage=SECOND_CLASS):
+    """
+    Generate a filename for putting a letter into s3 or sending to dvla.
+
+    We should only use this function when uploading data. If you need to get a letter or its metadata from s3
+    then use `find_letter_pdf_filename` instead.
+    """
     upload_file_name = LETTERS_PDF_FILE_LOCATION_STRUCTURE.format(
         folder='' if ignore_folder else get_folder_name(created_at),
         reference=reference,
@@ -78,7 +98,7 @@ def upload_letter_pdf(notification, pdf_data, precompiled=False):
     current_app.logger.info("PDF Letter {} reference {} created at {}, {} bytes".format(
         notification.id, notification.reference, notification.created_at, len(pdf_data)))
 
-    upload_file_name = get_letter_pdf_filename(
+    upload_file_name = generate_letter_pdf_filename(
         reference=notification.reference,
         crown=notification.service.crown,
         created_at=notification.created_at,