In previous tests we check that we can deal with files that end in `pdf`
in various forms of upper and lowercase. I've moved the testing of this
behaviour into it's own test so that's explicit and not just implied
that we care about behaviour on the casing of filenames.
Note however that s3 is case sensitive and we upload all our files in
upper case so technically we'd never expect to see a file ending in
`pdf`. I've updated some of our test data to reflect this but didn't
bother doing it everywhere. I've left the test in anyway but it could be
argued that is is redundant as we don't ever expect to see that case
anyway.
Previously, when running the `collate_letter_pdfs_for_day` task, we
would only send letters that were created between 5:30pm yesterday and
5:30 today.
Now we send letters that were created before 5:30pm today and that are
still waiting to be sent. This will help us automatically attempt to
send letters that may have fallen through the gaps and not been sent the
previous day when they should have been.
Previously we solved the problem of letters that had fallen the gap by
having to run the task with a date parameter for example
`collate_letter_pdfs_for_day('2020-02-18'). We no longer need this date
parameter as we will always look back across previous days too for
letters that still need sending.
Note, we have to change from using the pagination `list_objects_v2` to
instead getting each individual notification from s3. We reduce load by
using `HEAD` rather than `GET` but this will still greatly increase the
number of API calls. We acknowledge there will be a small cost to this,
say 50p for 5000 letters and think this is tolerable. Boto3 also handles
retries itself so if when making one of the many HEAD requests, there is
a networking blip then it should be retried automatically for us.
If the letter passed sanitisation, the recipient address will be
returned from template preview, so we want to save this as the `to`
field of the notification.
Template preview will now send an encrypted dict containing all the args
to the `process_sanitised_letter` task, so this updates the task to
handle data in the new format.
Added a task, `sanitise-letter`, that will be called from antivirus when
a letter has passed virus scan. This calls a new task in
template-preview which will sanitise the PDF.
A second new task, `process-sanitised-letter`, will be called from the
template preview task and deals with updating the notification and
moving it to the relevant bucket.
Since Pytest 5, `ExceptionInfo` objects (returned by `pytest.raises`) now
have the same `str` representation as `repr`. This means that `str(e)`
now needs to be changed to `str(e.value)`.
https://github.com/pytest-dev/pytest/issues/5412
Also use this metadata to decide whether preview pages need
overlay or not. So far we have always added overlay when validation
has failed. Now we will only show it when validation failed due to
content being outside of printable area.
Code that is within a `with Python.raises(...)` context manager but
comes after the line that raises the exception doesn't get evaluated.
We had some assertions that we never being tested because of this, so
this ensures that they will always get run and fixes them where
necessary.
This has been moved to the letters utils file since it will be used in
more than one place. The notification parameter has been removed so that
the function can be used when we don't have a notification id.
The `process_virus_scan_passed` task now catches S3 errors - if these
occur, it logs an exception and puts the letter in a `technical-failure`
state. We don't retry the task, because the most common reason for
failure would be the letter not being in the expected S3 bucket, in
which case retrying would make no difference.
if we partially retry a day, we would create new zip files, containing
different letters (if some were processed succesfully). We need these
files to have different filenames to earlier zip files so that we can
avoid overwriting log data in zips_sent.
Hashing the filename means that we'll only overwrite if it was the same
file containing the same content.
DVLA don't care about the naming conventions of zip files, other than
it must start with `NOTIFY.` and end with `.ZIP`. So lets format the
date in a more readable way, and separate it from the batch number
previously ftp would name the files itself by giving them a timestamp
when uploading. we ran into issues with tasks being picked up multiple
times and as such, uploading duplicate files. By naming the file before
creating the task, we can avoid this issue.
Files are now named `NOTIFY.YYYYMMDD######.ZIP` where the number is a
counter that increments with each task we've issued in that run of
collate-letter-pdfs-for-day
The template preview app now accepts a null value for the `filename`
parameter. If a service doesn't have a letter branding option set,
previously we defaulted to their dvla_organisation (probably HM
Government). Now, we pass through None, so that we generate letters
without any logo or branding.
However, until we can create a letter without a logo, we will still default to hm-government, because the dvla_organisation is set on the service.
This does simplify the code.
Also removed the inserts to letter_branding in the data migration file, because we can deploy this before the rest of the work is finished. But we will need to do it later.
If a precompiled letter can't be opened (e.g. because it isn't a valid
PDF) we were setting its billable units to 0, but not moving it to the
invalid PDF bucket. If a precompiled letter failed sanitisation, we were
moving it to the invalid PDF bucket but not setting its billable units
to 0.
This commit makes sure that we always set the billable units to 0
and move the PDF to the right bucket if it fails sanitisation or can't be
opened.
We want to send two new headers, ServiceId and NotificationId to the
template preview /precompiled/sanitise endpoint. This is to allow us to log
errors from this endpoint in template preview with all the information needed,
instead of needing to pass the information back to notifications-api and
to log it there.
We were passing both dvla_org_id and filename to template-preview
temporarily while we switch to only using filename. Now that
template-preview is set up to use the filename, we can stop sending the
dvla_org_id too.
We now pass `filename`, the filename of the letter logo to use, through
to Template Preview in addition to the `dvla_org_id`. Once Template
Preview has been updated to only use the `filename` we will stop
sending the `dvla_org_id`.
the move from virus scan to validation failed function was called with
the wrong variables, and had some internal logic that was slightly
wrong.
Also, Don't use `update_notification_by_id` for notifications if they
are not in `created`, `sending`, `pending`, or `sent`. It silently
doesn't update them. I didn't want to do a deeper investigation into
the reasons behind this terrifying state machine as part of this commit
so I just changed the functions to call `dao_update_notification`
manually
- pass new, sanitised pdf for sending
- move invalid pdfs to a newly created bucket
- set status fro notifications that failed pdf validation to a new status validation-failed
- adjust existing tests
There was a datetime bug in the query which resulted in files not being sent to the postal provider.
The trigger-letter-pdfs-for-day task is no longer needed, so rather than fix the query just call collate_letter_pdfs_for_day directly.
Less code is always better.
Deployment considerations: I realized this is strictly not backwards compatible if the scheduled job is in progress and a task is on the queue that no longer exists. This is ok since we will deploy this well before 17:50.
be known. Added the notification id to the logging message so that
the notification can be traced through the logging system by knowing
the notification id, making it easier to debug. Also changed to raise an
exception so that alerts are generated. This way we should get an email
to say that there has been an error.