We don't need to log this as an exception. It's not an exception, it's
behaviour that is not ideal but is still expected so therefore I've
changed it to warn. Also it removes the email we get for the exception
which is not needed as we get the zendesk ticket instead.
I've also fixed the multiline string meaning the link to the runbook is
included in the zendesk ticket.
Before the search term was either:
- an email address (or partial email address)
- a phone number (or partial phone number)
Now it can also be:
- a reference (or partial reference)
We can take a pretty good guess, by looking at the search term, whether
the thing the user is searching by email address or phone number. This
helps us:
- only show relevant notifications
- normalise the search term to give the best chance of matching what we
store in the `normalised_to` field
However we can’t look at a search term and guess whether it’s a
reference, because a reference could take any format. Therefore if the
user hasn’t told us what kind of thing their search term is, we should
stop trying to guess.
Added a task, `sanitise-letter`, that will be called from antivirus when
a letter has passed virus scan. This calls a new task in
template-preview which will sanitise the PDF.
A second new task, `process-sanitised-letter`, will be called from the
template preview task and deals with updating the notification and
moving it to the relevant bucket.
We have a team who want to find emails that might have been sent to an
incorrect address. Therefore they can’t search by the correct address,
because it won’t match.
What they do have is the reference number of the user’s application,
which is also stored in the `client_reference` field on the
notification.
So when a user is searching we should also look at the client reference,
as well as the recipient, allowing the user to enter either in the
search box.
Use .format instead of concatenation to avoid type issues
Trying to concatenate uuid onto a string was throwing an error.
Also it is not possible to use uuid in parametrize statements
it seems as it messes up with running tests on multiple threads
SMS and emails may be marked as `NOTIFICATION_PENDING`. These will be
billed as they will have been sent to the provider and will eventually
turn to a final state such as `NOTIFICATION_DELIVERED` or
`NOTIFICATION_PERMANENT_FAILURE`.
This change will fix a discrepency on the billing page were the number
of messages being billed was less than the number of messages reported
as sent on a services dashboard when some of those messages were in a
pending state.
In reality, I don't think this bug would have had any longer affects for
incorrect billing as messages would not stay in the pending state for
too long and billing calculations would happen after that point.
The table will contain notification ids for services that have returned letters. This will make it easy to query the data in Notification_history since we can join on the primary key.
sms and emails have a very predictable 72 hour lifecycle. letters, on
the other hand, have ridiculously complex lifecycles - they might not
get sent because it's a weekend, they might not get sent because they're
second class and are only processed on alternate days, they might not
get sent because a different letter in the same batch had an error that
we didn't know about. Either way, it's apparent that four days is
definitely not enough time to guarantee that letters have gone from
sending to delivered.
Extend the amount of days we process for letters to 10 days. Keep emails
and sms down at 4 to keep run-times shorter
We're deliberately not thinking about returned letters here at all.
it makes less sense once we introduce different start dates for letters
and emails. Also, we never use it, since we just call the day tasks
ourselves from commands.py
It is likely this endppoint will need additional data for the UI to display, for the first iteration this will enable the /uploads page to show both letters and jobs. Only letter uploaded by the UI are included in the resultset.
Add file name to resultset.
the nightly task won't be affected, it'll just trigger three times more
sub-tasks.
this doesn't need to be a two-part deploy because we only trigger this
overnight, so as long as the deploy completes in daytime we don't need
to worry about celery task signatures
previously we checked notifications table, and if the results were
zero, checked the notification history table to see if there's data
in there. When we know that data isn't in notifications, we're still
checking. These queries take half a second per service, and we're
doing at least ten for each of the five thousand services we have in
notify. Most of these services have no data in either table for any
given day, and we can reduce the amount of queries we do by only
checking one table.
Check the data retention for a service, and then if the date is older
than the retention, get from history table.
NOTE: This requires that the delete tasks haven't run yet for the day!
If your retention is three days, this will look in the Notification
table for data from three days ago - expecting that shortly after the
task finishes, we'll delete that data.
it's not acceptable for a constantly failing provider to take 50 minutes
to drain (5x reducing priority by 10). But similarly, we need _some_
delay, or a handful of concurrent failures will completely turn off a
provider, rendering the whole excercise kinda pointless. Setting the
delay before it tries to reduce priority again to one minute is nice
because it means that if one request times out and returns 502, then any
other requests that are in flight at that time will time out before the
one minute is up and not switch, but any requests made after the switch
that take sixty seconds to time out will affect it.
when ORM level changes are made (eg `my_model.my_column = my_value`),
the ORM will read the column definition to see if it should apply any
defaults.The updated_at columns that we use all define
`onupdate=datetime.datetime.utcnow`. We can't patch this out as the
function pointer to the original function has already been grabbed by
this at import time - so freezegun or `mocker.patch` won't work.
So we have to use the query syntax to set the `updated_at` timestamp in
the DB without going through the ORM layer.
it now looks at both providers and works out whether to deprioritise
one, rather than binary switching from one to the other. If anything
has altered the priorities in the last ten minutes it won't take any
action. If both providers are slow it also won't take any action.