We previously allowed MNOs to approve a broadcast themselves in training
mode and have it go out to their integration environment as per
https://github.com/alphagov/notifications-api/pull/3114
However, we want to remove this use case as it means we have to support
configuration for training mode services to do things like pick a
channel and send out alerts which we definteily don't want to do in
production.
By making this change, we reduce the chance of a single bug meaning an
alert will go out in prod that shouldn't.
Note, will also make it harder for development environment testing but I
think it is still worth it as https://www.pivotaltracker.com/story/show/177584959
will make it much harder in our code to allow some environments to send
alerts whilst in training mode.
it's important to keep tabs on when these things leave our system.
Sending a zendesk ticket that triggers a P1 is probably our simplest way
of notifying the team when this happens (it's what we do with out of
hours emergencies on the admin app too). We don't have any direct
pagerduty integrations from the api app, but we already have the zendesk
client hooked up.
After broadcasts go live, we may want to change this to a P2 (but even
then, there's arguments for keeping it P1 to start with I think).
Don't cause a P1 if it goes out on staging as that might be MNOs testing.
Now that https://github.com/alphagov/notifications-api/pull/3184 has
been deployed for a while, the `send_delivery_status_to_service` task will
always have `template_id` and `template_version` being passed in. This
means we don't need to check if those fields are there.
April 1 2021.
In this PR there is a command to set annual_billing for all active
services with the the new defaults.
The new method `set_default_free_allowance_for_service` will also be
called in a PR to follow that will set a services free allowance to the
default if the organisation for the service is changed.
for the service.
The letters rates for cronw and non crown are the same. It would be nice
to remove the need for crown but for now this is a quick fix.
This is mostly useful for letters.
For templated letters sent via interface, whether one-offs
or CSV uploads, we do not give our users a way to set client reference.
Still, they often have a placeholder with reference that we could use
to set client_reference field.
Why is this helpful?
When letter is returned, or when we experience some printing issues,
often it is difficult to identify letters after the retention period.
This change will make it easier for some users to identify letters.
It will have more impact if we inform our users of this in template
editing guidance.
This adds the `template_id` and `template_version` fields to the data
sent to services from the `send_delivery_status_to_service` task.
We need to account for the task not being passed these fields at first
since there might be tasks retrying which don't have that data. Once all
tasks have been called with the new fields we can then update the code
to assume they are always there.
Since we only send delivery status callbacks for SMS and emails, I've
removed the tests where we call that task with letters.
Names of services and orgs were confusing, and variable setting
was done in a way that made it easy to introduce errors.
Now hopefully it is more readable and more error-proof.
This is not required by DVLA and since [1] we no longer care about
the end of letter filenames when collating them, so removing it is
safe to do. Note that the name of the ZIP files of collated letters
is based on a hash of the filenames, which needed updating in tests.
Before merging this we need to do a test run in Staging, so DVLA can
check that a mixture of the old / new filenames won't cause issues.
[1]: https://github.com/alphagov/notifications-api/pull/3172
We have a scheduled task that was checking for jobs still in progress.
We saw a case where a scheduled job was stuck in a `pending` status as a
result of an app shutting down. This changes the `check_job_status` task
so that it also checks for scheduled jobs which are still pending after
30 minutes.
Previously we did some unnecessary work:
- Collate task. This had one S3 request to get a summary of the object,
which was then used in another request to get the full object. We only
need the size of the object, which is included in the summary [1].
- Archive task. This had one S3 request to get a summary of the object,
which was then used to make another request to delete it. We still need
both requests, but we can remove the S3.Object in the middle.
[1]: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#objectsummary
Previously we made a call to S3 to list objects for a letter, even
though we already had the precise key of the single object to hand.
This removes the one usage of "get_s3_bucket_objects" and uses the
filename directly in the call to remove the object.
Previously, the function would just return a presumed filename. Now that
it actually checks s3, if the file doesn't exist it'll raise an
exception. By default that's a StopIteration at the end of the bucket
iterator, which isn't ideal as this will get supressed if the function
is called within a generator loop further up or anything.
There are a couple of places where we expect the file may not exist, so
we define a custom exception to rescue specifically here. I did consider
subclassing boto's ClientError, but this wasn't straightforward as the
constructor expects to know the operation that failed, which for me is a
signal that it's not an appropriate (re-)use of the class.
Previously we generated the filename we expected a letter PDF to be
stored at in S3, and used that to retrieve it. However, the generated
filename can change over the course of a notification's lifetime e.g.
if the service changes from crown ('.C.') to non-crown ('.N.').
The prefix of the filename is stable: it's based on properties of the
notification - reference and creation - that don't change. This commit
changes the way we interact with letter PDFs in S3:
- Uploading uses the original method to generate the full file name.
The method is renamed to 'generate_' to distinguish it from the new one.
- Downloading uses a new 'find_' method to get the filename using just
its prefix, which makes it agnostic to changes in the filename suffix.
Making this change helps to decouple our code from the requirements DVLA
have on the filenames. While it means more traffic to S3, we rely on S3
in any case to download the files. From experience, we know S3 is highly
reliable and performant, so don't anticipate any issues.
In the tests we favour using moto to mock S3, so that the behaviour is
realistic. There are a couple of places where we just mock the method,
since what it returns isn't important for the test.
Note that, since the new method requires a notification object, we need
to change a query in one place, the columns of which were only selected
to appease the original method to generate a filename.
We no longer will send them any stats so therefore don't need the code
- the code to work out the nightly stats
- the performance platform client
- any configuration for the client
- any nightly tasks that kick off the sending off the stats
We will require a change in cronitor as we no longer will have this task
run meaning we need to delete the cronitor check.
We current do this as part of send-daily-performance-platform-stats but
now this moves it into its own separate task. This is for two reasons
- we will shortly get rid of the send-daily-performance-platform-stats
task as we no longer will need to send anything to performance
platform
- even if we did decide to keep the task
send-daily-performance-platform-stats and remove the specific bits
that relate to the performance platform, it's probably nicer to
rewrite the new task from scratch to make sure it's all clear and easy
to understand
having `/invite/service/<token>` and `/invite/service/<id>` as two
separate routes (the first to validate an invite token, the second to
retrieve invite metadata) technically works. Routes are matched from
first to last until a match is found. The metadata endpoint only accepts
UUIDs, so requests with a UUID will be picked up by the correct
endpoint, while requests that don't look like a UUID will carry on
searching for an endpoint, and will find the token validation endpoint.
So while this works correctly for our normal expected input, it only
does so _because the UUID endpoint is first in the file_. This isn't
great, and it makes it harder to reason about the URLs when looking at
them.
To solve this, create the new `invite/service/check/<token>` endpoint.
For backwards compatibility, assign this in parallel with the existing
route - once the admin uses the new route we can remove the old route
and make better guarantees about what endpoint is being hit.