that is already cancelled vs when it is too late to
cancel the letter vs when we don't know the cause of the failure.
This is so we can show useful error messages to users,
and also for easier debugging.
Since the expiry is sent as part of the message payload, we don't
need to invoke the CBC proxies (and indeed there's no way to do so
for an expired alert). In future we plan to extend this task so it
triggers the regeneration of content on gov.uk/alerts.
It's worth noting that 'finishes_at' can theoretically be None, in
which case it's unclear when the alert should expire. While alerts
from the Admin app should always have an expiry [1], we have many
in the DB that don't, so it's worth checking for this scenario.
[1]: 078ac10c8d/app/models/broadcast_message.py (L255)
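A minimal sketch of the kind of guard this implies (the function and logging are assumptions for illustration, not the real task code):

```python
from flask import current_app


def expire_broadcast_message(broadcast_message):
    # hypothetical guard: 'finishes_at' can be None for older rows in the DB
    if broadcast_message.finishes_at is None:
        current_app.logger.warning(
            "Broadcast message %s has no finishes_at, skipping expiry",
            broadcast_message.id,
        )
        return
    # ...continue with the normal expiry handling
```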
This avoids any issues due to large payloads (e.g. with a lot of
polygons in the 'areas' field). While we may miss part of the log
in such cases, it's still more than we get at the moment.
This modifies the previous "(_)send_link_test" method to trigger a
link test for a specific lambda. We then call the method with both
the primary and failover lambdas in a new orchestrator method.
Since the _invoke_lambda function doesn't raise exceptions when it
fails, there's no need to rescue anything to ensure the second link
test / invocation also runs. We're not testing for this, since it
boils down to an absence of code that could raise an exception.
Note that, like the other parent tests, we only check the new method
works with a specific proxy client instance.
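A hedged sketch of the shape this takes; the method and attribute names below are assumptions, not the real proxy client API:

```python
import uuid


class CBCProxyClient:
    # attribute names are illustrative only
    lambda_name = "primary-lambda"
    failover_lambda_name = "failover-lambda"

    def send_link_test(self):
        # orchestrator: run a link test against each lambda in turn.
        # _invoke_lambda swallows failures, so the failover test always
        # runs even if the primary invocation fails.
        self._send_link_test(self.lambda_name)
        self._send_link_test(self.failover_lambda_name)

    def _send_link_test(self, lambda_name):
        # trigger a link test for one specific lambda
        self._invoke_lambda(lambda_name, payload={"identifier": str(uuid.uuid4())})

    def _invoke_lambda(self, lambda_name, payload):
        ...  # invokes the lambda and reports success/failure without raising
```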
Unlike the other IDs which are stored in the DB, this isn't relevant
for the Celery task as it invokes a link test. Moving it into the
proxy client will also enable us to generate a second ID in the next
commits, where we start doing a link test for the failover lambda.
We want to get to the point where we're running link tests for each
lambda independently. The tests weren't checking the failover
mechanism for link tests, so we can just remove it.
Previously the Celery task to trigger a link test had to know about
the special case of a sequence number for Vodafone. Since we're about
to change the client to perform multiple tests, it makes sense to give
it the knowledge of how to generate the number itself.
Note that we have to import the db inline to avoid a circular import,
since this module is itself imported by app/__init__.py.
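As a rough illustration of the inline import (the surrounding method and the sequence name are assumptions, not the real client code):

```python
from sqlalchemy import text


class CBCProxyVodafoneClient:
    def get_sequence_number(self):
        # imported inline to avoid a circular import, since this module is
        # itself imported by app/__init__.py
        from app import db

        # the sequence name here is illustrative, not the real one
        return db.session.execute(
            text("SELECT nextval('broadcast_link_test_sequence')")
        ).scalar()
```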
Other invocations of the Vodafone client use stored sequence numbers
from the DB, which are called "message numbers" in that context. Since
the two use cases are very different (even the names are different!),
having them in two places shouldn't cause any confusion.
The broadcast user permissions are changing, so to avoid confusion,
and to avoid permissions which exist in the database but don't display
on the frontend, we are going to remove all existing permissions for
users of broadcast services. This also updates the permissions of
invited users who are still pending.
The exception to this is the `view_activity` permission, which we always
add for broadcast users even if they have no other permissions.
(aad017a184/app/main/forms.py (L1043))
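Roughly, the clean-up looks like the sketch below; the model, import paths and column names are assumptions based on the description, not the exact code:

```python
from app import db
from app.models import Permission


def remove_broadcast_service_permissions(service_id):
    # drop every permission for users of this service except view_activity,
    # which the admin app always adds even when no boxes are ticked
    Permission.query.filter(
        Permission.service_id == service_id,
        Permission.permission != "view_activity",
    ).delete(synchronize_session=False)
    db.session.commit()
```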
We've observed high memory usage since we bumped it (along with other
things), and because it only appears on the API and not on the workers,
the hypothesis is that Werkzeug is responsible for it.
We made a change to remove all permissions from users and invited users
when the broadcast service settings form is submitted
(https://github.com/alphagov/notifications-api/pull/3284). However, when the
form is submitted, notifications-admin always adds the `view_activity`
permission even if no permission boxes are ticked, so we don't want to
remove that one permission
(256c840b46/app/main/forms.py (L1042))
If we're served a 429, put the item on the retry queue and retry, the
same as if the service had returned a 5xx. A 429 is commonly returned
when a rate limit is exceeded, and retrying after a delay is a typical
response to that.
See https://github.com/alphagov/notifications-utils/pull/878 for
details.
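A hedged sketch of the behaviour, with names that are illustrative rather than the actual notifications-utils code:

```python
import requests

RETRYABLE_STATUS_CODES = {429, 500, 502, 503, 504}


def deliver(url, payload, retry_queue):
    response = requests.post(url, json=payload, timeout=30)
    if response.status_code in RETRYABLE_STATUS_CODES:
        # 429 (rate limited) is treated the same as a 5xx: park the item on
        # the retry queue so it is retried after a delay
        retry_queue.put(payload)
        return
    response.raise_for_status()
```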
Changes we had to make for our app and tests to work correctly
after the dependency updates:
1. Update the emergency alerts polygons test because we changed
how precise we are about the locations of points on the map.
2. Use Flask's g object to set additional request attributes
So far we have been storing them in _request_ctx_stack, which is
an internal part of Flask's request context.
Because of the major update to the Werkzeug dependency, which Flask
relies on, the way we were using it stopped working, so we needed a
new way to set those values.
The way we set those values now, using the g object, also seems to
be the one favoured in the Flask documentation:
https://flask.palletsprojects.com/en/1.1.x/reqcontext/#how-the-context-works
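For illustration, the pattern looks roughly like this (the attribute names are assumed, not the ones we actually set):

```python
from flask import Flask, g, jsonify

app = Flask(__name__)


@app.before_request
def set_request_attributes():
    # previously these were stashed on _request_ctx_stack.top, a Flask
    # internal; g is the documented place for per-request values
    g.api_user = None
    g.service_id = None


@app.route("/example")
def example():
    return jsonify(service_id=g.service_id)
```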
Dropping the index concurrently will drop the index without locking out concurrent selects, inserts, updates, and deletes on the index's table, namely notifications.
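A minimal sketch of such a migration (index and revision identifiers are placeholders); CONCURRENTLY cannot run inside a transaction, so the statement is executed in an autocommit block:

```python
from alembic import op

revision = "placeholder_revision"
down_revision = "placeholder_previous_revision"


def upgrade():
    with op.get_context().autocommit_block():
        op.execute("DROP INDEX CONCURRENTLY IF EXISTS ix_notifications_example")


def downgrade():
    with op.get_context().autocommit_block():
        op.execute(
            "CREATE INDEX CONCURRENTLY ix_notifications_example "
            "ON notifications (example_column)"
        )
```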
The "normal" service permissions and broadcast service permissions are
going to be different with no overlap. This means that if you were
viewing the team members page, there might be permissions in the
database that are not visible on the frontend if a service has changed
type. For example, someone could have the 'manage_api_keys' permission,
which would not show up on the team members page of a broadcast service.
To avoid people having permissions which aren't visible in admin, we now
remove all permissions from users when their service is converted to a
broadcast service.
Permissions for invited users are also removed.
It's not possible to convert a broadcast service to a normal service, so
we don't need to cover that scenario.
We want to have new permissions which will be used specifically for
broadcasts:
- `create_broadcasts`
- `approve_broadcasts`
- `reject_broadcasts`
- `cancel_broadcasts`
Cancel and reject will always go together, but having separate database
permissions makes things easier to change in the future.
The permission column of the permissions table is an enum. We can add values
in the alembic upgrade script, but Postgres does not support removing
individual values from an enum. To remove values, we would have to recreate
the enum with only the old values.
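As a rough sketch of the upgrade (the enum type name is an assumption), new values can be added with `ALTER TYPE ... ADD VALUE`, while the downgrade would have to rebuild the type:

```python
from alembic import op

NEW_PERMISSIONS = (
    "create_broadcasts",
    "approve_broadcasts",
    "reject_broadcasts",
    "cancel_broadcasts",
)


def upgrade():
    # ALTER TYPE ... ADD VALUE can't run inside a transaction block on older
    # Postgres versions, so execute it in an autocommit block
    with op.get_context().autocommit_block():
        for value in NEW_PERMISSIONS:
            op.execute(f"ALTER TYPE permission_types ADD VALUE IF NOT EXISTS '{value}'")


def downgrade():
    # Postgres can't drop individual enum values: the type would have to be
    # recreated with only the old values and the column re-pointed at it
    pass
```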
It's not a big deal if a user is no longer eligible to register a
security key, so we may as well let them continue using it. This
avoids putting them in a limbo state if we don't immediately change
their auth type when they're no longer eligible to use the feature.
Currently we have some data-driven roles to say who can use this
feature. Adding a flag in the API means we can avoid API calls in
the Admin app to determine the same.
Allowing members of the GOV.UK Notify service to use the feature
is a workaround, so we can avoid making someone a Platform Admin
before they've protected their account with it.
This means we can use it in the next commit. Also, it was surprising
for the function to return a tuple of values, instead of just
the service object. Since the consumers of the function only needed
the user as auditing data, it's fine to use the first team member.
This wasn't working: trying to access it gave
`TypeError: Object of type 'Row' is not JSON serializable` when we tried
to serialize a SQLAlchemy Row.
I haven't looked too far into what has changed to stop this from
working, but have just changed the endpoint to return a nested list instead.
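For illustration, the change amounts to something like this (a sketch only; the helper name is made up):

```python
from flask import jsonify


def rows_to_json_response(rows):
    # SQLAlchemy Row objects aren't JSON serializable, so convert each one
    # to a plain (nested) list before handing them to jsonify
    return jsonify(data=[list(row) for row in rows])
```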
Last year we had an issue with the daily limit cache and the query that was populating it. As a result, we have not been checking the daily limit properly. This PR should correct all of that.
The daily limit cache is not being incremented in app.notifications.process_notifications.persist_notification; this method is, and should always be, the only method used to create a notification.
We increment the daily limit cache if redis is enabled (and it is always enabled for production) and the key type for the notification is team or normal.
We check if the daily limit is exceeded in many places:
- app.celery.tasks.process_job
- app.v2.notifications.post_notifications.post_notification
- app.v2.notifications.post_notifications.post_precompiled_letter_notification
- app.service.send_notification.send_one_off_notification
- app.service.send_notification.send_pdf_letter_notification
If the daily limit cache is not found, set the cache to 0 with an expiry of 24 hours. The daily limit cache key is service_id-yyyy-mm-dd-count, so each day a new cache is created.
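A hedged sketch of the cache pattern described above; the function names and the exception raised are illustrative, not the exact implementation:

```python
from datetime import datetime, timedelta


def daily_limit_cache_key(service_id):
    # one key per service per day, e.g. "1234-2021-06-01-count"
    return "{}-{}-count".format(service_id, datetime.utcnow().strftime("%Y-%m-%d"))


def check_service_over_daily_message_limit(redis_client, service):
    key = daily_limit_cache_key(service.id)
    count = redis_client.get(key)
    if count is None:
        # no cache yet today: start from zero and expire it after 24 hours
        redis_client.set(key, 0, ex=int(timedelta(hours=24).total_seconds()))
        count = 0
    if int(count) >= service.message_limit:
        raise ValueError("Daily message limit exceeded for service {}".format(service.id))
```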
The best thing about this PR is that the app.service_dao.fetch_todays_total_message_count query has been removed. This query was not performant and had been wrong for ages.