We can simplify the rest of the tests to avoid the boilerplate of
making an actual request. But it's worth keeping these two to prove
the wrapper works correctly for an arbitrary route.
Previously we had a lot of duplicate tests inconsistently checking
each of the "requires_" functions. Since both of them now use the
same "_decode_jwt_token" helper, we can consolidate all the tests
onto that. In future commits we'll look at testing the top-level
functions in terms of what they do specifically.
This switches to testing the two functions directly as trying to
test them through the top-level "requires_..." functions or calls
to endpoints doesn't scale as we add more of them.
While this has a slight risk that a "requires_..." function might
not be using these helpers, it seems unlikely and we can always
add a mock to check this if we're concerned in future.
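As a rough sketch of the new approach (the import path, helper signature
and error type here are assumptions, not the real module layout):

```python
import jwt
import pytest

from app.authentication.auth import AuthError, _decode_jwt_token  # assumed path


def test_decode_jwt_token_rejects_token_signed_with_unknown_secret():
    # Exercise the shared helper directly instead of going through an
    # endpoint or a "requires_..." wrapper.
    token = jwt.encode({"iss": "notify-admin"}, "not-a-real-secret", algorithm="HS256")
    with pytest.raises(AuthError):
        _decode_jwt_token(token, api_keys=[])
```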
Previously "requires_auth" and "requires_admin_auth" had similar
but different ways of checking their keys. This switches them to
use the same checks, with the admin / internal auth passing in a
fake / stub set of "api keys" to check.
Pulling out the logic this way will make it easier to unpick the
tests, so we can focus on testing what's unique to each kind of
API auth and avoid future duplication when we start calling the
"requires_internal_auth" method with other client_ids.
Note that a couple of error messages / response codes have changed
for admin / internal auth. None of these occur in practice, so we
can make them consistent with the behaviour for the public API.
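A rough sketch of the shape this leaves us with; requires_internal_auth,
the config keys and the stub key structure are illustrative assumptions,
with _decode_jwt_token being the shared helper mentioned above:

```python
from flask import current_app, request


def requires_admin_auth():
    # Admin auth is just internal auth with a specific client_id.
    requires_internal_auth(current_app.config["ADMIN_CLIENT_ID"])


def requires_internal_auth(expected_client_id):
    # Wrap the configured secrets in a fake/stub set of "api keys" so the
    # same _decode_jwt_token checks run as for the public API.
    secrets = current_app.config["INTERNAL_CLIENT_API_KEYS"][expected_client_id]
    api_keys = [{"id": expected_client_id, "secret": secret} for secret in secrets]
    _decode_jwt_token(request.headers.get("Authorization"), api_keys)
```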
Previously this was heavily duplicated but with the odd test using
a __create_token method. This adds some fixtures to remove all the
boilerplate and standardise how tokens are created in each test.
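For example, a fixture along these lines (PyJWT, with invented names)
replaces the ad-hoc token helpers:

```python
import time

import jwt
import pytest


@pytest.fixture
def create_token():
    def _create(client_id, secret, issued_at=None):
        claims = {"iss": client_id, "iat": issued_at or int(time.time())}
        return jwt.encode(claims, secret, algorithm="HS256")

    return _create
```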
Previously we just had a single array of API keys / secrets, any of
which could be used to get past the "requires_admin_auth" check.
While multiple keys are necessary to allow for rotation, we should
avoid giving other apps access this way (too much privilege).
This converts the existing config vars into a new dictionary, keyed
by client_id. We can then use the dictionary to scope auth for new
API consumers like gov.uk/alerts to just the endpoints they need to
access, while maintaining existing access for the Admin app.
Once the new dictionary is available as a JSON environment variable,
we'll be able to remove the old credentials / config. In the next
commits, we'll look at more tests for the new functionality.
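Illustrative only - the variable name, client_ids and JSON shape below are
assumptions about what the new config might look like:

```python
import json
import os

# e.g. INTERNAL_CLIENT_API_KEYS='{"notify-admin": ["key-1", "key-2"],
#                                 "govuk-alerts": ["key-3"]}'
INTERNAL_CLIENT_API_KEYS = json.loads(os.environ["INTERNAL_CLIENT_API_KEYS"])

# Multiple secrets per client_id still allow rotation, but each consumer
# only ever sees its own credentials.
```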
Since the expiry is sent as part of the message payload, we don't
need to invoke the CBC proxies (and indeed there's no way to do so
for an expired alert). In future we plan to extend this task so it
triggers the regeneration of content on gov.uk/alerts.
It's worth noting that 'finishes_at' can theoretically be None, in
which case it's unclear when the alert should expire. While alerts
from the Admin app should always have an expiry [1], we have many
in the DB that don't, so it's worth checking for this scenario.
[1]: 078ac10c8d/app/models/broadcast_message.py (L255)
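A minimal sketch of the guard, assuming a task and DAO shaped roughly like
the ones described (all the names here are illustrative):

```python
def expire_broadcast_messages():
    for message in dao_get_broadcast_messages_to_expire():
        if message.finishes_at is None:
            # Old rows may have no expiry; log and skip rather than guess
            # when the alert should stop.
            current_app.logger.warning("Broadcast %s has no finishes_at", message.id)
            continue
        update_broadcast_message_status(message, BroadcastStatusType.COMPLETED)
```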
This modifies the previous "(_)send_link_test" method to trigger a
link test for a specific lambda. We then call the method with both
the primary and failover lambdas in a new orchestrator method.
Since the _invoke_lambda function doesn't raise exceptions if it
fails, there's no need to rescue anything in order to ensure the
second link test / invocation runs as well. It doesn't seem worth
testing for this, since it boils down to an absence of code to raise
any exception.
Note that, like the other parent tests, we only check the new method
works with a specific proxy client instance.
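Roughly (method and attribute names assumed):

```python
class CBCProxyClient:
    def send_link_test(self):
        # _invoke_lambda swallows failures, so a failed primary test
        # can't stop the failover test from running.
        self._send_link_test(self.lambda_name)
        self._send_link_test(self.failover_lambda_name)
```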
Unlike the other IDs which are stored in the DB, this isn't relevant
for the Celery task as it invokes a link test. Moving it into the
proxy client will also enable us to generate a second ID in the next
commits, where we start doing a link test for the failover lambda.
Previously the Celery task to trigger a link test had to know about
the special case of a sequence number for Vodafone. Since we're about
to change the client to perform multiple tests it makes sense to give
it the knowledge of how to generate the number itself.
Note that we have to import the db inline to avoid a circular import,
since this module is itself imported by app/__init__.py.
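The pattern is just a deferred import (the sequence name below is
illustrative):

```python
from sqlalchemy import text


def _get_sequence_number():
    # Imported inline: app/__init__.py imports this module, so a
    # top-level "from app import db" would be circular.
    from app import db

    return db.session.execute(text("SELECT nextval('cbc_sequence_number')")).scalar()
```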
Other invocations of the Vodafone client use stored sequence numbers
from the DB, which are called "message numbers" in that context. Since
the two use cases are very different (even the names are different!),
having them in two places shouldn't cause any confusion.
We made a change to remove all permissions from users and invited users
when the broadcast service settings form is submitted
(https://github.com/alphagov/notifications-api/pull/3284). However, when the
form is submitted, notifications-admin always adds the `view_activity`
permission even if no permission boxes are ticked, so we don't want to
remove that one permission
(256c840b46/app/main/forms.py (L1042))
If we're served a 429, put the item on the retry queue and retry the
same as if the service returned a 5xx. A 429 is commonly returned when
a rate limit is exceeded, and retrying after a delay is a typical
response to that.
See https://github.com/alphagov/notifications-utils/pull/878 for
details.
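A sketch of the behaviour, assuming a Celery task shaped roughly like our
callback tasks (the task, queue and helper names are invented):

```python
import requests

from app import notify_celery  # assumed app-level Celery instance


@notify_celery.task(bind=True, name="send-delivery-status", max_retries=5, default_retry_delay=300)
def send_delivery_status(self, url, payload):
    response = requests.post(url, json=payload, timeout=30)
    if response.status_code == 429 or response.status_code >= 500:
        # Rate-limited (429) or server error (5xx): put the item on the
        # retry queue and try again after a delay.
        self.retry(queue="retry-tasks")
```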
Changes we had to make for our app and tests to work correctly
after the dependency updates:
1. Update the emergency alerts polygons test because we changed
how exact we are about the locations of the points on the map.
2. Use Flask's g object to set additional request attributes
So far we have been storing them in _request_ctx_stack, which is
internal to Flask's request context.
Because of a major update to the Werkzeug dependency, which Flask
relies on, the way we were using it stopped working, so we needed a
new way to set those values.
The way we set those values now, using the g object, also seems to
be favoured in the Flask documentation:
https://flask.palletsprojects.com/en/1.1.x/reqcontext/#how-the-context-works
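In short (the attribute name is illustrative):

```python
# Before: stashing values on Flask's internal request-context stack.
# from flask import _request_ctx_stack
# _request_ctx_stack.top.api_user = api_user

# After: the documented, request-scoped g object.
from flask import g


def set_api_user(api_user):
    g.api_user = api_user
```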
The "normal" service permissions and broadcast service permissions are
going to be different with no overlap. This means that if you were
viewing the team members page, there might be permissions in the
database that are not visible on the frontend if a service has changed
type. For example, someone could have the 'manage_api_keys' permission,
which would not show up on the team members page of a broadcast service.
To avoid people having permissions which aren't visible in admin, we now
remove all permissions from users when their service is converted to a
broadcast service.
Permissions for invited users are also removed.
It's not possible to convert a broadcast service to a normal service, so
we don't need to cover this scenario.
It's not a big deal if a user is no longer eligible to register a
security key, so we may as well let them continue using it. This
avoids putting them in a limbo state if we don't immediately change
their auth type when they're no longer eligible to use the feature.
Currently we have some data-driven roles to say who can use this
feature. Adding a flag in the API means we can avoid API calls in
the Admin app to determine the same.
Allowing members of the GOV.UK Notify service to use the feature
is a workaround, so we can avoid making someone a Platform Admin
before they've protected their account with it.
This means we can use it in the next commit. Also, it was surprising
for the function to be returning a tuple of values, instead of just
the service object. Since the consumers of the function only needed
the user as auditing data, it's fine to use the first team member.
This wasn't working - the error given when trying to access it was
`TypeError: Object of type 'Row' is not JSON serializable` when we tried
to serialize a SQLAlchemy Row.
I haven't looked too far into what has changed to stop this from
working; I have just changed the endpoint to return a nested list instead.
Last year we had an issue with the daily limit cache and the query that was populating it. As a result we have not been checking the daily limit properly. This PR should correct all that.
The daily limit cache is now being incremented in app.notifications.process_notifications.persist_notification; this method is, and should always be, the only method used to create a notification.
We increment the daily limit cache if redis is enabled (and it is always enabled in production) and the key type for the notification is team or normal.
We check if the daily limit is exceeded in many places:
- app.celery.tasks.process_job
- app.v2.notifications.post_notifications.post_notification
- app.v2.notifications.post_notifications.post_precompiled_letter_notification
- app.service.send_notification.send_one_off_notification
- app.service.send_notification.send_pdf_letter_notification
If the daily limit cache is not found, we set the cache to 0 with an expiry of 24 hours. The daily limit cache key is service_id-yyyy-mm-dd-count, so each day a new cache is created.
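A rough sketch of that behaviour (redis-py style; the helper name is
invented, the key format is from the description above):

```python
from datetime import date

ONE_DAY_IN_SECONDS = 24 * 60 * 60


def increment_todays_count(redis_client, service_id):
    key = f"{service_id}-{date.today().isoformat()}-count"
    if redis_client.get(key) is None:
        # First notification of the day: create the key with a 24 hour
        # expiry, so a fresh cache is built each day.
        redis_client.set(key, 0, ex=ONE_DAY_IN_SECONDS)
    return redis_client.incr(key)
```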
The best thing about this PR is that the app.service_dao.fetch_todays_total_message_count query has been removed. This query was not performant and had been wrong for ages.
An old pen test report mentioned that you could send a request twice
to set a broadcast message as broadcasting, which would trigger us to
send two alerts.
It looks like this is now fixed and this test coverage backs that up.
Note, it's unlikely that this would have been an issue anyway, as the
CBC would likely have rejected the second message as a duplicate.
Note, this test coverage is not supposed to be exhaustive of all the
potential transitions but covers the vast majority of ones that we care
about.
See `BroadcastStatusType.ALLOWED_STATUS_TRANSITIONS` for allowed
transitions.
It looks like we were allowing broadcasts to transition from draft to
broadcasting in one go. This is no longer valid: a broadcast should go
from draft to pending approval and then to broadcasting.
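For illustration only - see BroadcastStatusType.ALLOWED_STATUS_TRANSITIONS
for the authoritative mapping; the statuses and check below are a sketch:

```python
ALLOWED_STATUS_TRANSITIONS = {
    "draft": {"pending-approval"},
    "pending-approval": {"draft", "rejected", "broadcasting"},
    "broadcasting": {"completed", "cancelled"},
}


def validate_transition(current_status, new_status):
    if new_status not in ALLOWED_STATUS_TRANSITIONS.get(current_status, set()):
        raise ValueError(f"Cannot transition from {current_status} to {new_status}")
```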
It looks like this was a leftover bit of support in our code for when we
were building stuff out and is no longer needed.
It's possible a letter can pass our validation but our print provider cannot print it. The letter will be marked as a permanent failure in this case. This typically happens with precompiled letters.
The query had a group by on notification_type and notification_status, which not only slowed the query down but was wrong. The code only looked at the first row of the result, yet the query returned as many rows as there were distinct notification types and statuses, so the total did not include the correct number.
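To illustrate with made-up numbers:

```python
# What the grouped query effectively returned:
rows = [
    ("sms", "delivered", 120),
    ("sms", "failed", 5),
    ("email", "delivered", 80),
]

wrong_total = rows[0][2]                          # 120 - only the first row
right_total = sum(count for _, _, count in rows)  # 205 - all notifications
```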
Are we concerned that all status types are included? For example, letters can be cancelled or have validation failures, which shouldn't be included in the daily limit check.
This adds total_letters to the data that is returned by the
`/platform-stats/data-for-billing-report` endpoint so that we can add
total letters as a column in the CSV file that can be downloaded.
It doesn't really do any verification - that's done by the webauthn code
in the browser and the admin app. Instead, this completes the login flow
by marking the user as logged in in the database. Added a docstring that
explains this process a bit more, and also added a new route:
/<id>/complete/webauthn. We'll move the admin code over to use this new
URL in time.
The trouble is that the aggregate query returning the big blue numbers on the dashboard and the /notifications/{notification_type} page is taking too long.
I have some ideas on how to improve the query, but it will take some time to do more research and testing. In the meantime, let's just ignore "today's" total numbers for the high volume services. There are only two services that this will affect.