This wasn't working - the error given when trying to access it was
`TypeError: Object of type 'Row' is not JSON serializable` when we tried
to serialize a SQLAlchemy Row.
I haven't looked too far into what has changed to stop this from
working, but have just changed the endpoint to return a nested list instead.
Last year we had an issue with the daily limit cache and the query that was populating it. As a result we have not been checking the daily limit properly. This PR should correct all that.
The daily limit cache is not being incremented in app.notifications.process_notifications.persist_notification, this method is and should always be the only method used to create a notification.
We increment the daily limit cache is redis is enabled (and it is always enabled for production) and the key type for the notification is team or normal.
We check if the daily limit is exceed in many places:
- app.celery.tasks.process_job
- app.v2.notifications.post_notifications.post_notification
- app.v2.notifications.post_notifications.post_precompiled_letter_notification
- app.service.send_notification.send_one_off_notification
- app.service.send_notification.send_pdf_letter_notification
If the daily limits cache is not found, set the cache to 0 with an expiry of 24 hours. The daily limit cache key is service_id-yyy-mm-dd-count, so each day a new cache is created.
The best thing about this PR is that the app.service_dao.fetch_todays_total_message_count query has been removed. This query was not performant and had been wrong for ages.
Just looks a bit tidier and less repetitive.
I’ve only done this for the serialised service because:
- we’re only checking this in places where we’re already using the
serialised service
- if we want to check this elsewhere there’s a good chance that new code
should be using the serialised service, since it’ll itself be doing
some kind of performance optimisation
It looks like we were allowing broadcasts to transition from draft to
broadcasting in one go. This isn't valid now. It should go draft,
pending approval and then broadcasting.
It looks like this was a leftover bit of support in our code for when we
were building stuff out and is no longer needed.
It's possible a letter can pass our validation but our print provider can not print the letter. The letter will be marked as permanent failure in this case. Typically happens with precompiled letters.
We had a situation where the delivery-worker app instance was terminated before the job was marked as `in-progress`, presumably because the query to check the daily limits was taking too long to complete.
If the job was in progress the `check_job_status` task would have restarted the job.
Updating the status to in-progress sooner will help.
The query had a group by on notification_type and notification_status, this not only slows the query down but is wrong. The query only looked at the first result, but this query would return as many rows as different notification types and status, meaning the results do not include the correct number.
Are we concerned that all status types are included. For example letters can be cancelled or have validation-failures which shouldn't be included in the daily limit check.
- Update the Notification and NotificationHistory model to reflect the database.
- Updates to datatypes, removal of indexes and addition of indexes.
Why?
After running the `flask db migrate` command there are many deltas because we did some work to update the notification and notification_history tables, however, the SQLAlchemy models were not updated to reflect those changes. This PR cleans up all those deltas.
However, there are still some differences that can be done but we can look at that in another PR.
This adds total_letters to the data that is returned by the
`/platform-stats/data-for-billing-report` endpoint so that we can add
total letters as a column in the CSV file that can be downloaded.
it doesn't really do any verification - that's the webauthn code in the
browser and the admin app that does that. Instead, this completes the
login flow, by marking the user as logged in in the database. Added a
docstring that explains this process a bit more, and also added a new
route: /<id>/complete/webauthn. We'll move the admin code over to use
this new url in time
The trouble is the aggregate query to return the big blue numbers on the dashboard and /notifications/{notification_type} page is taking too long to return.
I have some ideas on how to improve the query, but should take some time to do some more research and test. In the meantime, let's just ignore "todays" total numbers for the high volume services. There are only two services that this will affect.
We haven't bumped the test version for a while.
Also bumped the version of Flask and itsdangerous.
In order to fix flask warnings I needed to changed how the blueprints were registerd.
It's always going to be in the future anyway.
After some analysis the query does perform better without it.
I'll make a note to update other queries where we get todays
notification data to remove the end date filter in a separate PR.
Many of the team members do not look at emails from zendesk, adding a current_app.logger.error message for things we care about to give developers a better chance of seeing them.
I have purposely not added an erro log for `check_for_services_with_high_failure_rates_or_sending_to_tv_numbers` because it's not something we need to look at immediately.
with sms and email auth the api handles verifying logins in the
`/<user_id>/verify/code` endpoint, when it checks the code is valid etc.
The admin app has already done this for webauthn logins, but we still
need an API endpoint so that we can set up the user's db entry to have
a new logged in timestamp, a new session id (this is important for
logging out other browser sessions), etc.
Also, we need to be able to make sure that the user's max login count
isn't exceeded. If it's exceeded, we shouldn't let them log in even with
a valid webauthn check.
This endpoint is a POST where the admin passes in a json dict with key
"succesful" being True or False. True sets up the db stuff as mentioned.
False just increments the failed login count.
simplify logic by changing the dao function to require a user id and a
webauthn cred id. Note that this changes the response from a 400 to a
404 if the cred is for a different user than the supplied id.
give a minimum length to the text fields in POSTS to create/update a
credential to avoid surprising unexpected edge cases involving empty
string names etc.
Also fix tests:
First add init file so the tests are found correctly, then update
the tests after we stopped serialising webauthn
registration_response.
added some simple validation to the delete endpoint for sanity, but
generally my assumption is that more validation will happen on the admin
side.
noteably im not checking whether the credentials are duplicated, nor is
there a uniqueness constraint in the database - I'm not sure if the
credential blob will always reliably be equivalent, and I believe the
browser should hopefully take care of dupes.
so we can be in line with what the admin handles, and keep it simple on
the api side and do as little manipulation of binary data as possible.
### Minor changes
* id is a UUID we can use for referencing within notify. No relation to
the key itself.
* name is a user viewable name that can be set/edited
* fix updated_at to have onupdate, not default
### Simplify the webauthn data
credential_data is the data we store about an authenticator that we'll
use to identify the key when logging in. includes the credential_id, the
public_key, and the aaguid (which identifies the authenticator
make/model)
registration_response is the data containing audit information - in the
future we can use this to ensure that the authenticators used are of
high quality.
both of these fields are CBOR (a kind of binary json), encoded in
base64 so that they can be embedded within our regular JSON api
endpoints. we don't anticipate the api ever needing to interact with
this data directly.
This will allow admin to pass through a value of "government" for the
broadcast_channel. We don't have any logic around the value of service.broadcast_channel,
so no updates are needed to the tasks etc.
We found that non-templated broadcast messages weren’t having their
content normalised before saving into an event.
This means that stuff like `\r\n` and curly quotes were being passed
through to the CBC proxy.
This commit firstly changes templated events to use
`str(BroadcastMessageTemplate)` to normalise the content, because it’s
non-obvious that calling
`BroadcastMessageTemplate.content_with_placeholders_filled_in` also
normalises content.
Then it changes the non-templated route to also call
`str(BroadcastMessageTemplate)`, where previously it was passing the
content straight through.
We want to replace the value `None` for
service.allowed_broadcast_provider with the value of "all". As a first
step, we need to allow both values. Once notifications-admin has been
changed to pass through "all" and all the data in the database has been
updated, we can update the code to stop supporting both values.