This PR will optimize this query to use a more efficient index.
- Add notification_type to the dao_get_last_template_usage to optimize the query.
- Tested and analyzed query on production database with very significant results.
Before:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..1711.35 rows=1 width=935) (actual time=21186.053..21186.053 rows=0 loops=1)
-> Index Scan Backward using ix_notifications_created_at on notifications (cost=0.43..4607493.80 rows=2693 width=935) (actual time=21186.052..21186.052 rows=0 loops=1)
Filter: (((key_type)::text <> 'test'::text) AND (template_id = 'xxxxxx'::uuid))
Rows Removed by Filter: 8244071
Planning time: 0.112 ms
Execution time: 21186.082 ms
After:
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Limit (cost=5323.10..5323.10 rows=1 width=935)
-> Sort (cost=5323.10..5323.74 rows=258 width=935)
Sort Key: created_at DESC
-> Index Scan using ix_notifications_template_id on notifications (cost=0.56..5321.81 rows=258 width=935)
Index Cond: (template_id = 'xxxxx'::uuid)
Filter: (((key_type)::text <> 'test'::text) AND (notification_type = 'sms'::notification_type))
Planning time: 1.102 ms
Execution time: 0.584 ms
As it turns out the SNS topic is trigger more that once when a file is placed in S3, this is caused by a bug in the s3ftp software used to mount the S3 bucket to the FTP server.
S3ftp performs the create file operation more than once. This is resulting in duplicate counts of DailySortedLetter (the counts of how many letters marked as sorted or unsorted, which affect how much the provider will charge Notify for the letter).
This PR adds a new column to DailySortedLetter called file_name. A new unique constraint on billing_day + file_name is added. Each time we write a row to the table the counts are over written rather than aggregated.
I am aware that this PR is not backwards compatiable. However, since the code is typically triggered once a day around 13:00 then it is very unlikely an exception will occur during the deploy. Also a complete migration of the data based on all our response files on S3 will be performed soon, meaning the data will be corrected. Also if an exception does occur it is after the updates to notification status has already occurred.
We need to deal with this, it's ok when updating a notification status from delivered to delivered. But the DailySortedLetter counts are being doubled.
Adding the file_name to the table as a unique key to get around this issue. It will mean we have multiple rows for each billing_day, but that's ok we can aggregate that.
This will also give us a way to see which file created which count.
contains orgs, and unmapped services.
the orgs contain nested services - services the user is a part of that
belong to that org.
the unmapped services are any services that the user is a part of that
either don't have an org or have one that the user doesn't know about
this is so that we can filter out inactive organisations and services
note: can't remove user schema completely, as we still use it in
POST /user to create new users
and the client reference. It could confuse the client consumer so best
to remove it from the response altogether.
* Respond with only the notification_id and client_reference
* Updated the test to check for the response without the template
The template in precompiled is only used internally and cannot be
accessed from the UI. It could confuse the client consumer so best to
remove it from the response altogether.
* Remove the template from the dict before returning it and therefore
ensuring it still stays similar to letter
* Updated the test to check for the response without the template
SQLAlchemy handles escaping anything that could allow a SQL injection
attack. But it doesn’t escape the characters used for wildcard
searching. This is the reason we’re able to do `.like('%example%')`
at all.
But we shouldn’t be letting our users search with wildcard characters,
so we need to escape them. Which is what this commit does.
We were previously expecting the letter response files to be in the
format of 'NOTIFY.<datetime>.RSP.TXT' but the response files we receive
use '-' in the filenames instead of '.' which was causing an error when
we tried to get the date from the filename.
In the 'update-letter-notifications-statuses' task we want to ensure
that temporary failures are always handled, regardless of whether the
response file we receive contains unknown Sorted statuses or not.
Added the following new notification statuses:
* pending_virus_check
* virus_scan_failed
If we decide to remove these statuses in future, we will need to replace
them with a different status in the notifications and
Notification_history tables where they are referenced, so
pending-virus-check will be replaced with sending, and virus-scan-failed
will be replaced with permanent-failure.
This was overly optimistic that 2G would be enough to handle 4 worker
processes as they are already exhausting the 2G limit.
Depending on performance we may need to tweak the memory instead/too.
When the notification is timedout by the scheduled task if the service is expecting a status update, that update to the service would fail.
A test has been added.
* Removed extra log messages so there are not two log messages being
generated per exception, as InvalidRequest also logs, updated the
InvalidRequest log message to include the exception type and exception
information
* Added extra asserts to ensure the exception messages are printed
* Deleted the statistics DAO
(this was used for the job statistics tasks)
* Deleted the functions in the jobs DAO which are no longer used
(the functions that were used for the job-stats endpoints)