notifications-api

mirror of https://github.com/GSA/notifications-api.git synced 2025-12-20 07:21:13 -05:00

Author	SHA1	Message	Date
Kenneth Kehl	8f5f9f8f59	Merge branch 'main' of https://github.com/GSA/notifications-api into notify-233b	2023-05-26 13:13:13 -07:00
Kenneth Kehl	6f6061455c	notify-162 delete incomplete s3 uploads (#276) Co-authored-by: Kenneth Kehl <@kkehl@flexion.us>	2023-05-23 11:31:30 -04:00
Kenneth Kehl	359ac9d967	merge from main	2023-05-10 09:58:03 -07:00
Kenneth Kehl	08c1ad75c8	notify-260 remove server-side timezone handling	2023-05-10 08:39:50 -07:00
Kenneth Kehl	3fb113a83e	notify-152 sms delivery receipts	2023-05-04 07:56:24 -07:00
Kenneth Kehl	6e3d3f325d	notify-233: delete notifications from notifications table after they are successfully sent	2023-04-18 12:42:23 -07:00
Ryan Ahearn	e07b596857	Remove contact list db, dao, and s3 code	2023-04-12 15:01:24 -04:00
Kenneth Kehl	27d86c949a	#224 remove crown (#228) Co-authored-by: Kenneth Kehl <@kkehl@flexion.us>	2023-04-11 16:29:37 -04:00
Steven Reilly	ff4190a8eb	Remove letters-related code (#175) This deletes a big ol' chunk of code related to letters. It's not everything (there are still a few things that might be tied to sms/email), but it's the heart of letters function. SMS and email function should be untouched by this. Areas affected: - Things obviously about letters - PDF tasks, used for precompiling letters - Virus scanning, used for those PDFs - FTP, used to send letters to the printer - Postage stuff	2023-03-02 20:20:31 -05:00
Ryan Ahearn	71010e78d8	Fix formatting for secret code to ensure 0 padding no matter the passed length	2023-02-22 10:48:15 -05:00
Ryan Ahearn	e26bc5095c	Use cryptographically secure random number for sms codes Also, increase token length to 6 digits	2023-02-17 11:54:17 -05:00
Ryan Ahearn	041cd08097	Clean up more mmg and firetext references	2022-12-22 09:31:12 -05:00
Ryan Ahearn	45c3e3c277	Remove unused `is_delivery_slow_for_providers` method	2022-11-30 13:50:49 -05:00
stvnrlly	9e7ee1c0f8	migrate bst_date to local_date	2022-11-21 11:49:59 -05:00
stvnrlly	99de747a36	fix formatting	2022-11-21 11:29:38 -05:00
stvnrlly	c8533ae524	pull timezone from utils for other pytz instances	2022-11-16 16:53:55 -05:00
stvnrlly	e6d30394ba	london → local	2022-11-16 14:11:52 -05:00
stvnrlly	213f699c99	time adjustments in tests	2022-11-14 14:23:54 -05:00
stvnrlly	b50cb4712f	tz utility swap and many test updates	2022-11-10 12:33:25 -05:00
Steven Reilly	d37c2a53b8	Merge branch 'main' into stvnrlly-remove-broadcasts	2022-10-25 10:17:49 -04:00
stvnrlly	9f37592b1e	cleaner flake8 cleaning	2022-10-21 00:26:37 +00:00
stvnrlly	d4e156e8ae	Merge branch 'main' into stvnrlly-remove-broadcasts	2022-10-20 19:44:20 -04:00
stvnrlly	f5b5ecb661	flake8 fixes for rebased commits	2022-10-19 16:23:34 +00:00
stvnrlly	e9fdfd59f4	clean flake8 except provider code	2022-10-19 16:16:26 +00:00
stvnrlly	0186095920	swap out uk org types for us-specific org types	2022-10-11 20:27:49 +00:00
stvnrlly	57f4df8ed1	remove broadcast-related code, except migrations	2022-10-04 15:28:27 +00:00
jimmoffet	f1aec54665	clean up comments and method dupes	2022-09-15 15:48:37 -07:00
jimmoffet	b0f819dbd9	canada UK ses callbacks monster mash	2022-09-15 14:59:13 -07:00
Christa Hartsock	64b30feb08	Remove pytest from non-test file	2022-07-07 16:22:21 -07:00
Christa Hartsock	af6495cd4c	Get tests passing locally When we cloned the repository and started making modifications, we didn't initially keep tests in step. This commit tries to get us to a clean test run by skipping tests that are failing and removing some that we no longer expect to use (MMG, Firetext), with the intention that we will come back in future and update or remove them as appropriate. To find all tests skipped, search for `@pytest.mark.skip(reason="Needs updating for TTS:`. There will be a brief description of the work that needs to be done to get them passing, if known. Delete that line to make them run in a standard test run (`make test`).	2022-07-07 15:41:15 -07:00
Jim Moffet	aa4ec532a4	implement SNS	2022-06-17 11:16:23 -07:00
Ben Thorner	43dbc0891f	Merge pull request #3546 from alphagov/notification-view-178125825 Use notification view for status / billing tasks	2022-05-26 11:03:38 +01:00
Ben Thorner	aa20064f3f	Merge pull request #3545 from alphagov/remove-unused-function Remove redundant DAO function / consolidate tests	2022-05-26 11:03:30 +01:00
Ben Thorner	8e837cf681	Use table class directly instead of "table" var In response to [^1]. [^1]: https://github.com/alphagov/notifications-api/pull/3546#discussion_r879541366	2022-05-24 10:16:28 +01:00
Ben Thorner	33645c7747	Use notification view for status / billing tasks This fixes a bug where (letter) notifications left in sending would temporarily get excluded from billing and status calculations once the service retention period had elapsed, and then get included once again when they finally get marked as delivered.* Status and billing tasks shouldn't need to have knowledge about which table their data is in and getting this wrong is the fundamental cause of the bug here. Adding a view across both tables abstracts this away while keeping the query complexity the same. Using a view also has the added benefit that we no longer need to care when the status / billing tasks run in comparison to the deletion task, since we will retrieve the same data irrespective (see below for a more detailed discussion on data integrity). Such a scenario is rare but has happened. A New View ========== I've included all the columns that are shared between the two tables, even though only a subset are actually needed. Having extra columns has no impact and may be useful in future. Although the view isn't actually a table, SQLAlchemy appears to wrap it without any issues, noting that the package doesn't have any direct support for "view models". Because we're never inserting data, we don't need most of the kwargs when defining columns. Note that the "default" kwarg doesn't affect data that's retrieved, only data that's written (if no value is set). Data Integrity ============== The (new) tests cover the main scenarios. We need to be careful with how the view interacts with the deletion / archiving task. There are two concerns here: - Duplicates. The deletion task inserts before it deletes [^1], so we could end up double counting. It turns out this isn't a problem because a Postgres UNION is an implicit "DISTINCT" [^2]. I've also verified this manually, just to be on the safe side. - No data. 
It's conceivable that the query will check the history table just before the insertion, then check the notifications table just after the deletion. It turns out this isn't a problem either because the whole query sees the same DB snapshot [^3][^4]. I can't think of a way to test this as it's a race condition, but I'm confident the Postgres docs are accurate. Performance =========== I copied the relevant (non-PII) columns from Production for data going back to 2022-04-01. I then ran several tests. Queries using the new view still make use of indices on a per-table basis, as the following query plan illustrates: QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ GroupAggregate (cost=1130820.02..1135353.89 rows=46502 width=97) (actual time=629.863..756.703 rows=72 loops=1) Group Key: notifications_all_time_view.template_id, notifications_all_time_view.sent_by, notifications_all_time_view.rate_multiplier, notifications_all_time_view.international -> Sort (cost=1130820.02..1131401.28 rows=232506 width=85) (actual time=629.756..708.914 rows=217563 loops=1) Sort Key: notifications_all_time_view.template_id, notifications_all_time_view.sent_by, notifications_all_time_view.rate_multiplier, notifications_all_time_view.international Sort Method: external merge Disk: 9320kB -> Subquery Scan on notifications_all_time_view (cost=1088506.43..1098969.20 rows=232506 width=85) (actual time=416.118..541.669 rows=217563 loops=1) -> Unique (cost=1088506.43..1096644.14 rows=232506 width=725) (actual time=416.115..513.065 rows=217563 loops=1) -> Sort 
(cost=1088506.43..1089087.70 rows=232506 width=725) (actual time=416.115..451.190 rows=217563 loops=1) Sort Key: notifications_no_pii.id, notifications_no_pii.job_id, notifications_no_pii.service_id, notifications_no_pii.template_id, notifications_no_pii.key_type, notifications_no_pii.billable_units, notifications_no_pii.notification_type, notifications_no_pii.created_at, notifications_no_pii.sent_by, notifications_no_pii.notification_status, notifications_no_pii.international, notifications_no_pii.rate_multiplier, notifications_no_pii.postage Sort Method: external merge Disk: 23936kB -> Append (cost=114.42..918374.12 rows=232506 width=725) (actual time=2.051..298.229 rows=217563 loops=1) -> Bitmap Heap Scan on notifications_no_pii (cost=114.42..8557.55 rows=2042 width=113) (actual time=1.405..1.442 rows=0 loops=1) Recheck Cond: ((service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'::uuid) AND (notification_type = 'sms'::notification_type) AND (notification_status = ANY ('{sending,sent,delivered,pending,temporary-failure,permanent-failure}'::text[])) AND (created_at >= '2022-05-01 23:00:00'::timestamp without time zone) AND (created_at < '2022-05-02 23:00:00'::timestamp without time zone)) Filter: ((key_type)::text = ANY ('{normal,team}'::text[])) -> Bitmap Index Scan on ix_notifications_no_piiservice_id_composite (cost=0.00..113.91 rows=2202 width=0) (actual time=1.402..1.439 rows=0 loops=1) Index Cond: ((service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'::uuid) AND (notification_type = 'sms'::notification_type) AND (notification_status = ANY ('{sending,sent,delivered,pending,temporary-failure,permanent-failure}'::text[])) AND (created_at >= '2022-05-01 23:00:00'::timestamp without time zone) AND (created_at < '2022-05-02 23:00:00'::timestamp without time zone)) -> Index Scan using ix_notifications_history_no_pii_service_id_composite on notifications_history_no_pii (cost=0.70..906328.97 rows=230464 width=113) (actual time=0.645..281.612 rows=217563 loops=1) Index 
Cond: ((service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'::uuid) AND ((key_type)::text = ANY ('{normal,team}'::text[])) AND (notification_type = 'sms'::notification_type) AND (created_at >= '2022-05-01 23:00:00'::timestamp without time zone) AND (created_at < '2022-05-02 23:00:00'::timestamp without time zone)) Filter: (notification_status = ANY ('{sending,sent,delivered,pending,temporary-failure,permanent-failure}'::text[])) Planning Time: 18.032 ms Execution Time: 759.001 ms (21 rows) Queries using the new view appear to be slower than without, but the differences I've seen are minimal: the original queries execute in seconds locally and in Production, so it's not a big issue. Notes: Performance ================== I downloaded a minimal set of columns for testing: \copy ( select id, notification_type, key_type, created_at, service_id, template_id, sent_by, rate_multiplier, international, billable_units, postage, job_id, notification_status from notifications ) to 'notifications.csv' delimiter ',' csv header; CREATE TABLE notifications_no_pii ( id uuid NOT NULL, notification_type public.notification_type NOT NULL, key_type character varying(255) NOT NULL, created_at timestamp without time zone NOT NULL, service_id uuid, template_id uuid, sent_by character varying, rate_multiplier numeric, international boolean, billable_units integer NOT NULL, postage character varying, job_id uuid, notification_status text ); copy notifications_no_pii from '/Users/ben.thorner/Desktop/notifications.csv' delimiter ',' csv header; CREATE INDEX ix_notifications_no_piicreated_at ON notifications_no_pii USING btree (created_at); CREATE INDEX ix_notifications_no_piijob_id ON notifications_no_pii USING btree (job_id); CREATE INDEX ix_notifications_no_piinotification_type_composite ON notifications_no_pii USING btree (notification_type, notification_status, created_at); CREATE INDEX ix_notifications_no_piiservice_created_at ON notifications_no_pii USING btree (service_id, created_at); 
CREATE INDEX ix_notifications_no_piiservice_id_composite ON notifications_no_pii USING btree (service_id, notification_type, notification_status, created_at); CREATE INDEX ix_notifications_no_piitemplate_id ON notifications_no_pii USING btree (template_id); And similarly for the history table. I then created a separate view across both of these temporary tables using just these columns. To test performance I created some queries that reflect what is run by the billing [^5] and status [^6] tasks e.g. explain analyze select template_id, sent_by, rate_multiplier, international, sum(billable_units), count(*) from notifications_all_time_view where notification_status in ('sending', 'sent', 'delivered', 'pending', 'temporary-failure', 'permanent-failure') and key_type in ('normal', 'team') and created_at >= '2022-05-01 23:00' and created_at < '2022-05-02 23:00' and notification_type = 'sms' and service_id = 'c5956607-20b1-48b4-8983-85d11404e61f' group by 1,2,3,4; explain analyze select template_id, job_id, key_type, notification_status, count(*) from notifications_all_time_view where created_at >= '2022-05-01 23:00' and created_at < '2022-05-02 23:00' and notification_type = 'sms' and service_id = 'c5956607-20b1-48b4-8983-85d11404e61f' and key_type in ('normal', 'team') group by 1,2,3,4; Between running queries I restarted my local database and also ran a command to purge disk caches [^7]. I tested on a few services: - c5956607-20b1-48b4-8983-85d11404e61f on 2022-05-02 (high volume) - 0cc696c6-b792-409d-99e9-64232f461b0f on 2022-04-06 (highest volume) - 01135db6-7819-4121-8b97-4aa2d741e372 on 2022-04-14 (very low volume) All execution results are of the same magnitude using the view compared to the worst case of either table on its own. 
[^1]: `00a04ebf54/app/dao/notifications_dao.py (L389)` [^2]: https://stackoverflow.com/questions/49925/what-is-the-difference-between-union-and-union-all [^3]: https://www.postgresql.org/docs/current/transaction-iso.html [^4]: https://dba.stackexchange.com/questions/210485/can-sub-selects-change-in-one-single-query-in-a-read-committed-transaction [^5]: `00a04ebf54/app/dao/fact_billing_dao.py (L471)` [^6]: `00a04ebf54/app/dao/fact_notification_status_dao.py (L58)` [^7]: https://stackoverflow.com/questions/28845524/echo-3-proc-sys-vm-drop-caches-on-mac-osx	2022-05-19 15:14:32 +01:00
Ben Thorner	d153603c5c	Remove redundant DAO function / consolidate tests The tests were previously covering a shared function that's now only used once, so I've inlined it and merged the tests together with a common naming that's consistent with the code under test.	2022-05-19 14:12:28 +01:00
Pea Tyczynska	c4162748de	Rename variable names for consistency between similar functions Co-authored-by: Leo Hemsted <leo.hemsted@digital.cabinet-office.gov.uk>	2022-05-18 12:30:35 +01:00
Pea Tyczynska	112c2ddf72	Use the new subquery in fetch_sms_billing_for_organisation() This is so we have granular data about billable units and costs so that we can handle multiple sms rates within one financial year. We also cast chargeable_units_used_so_far in that subquery to integer so we don't have type mismatch. Co-authored-by: Leo Hemsted <leo.hemsted@digital.cabinet-office.gov.uk>	2022-05-18 12:30:24 +01:00
Pea Tyczynska	150eaf019b	Add query_organisation_sms_usage_for_year to help fetch sms totals for org This is functionally very similar to query_service_sms_usage_for_year, except this query filters by organisation and returns for all live services within that organisation. To ensure that the cumulative free allowance counter resets properly for each service, we use the `partition_by` flag to group up the window function[^1]. This magically handles all the free allowances independently for each service. [^1]: https://www.postgresql.org/docs/current/tutorial-window.html Co-authored-by: Leo Hemsted <leo.hemsted@digital.cabinet-office.gov.uk>	2022-05-18 12:30:24 +01:00
Ben Thorner	7e536d1c2b	Merge pull request #3542 from alphagov/optimise-historic-billing-query-182116071 Optimise billing query for notification history	2022-05-18 10:27:43 +01:00
Ben Thorner	4a520bce78	Optimise billing query for notification history This follows the same pattern as for status aggregations [^1]. We haven't seen this problem for a long time because of [^2], but now we're trying to re-run the aggregation for some incorrect rows it's becoming apparent we need to fix it. The following query currently fails in Production after the 30 min SQLAlchemy timeout: select template_id, rate_multiplier, international, sum(billable_units), count(*) from notification_history where notification_status in ('delivered', 'sending') and key_type != 'test' and notification_type = 'sms' and service_id = '539d63a1-701d-400d-ab11-f3ee2319d4d4' and created_at >= '2021-07-07 23:00' and created_at < '2021-07-08 23:00' group by 1,2,3,4; Running a quick "explain analyze" with this change applied returns near immediately, but hangs without it. This is enough evidence for me that this change will fix the issue. [^1]: https://github.com/alphagov/notifications-api/pull/3417 [^2]: `e5c76ffda7`	2022-05-17 17:29:50 +01:00
Ben Thorner	e4a45047b3	Merge pull request #3538 from alphagov/fix-out-of-date-status-182116071 Fix out-of-date rows in ft_notification_status	2022-05-17 10:26:22 +01:00
Ben Thorner	ed379a3724	Fix out-of-date rows in ft_notification_status This can happen in the following scenario (primarily for letters): 1. A service has a mixture of "delivered" and "sending" letters, which the status task aggregates into two rows: sending | 123, delivered | 456 2. After the 7 day retention has passed, only the "delivered" letters will be archived [^1]. 3. The status task now looks at the history table [^2], which means it only sees the "delivered" letters. 4. The "sending" letters are eventually "delivered" and archived (before the 10 day aggregation cutoff). 5. But the status aggregation task doesn't run. This commit fixes (5). [^1]: https://github.com/alphagov/notifications-api/pull/3063 [^2]: `f87ebb094d/app/dao/fact_notification_status_dao.py (L51)`	2022-05-11 11:04:56 +01:00
Ben Thorner	1d157836ad	Remove redundant fields from service usage APIs These are no longer used since [^1] and [^2]. [^1]: https://github.com/alphagov/notifications-admin/pull/4225 [^2]: https://github.com/alphagov/notifications-admin/pull/4229	2022-05-10 15:50:22 +01:00
Leo Hemsted	51646af92e	remove provider_rates table this was added five years ago but never used. if we want to bring back variable rates per client we might as well get a fresh start since a lot has changed since then.	2022-05-03 14:42:59 +01:00
Ben Thorner	ebaef4b57b	Add "charged_units" to service usage APIs This can be calculated from the "free_allowance_used" field and the "chargeable_units" field, but having it included separately is more convenient as it can be used directly in Admin [^1]. [^1]: `417e7370bb/app/templates/views/usage.html (L38-L39)`	2022-04-27 15:57:35 +01:00
Ben Thorner	555868c442	Add "free_allowance_units" to service usage APIs This represents the number of chargeable_units that were actually free due to the free allowance - they won't be included in "cost". Although the existing calculations in Admin [^1][^2] will still be correct with a change in SMS rates - it's cost that's the problem - it makes sense to have all the knowledge about calculating usage consistently in these two APIs. Note that the Integer casting is covered by the API-level tests in test_rest. [^1]: `474d7dfda8/app/main/views/dashboard.py (L490)` [^2]: `c63660d56d/app/main/views/dashboard.py (L350)`	2022-04-27 15:57:34 +01:00
Ben Thorner	cd84928a1e	Add costs to each row in yearly usage API This will replace the manual calculations in Admin [^1][^2] for SMS and also in API [^3] for annual letter costs. Doing the calculation here also means we correctly attribute free allowance to the earliest rows in the billing table - Admin doesn't know when a given rate was applied so can't do this without making assumptions about when we change our rates. Since the calculation now depends on annual billing, we need to change all the tests to make sure a suitable row exists. I've also adjusted the test data to match the assumption that there can only be one SMS rate per bst_date. Note about "OVER" clause ======================== Using "rows=" ("ROWS BETWEEN") makes more sense than "range=" as we want the remainder to be incremental within each group in a "GROUP BY" clause, as well as between groups i.e # ROWS BETWEEN (arbitrary numbers to illustrate) date=2021-04-03, units=3, cost=3.29 date=2021-04-03, units=2, cost=4.17 date=2021-04-04, units=2, cost=5.10 vs. # RANGE BETWEEN date=2021-04-03, units=3, cost=4.17 date=2021-04-03, units=2, cost=4.17 date=2021-04-04, units=2, cost=5.10 See [^4] for more details and examples. [^1]: https://github.com/alphagov/notifications-admin/blob/master/app/templates/views/usage.html#L60 [^2]: `072c3b2079/app/billing/billing_schemas.py (L37)` [^3]: `474d7dfda8/app/templates/views/usage.html (L98)` [^4]: https://learnsql.com/blog/difference-between-rows-range-window-functions/	2022-04-27 15:57:33 +01:00
Ben Thorner	fc378fed96	Prepare to replace "billing_units" in usage APIs There is no such thing as a "billing unit". The data this field contained was also a confusing mixture of two types: - For emails and letters, it was just "notifications_sent". - For SMS, it was the "chargeable_units" (billable * multiplier). This replaces the single, ambiguous "billing_units" field with "chargeable_units" and "notifications_sent" in both usage APIs. Once Admin is using them we can remove the old field.	2022-04-27 15:57:30 +01:00
Ben Thorner	80efdd2ec6	Refactor usage API queries into functions per type This makes it easier to extend each function with costs and free allowances - especially for SMS. I've chosen to duplicate the "WHERE" clause in each subquery vs. the top-level query. This will make more sense in later commits where we start adding free allowance calculations, which need to be done on a yearly basis - knowledge the subqueries should have.	2022-04-27 15:17:18 +01:00
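
The note about the "OVER" clause in commit cd84928a1e (ROWS BETWEEN vs. RANGE BETWEEN) can be illustrated with a minimal sketch. This is not the repository's query: it assumes SQLite 3.25 or later through Python's standard `sqlite3` module rather than Postgres, and the `usage_rows` table, its values, and the `running_totals` helper are hypothetical names made up for the illustration. The two rows that share a date are given equal units so the result does not depend on how the database orders ties.

```python
import sqlite3

# Hypothetical sample data (not from the repository): two rows share a date.
SAMPLE_ROWS = [("2021-04-03", 2), ("2021-04-03", 2), ("2021-04-04", 3)]


def running_totals(frame):
    """Return cumulative unit totals using a ROWS or RANGE window frame."""
    if frame not in ("ROWS", "RANGE"):
        raise ValueError("frame must be 'ROWS' or 'RANGE'")
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE usage_rows (day TEXT, units INTEGER)")
        conn.executemany("INSERT INTO usage_rows VALUES (?, ?)", SAMPLE_ROWS)
        # ROWS advances the frame one physical row at a time.
        # RANGE treats rows with the same ORDER BY value (peers) as one step.
        query = (
            "SELECT SUM(units) OVER ("
            f"ORDER BY day {frame} BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW"
            ") AS running FROM usage_rows ORDER BY day, running"
        )
        return [row[0] for row in conn.execute(query)]
    finally:
        conn.close()


if __name__ == "__main__":
    print("ROWS: ", running_totals("ROWS"))   # [2, 4, 7]
    print("RANGE:", running_totals("RANGE"))  # [4, 4, 7]
```

With ROWS the running total grows row by row even inside a single date, which matches the incremental remainder the commit message asks for. With RANGE both rows for 2021-04-03 report the same total, because they are peers under the ordering.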

1536 Commits