notifications-api

mirror of https://github.com/GSA/notifications-api.git synced 2025-12-24 01:11:38 -05:00

Author	SHA1	Message	Date
Christa Hartsock	af6495cd4c	Get tests passing locally When we cloned the repository and started making modifications, we didn't initially keep tests in step. This commit tries to get us to a clean test run by skipping tests that are failing and removing some that we no longer expect to use (MMG, Firetext), with the intention that we will come back in future and update or remove them as appropriate. To find all tests skipped, search for `@pytest.mark.skip(reason="Needs updating for TTS:`. There will be a brief description of the work that needs to be done to get them passing, if known. Delete that line to make them run in a standard test run (`make test`).	2022-07-07 15:41:15 -07:00
Christa Hartsock	bf21aa882b	Run tests in CI	2022-07-07 15:41:15 -07:00
Jim Moffet	3ac3dcd09d	config detect-secrets	2022-06-29 08:47:36 -07:00
Jim Moffet	2047cec495	Merge branch 'main' into jim/062522/configupdates	2022-06-25 18:37:13 -07:00
Jim Moffet	ac8d8d3c29	clean up config	2022-06-25 18:36:39 -07:00
Jim Moffet	6a4f34750e	clean up	2022-06-25 13:13:12 -07:00
Jim Moffet	2bcae20441	initial sms provider cleanup	2022-06-25 13:05:10 -07:00
Christa Hartsock	e773f937ed	WIP: local deployment	2022-06-23 13:39:05 -07:00
Jim Moffet	4030f9c8d7	config buckets	2022-06-23 13:05:09 -07:00
Jim Moffet	aa4ec532a4	implement SNS	2022-06-17 11:16:23 -07:00
Jim Moffet	79ba6cc1d1	disable cache persistence & env updates	2022-06-13 21:42:36 -07:00
Jim Moffet	60262d6031	config formatting	2022-06-13 13:45:07 -07:00
Jim Moffet	59b72f4853	add devcontainer configs and docker network orchestration	2022-06-13 13:16:32 -07:00
Ben Thorner	ee8e86f409	Bump utils to version 56.0.0 The only impactful change is the major version itself, where I've fixed the breaking changes due to the upgrade of PyPDF2 [^1] and checked there are no deprecation warnings when I run the tests. [^1]: https://github.com/alphagov/notifications-utils/pull/973	2022-06-01 14:27:25 +01:00
Ben Thorner	43dbc0891f	Merge pull request #3546 from alphagov/notification-view-178125825 Use notification view for status / billing tasks	2022-05-26 11:03:38 +01:00
Ben Thorner	aa20064f3f	Merge pull request #3545 from alphagov/remove-unused-function Remove redundant DAO function / consolidate tests	2022-05-26 11:03:30 +01:00
Katie Smith	82862112fa	Remove unused attributes on ServiceSchema This removes the `override_flag` attribute, which we stopped using in #98cd838510979d467191685aac38ff2e287a92c5	2022-05-25 11:35:44 +01:00
Katie Smith	4ffdd32054	Keep the marshmallow v2 way of serializing DateTimes Marshmallow v3 has changed the way that DateTimes get serialized (https://marshmallow.readthedocs.io/en/stable/upgrading.html#datetime-leaves-timezone-information-untouched-during-serialization). In order to avoid breaking anything, we want to keep the existing way of handling DateTimes for now - this could be changed later. We can't just pass a `format` argument to a DateTime field with the old format, which looked like this `2017-09-19T00:00:00+00:00`. When we tried that, Marshmallow then expected data that we are loading to also have that format, which it doesn't. This adds a new field, which serializes data in the old format but which doesn't require data that is being deserialized to have such a precise format.	2022-05-25 11:35:44 +01:00
Katie Smith	053397d8d4	Add deserialize method for ServiceSchema permissions field Due to a difference in marshmallow 3, when calling `service_schema.load()` with permissions data the permissions data was being dropped. We could fix this by allowing the ServiceSchema to include any unknown keys using but that have unexpected consequences. Instead, this change adds a method so that the schema knows how to deserialize permissions. This uses the same code that the `pre_load` method uses, but there were errors when it wasn't included in both places.	2022-05-25 11:35:44 +01:00
Katie Smith	57788e5da1	Add new schema, `TemplateSchemaNested` When we nest the `TemplateSchema` as a field on the `NotificationWithTemplateSchema`, we want to include the `is_precompiled_letter` field. However, we don't want the `is_precompiled_letter` in any of the other places that we use the `TemplateSchema`. The way we had the code was giving errors in version 3 of Marshmallow since `is_precompiled_letter` was not defined on the `TemplateSchema`. Instead of writing complicated logic around when the field should be included or excluded, this adds a new schema which we can use in the one place where we do want to include `is_precompiled_letter`.	2022-05-25 11:35:44 +01:00
Katie Smith	cfb6c9abc0	Make pre/post processors return modified data https://marshmallow.readthedocs.io/en/stable/upgrading.html#pre-post-processors-must-return-modified-data We had a few cases where the pre/post processors weren't returning anything. They now need to return the modified data.	2022-05-25 11:35:44 +01:00
Katie Smith	ab2c33f1a3	Replace `dump_to` with `data_key` https://marshmallow.readthedocs.io/en/stable/upgrading.html#load-from-and-dump-to-are-merged-into-data-key No change to how things work here, just a kwarg that has been renamed.	2022-05-25 11:35:44 +01:00
Katie Smith	28e08b429d	Keep behaviour of deserializing unknown keys Keep the marshmallow 2 behaviour of dropping unknown keys https://marshmallow.readthedocs.io/en/stable/upgrading.html#schemas-raise-validationerror-when-deserializing-data-with-unknown-keys The new default is to raise an error for unknown keys. This also adds the optional fields that the `EmailDataSchema` can have to the schema - `next` and `admin_base_url`.	2022-05-25 11:35:44 +01:00
Katie Smith	fc9b3bea1d	Make the post and pre decorators take kwargs https://marshmallow.readthedocs.io/en/stable/upgrading.html#decorated-methods-and-handle-error-receive-many-and-partial	2022-05-25 11:35:44 +01:00
Katie Smith	8ae2b0bb31	Replace how `.dump` is called As with `.load`, only data is now returned instead of a tuple.	2022-05-25 11:35:44 +01:00
Katie Smith	bd4f74b359	Replace how `.load` is called https://marshmallow.readthedocs.io/en/stable/upgrading.html#schemas-are-always-strict `.load` doesn't return a `(data, errors)` tuple any more - only data is returned. A `ValidationError` is raised if validation fails. The code now relies on the `marshmallow_validation_error` error handler to handle errors instead of having to raise an `InvalidRequest`. This has no effect on the response that is returned (a test has been modified to check). Also added a new `password` field to the `UserSchema` so that we don't have to specially check for password errors in the `.create_user` endpoint - we can let marshmallow handle them.	2022-05-25 11:35:44 +01:00
Ben Thorner	8e837cf681	Use table class directly instead of "table" var In response to [^1]. [^1]: https://github.com/alphagov/notifications-api/pull/3546#discussion_r879541366	2022-05-24 10:16:28 +01:00
Ben Thorner	33645c7747	Use notification view for status / billing tasks This fixes a bug where (letter) notifications left in sending would temporarily get excluded from billing and status calculations once the service retention period had elapsed, and then get included once again when they finally get marked as delivered.* Status and billing tasks shouldn't need to have knowledge about which table their data is in and getting this wrong is the fundamental cause of the bug here. Adding a view across both tables abstracts this away while keeping the query complexity the same. Using a view also has the added benefit that we no longer need to care when the status / billing tasks run in comparison to the deletion task, since we will retrieve the same data irrespective (see below for a more detailed discussion on data integrity). Such a scenario is rare but has happened. A New View ========== I've included all the columns that are shared between the two tables, even though only a subset are actually needed. Having extra columns has no impact and may be useful in future. Although the view isn't actually a table, SQLAlchemy appears to wrap it without any issues, noting that the package doesn't have any direct support for "view models". Because we're never inserting data, we don't need most of the kwargs when defining columns. Note that the "default" kwarg doesn't affect data that's retrieved, only data that's written (if no value is set). Data Integrity ============== The (new) tests cover the main scenarios. We need to be careful with how the view interacts with the deletion / archiving task. There are two concerns here: - Duplicates. The deletion task inserts before it deletes [^1], so we could end up double counting. It turns out this isn't a problem because a Postgres UNION is an implicit "DISTINCT" [^2]. I've also verified this manually, just to be on the safe side. - No data. It's conceivable that the query will check the history table just before the insertion, then check the notifications table just after the deletion. It turns out this isn't a problem either because the whole query sees the same DB snapshot [^3][^4]. I can't think of a way to test this as it's a race condition, but I'm confident the Postgres docs are accurate. Performance =========== I copied the relevant (non-PII) columns from Production for data going back to 2022-04-01. I then ran several tests. Queries using the new view still make use of indices on a per-table basis, as the following query plan illustrates: QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ GroupAggregate (cost=1130820.02..1135353.89 rows=46502 width=97) (actual time=629.863..756.703 rows=72 loops=1) Group Key: notifications_all_time_view.template_id, notifications_all_time_view.sent_by, notifications_all_time_view.rate_multiplier, notifications_all_time_view.international -> Sort (cost=1130820.02..1131401.28 rows=232506 width=85) (actual time=629.756..708.914 rows=217563 loops=1) Sort Key: notifications_all_time_view.template_id, notifications_all_time_view.sent_by, notifications_all_time_view.rate_multiplier, notifications_all_time_view.international Sort Method: external merge Disk: 9320kB -> Subquery Scan on notifications_all_time_view (cost=1088506.43..1098969.20 rows=232506 width=85) (actual time=416.118..541.669 rows=217563 loops=1) -> Unique (cost=1088506.43..1096644.14 rows=232506 width=725) (actual time=416.115..513.065 rows=217563 loops=1) -> Sort (cost=1088506.43..1089087.70 rows=232506 width=725) (actual time=416.115..451.190 rows=217563 loops=1) Sort Key: notifications_no_pii.id, notifications_no_pii.job_id, notifications_no_pii.service_id, notifications_no_pii.template_id, notifications_no_pii.key_type, notifications_no_pii.billable_units, notifications_no_pii.notification_type, notifications_no_pii.created_at, notifications_no_pii.sent_by, notifications_no_pii.notification_status, notifications_no_pii.international, notifications_no_pii.rate_multiplier, notifications_no_pii.postage Sort Method: external merge Disk: 23936kB -> Append (cost=114.42..918374.12 rows=232506 width=725) (actual time=2.051..298.229 rows=217563 loops=1) -> Bitmap Heap Scan on notifications_no_pii (cost=114.42..8557.55 rows=2042 width=113) (actual time=1.405..1.442 rows=0 loops=1) Recheck Cond: ((service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'::uuid) AND (notification_type = 'sms'::notification_type) AND (notification_status = ANY ('{sending,sent,delivered,pending,temporary-failure,permanent-failure}'::text[])) AND (created_at >= '2022-05-01 23:00:00'::timestamp without time zone) AND (created_at < '2022-05-02 23:00:00'::timestamp without time zone)) Filter: ((key_type)::text = ANY ('{normal,team}'::text[])) -> Bitmap Index Scan on ix_notifications_no_piiservice_id_composite (cost=0.00..113.91 rows=2202 width=0) (actual time=1.402..1.439 rows=0 loops=1) Index Cond: ((service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'::uuid) AND (notification_type = 'sms'::notification_type) AND (notification_status = ANY ('{sending,sent,delivered,pending,temporary-failure,permanent-failure}'::text[])) AND (created_at >= '2022-05-01 23:00:00'::timestamp without time zone) AND (created_at < '2022-05-02 23:00:00'::timestamp without time zone)) -> Index Scan using ix_notifications_history_no_pii_service_id_composite on notifications_history_no_pii (cost=0.70..906328.97 rows=230464 width=113) (actual time=0.645..281.612 rows=217563 loops=1) Index Cond: ((service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'::uuid) AND ((key_type)::text = ANY ('{normal,team}'::text[])) AND (notification_type = 'sms'::notification_type) AND (created_at >= '2022-05-01 23:00:00'::timestamp without time zone) AND (created_at < '2022-05-02 23:00:00'::timestamp without time zone)) Filter: (notification_status = ANY ('{sending,sent,delivered,pending,temporary-failure,permanent-failure}'::text[])) Planning Time: 18.032 ms Execution Time: 759.001 ms (21 rows) Queries using the new view appear to be slower than without, but the differences I've seen are minimal: the original queries execute in seconds locally and in Production, so it's not a big issue. Notes: Performance ================== I downloaded a minimal set of columns for testing: \copy ( select id, notification_type, key_type, created_at, service_id, template_id, sent_by, rate_multiplier, international, billable_units, postage, job_id, notification_status from notifications ) to 'notifications.csv' delimiter ',' csv header; CREATE TABLE notifications_no_pii ( id uuid NOT NULL, notification_type public.notification_type NOT NULL, key_type character varying(255) NOT NULL, created_at timestamp without time zone NOT NULL, service_id uuid, template_id uuid, sent_by character varying, rate_multiplier numeric, international boolean, billable_units integer NOT NULL, postage character varying, job_id uuid, notification_status text ); copy notifications_no_pii from '/Users/ben.thorner/Desktop/notifications.csv' delimiter ',' csv header; CREATE INDEX ix_notifications_no_piicreated_at ON notifications_no_pii USING btree (created_at); CREATE INDEX ix_notifications_no_piijob_id ON notifications_no_pii USING btree (job_id); CREATE INDEX ix_notifications_no_piinotification_type_composite ON notifications_no_pii USING btree (notification_type, notification_status, created_at); CREATE INDEX ix_notifications_no_piiservice_created_at ON notifications_no_pii USING btree (service_id, created_at); CREATE INDEX ix_notifications_no_piiservice_id_composite ON notifications_no_pii USING btree (service_id, notification_type, notification_status, created_at); CREATE INDEX ix_notifications_no_piitemplate_id ON notifications_no_pii USING btree (template_id); And similarly for the history table. I then created a sepatate view across both of these temporary tables using just these columns. To test performance I created some queries that reflect what is run by the billing [^5] and status [^6] tasks e.g. explain analyze select template_id, sent_by, rate_multiplier, international, sum(billable_units), count() from notifications_all_time_view where notification_status in ('sending', 'sent', 'delivered', 'pending', 'temporary-failure', 'permanent-failure') and key_type in ('normal', 'team') and created_at >= '2022-05-01 23:00' and created_at < '2022-05-02 23:00' and notification_type = 'sms' and service_id = 'c5956607-20b1-48b4-8983-85d11404e61f' group by 1,2,3,4; explain analyze select template_id, job_id, key_type, notification_status, count(*) from notifications_all_time_view where created_at >= '2022-05-01 23:00' and created_at < '2022-05-02 23:00' and notification_type = 'sms' and service_id = 'c5956607-20b1-48b4-8983-85d11404e61f' and key_type in ('normal', 'team') group by 1,2,3,4; Between running queries I restarted my local database and also ran a command to purge disk caches [^7]. I tested on a few services: - c5956607-20b1-48b4-8983-85d11404e61f on 2022-05-02 (high volume) - 0cc696c6-b792-409d-99e9-64232f461b0f on 2022-04-06 (highest volume) - 01135db6-7819-4121-8b97-4aa2d741e372 on 2022-04-14 (very low volume) All execution results are of the same magnitude using the view compared to the worst case of either table on its own. [^1]: `00a04ebf54/app/dao/notifications_dao.py (L389)` [^2]: https://stackoverflow.com/questions/49925/what-is-the-difference-between-union-and-union-all [^3]: https://www.postgresql.org/docs/current/transaction-iso.html [^4]: https://dba.stackexchange.com/questions/210485/can-sub-selects-change-in-one-single-query-in-a-read-committed-transaction [^5]: `00a04ebf54/app/dao/fact_billing_dao.py (L471)` [^6]: `00a04ebf54/app/dao/fact_notification_status_dao.py (L58)` [^7]: https://stackoverflow.com/questions/28845524/echo-3-proc-sys-vm-drop-caches-on-mac-osx	2022-05-19 15:14:32 +01:00
Ben Thorner	d153603c5c	Remove redundant DAO function / consolidate tests The tests were previously covering a shared function that's now only used once, so I've inlined it and merged the tests together with a common naming that's consistent with the code under test.	2022-05-19 14:12:28 +01:00
Katie Smith	906165eeb5	Remove `strict = True` from Marshmallow schemas Since Marshmallow 3, schemas are now strict by default, and the strict parameter has been removed. There were some schemas where we had not specified `strict = True`, but we weren't relying on strict being False, so this does not cause any issues. https://marshmallow.readthedocs.io/en/stable/upgrading.html#schemas-are-always-strict	2022-05-19 13:46:49 +01:00
Pea Tyczynska	00a04ebf54	Merge pull request #3529 from alphagov/multiple_rates_sql_magic_for_org_view Show correct sms cost in org view if sms rates change within financial year	2022-05-19 12:09:42 +01:00
Ben Thorner	2ad8c9d353	Merge pull request #3543 from alphagov/run-billing-for-longer-178125825 Recalculate billing rows for 10 days (prev. 4)	2022-05-19 11:27:28 +01:00
Pea Tyczynska	c4162748de	Rename variable names for consistency between similar functions Co-authored-by: Leo Hemsted <leo.hemsted@digital.cabinet-office.gov.uk>	2022-05-18 12:30:35 +01:00
Pea Tyczynska	112c2ddf72	Use the new subquery in fetch_sms_billing_for_organisation() This is so we have granular data about billable units and costs so that we can handle multiple sms rates within one financial year. We also cast chargeable_units_used_so_far in that subquery to integer so we don't have type mismatch. Co-authored-by: Leo Hemsted <leo.hemsted@digital.cabinet-office.gov.uk>	2022-05-18 12:30:24 +01:00
Pea Tyczynska	150eaf019b	Add query_organisation_sms_usage_for_year to help fetch sms totals for org This is functionally very similar to query_service_sms_usage_for_year, except this query filters by organisation and returns for all live services within that organisation. To ensure that the cumulative free allowance counter resets properly for each service, we use the `partition_by` flag to group up the window function[^1]. This magically handles all the free allowances independently for each service. [^1]: https://www.postgresql.org/docs/current/tutorial-window.html Co-authored-by: Leo Hemsted <leo.hemsted@digital.cabinet-office.gov.uk>	2022-05-18 12:30:24 +01:00
Ben Thorner	7e536d1c2b	Merge pull request #3542 from alphagov/optimise-historic-billing-query-182116071 Optimise billing query for notification history	2022-05-18 10:27:43 +01:00
Ben Thorner	458e997706	Recalculate billing rows for 10 days (prev. 4) This effectively reverts [^1], which was only a temporary change. I suspect the performance problem will go away with [^2]. While we've clearly been managing without this change, it resulted in several rows being left as incorrect when letter receipts were delayed. It makes sense for us to run this task for the same period as we do to aggregate statuses, as status affects billing. [^1]: `e5c76ffda7` [^2]: https://github.com/alphagov/notifications-api/pull/3542	2022-05-17 17:38:08 +01:00
Ben Thorner	4a520bce78	Optimise billing query for notification history This follows the same pattern as for status aggregations [^1]. We haven't seen this problem for a long time because of [^2], but now we're trying to re-run the aggregation for some incorrect rows it's becoming apparent we need to fix it. The following query currently fails in Production after the 30 min SQLAlchemy timeout: select template_id, rate_multiplier, international, sum(billable_units), count(*) from notification_history where notification_status in ('delivered', 'sending') and key_type != 'test' and notification_type = 'sms' and service_id = '539d63a1-701d-400d-ab11-f3ee2319d4d4' and created_at >= '2021-07-07 23:00' and created_at < '2021-07-08 23:00' group by 1,2,3,4; Running a quick "explain analyze" with this change applied returns near immediately, but hangs without it. This is enough evidence for me that this change will fix the issue. [^1]: https://github.com/alphagov/notifications-api/pull/3417 [^2]: `e5c76ffda7`	2022-05-17 17:29:50 +01:00
Ben Thorner	e4a45047b3	Merge pull request #3538 from alphagov/fix-out-of-date-status-182116071 Fix out-of-date rows in ft_notification_status	2022-05-17 10:26:22 +01:00
Ben Thorner	318cc1284a	Merge pull request #3534 from alphagov/remove-redundant-usage-fields-181935935 Remove redundant fields from service usage APIs	2022-05-11 15:12:31 +01:00
Ben Thorner	ed379a3724	Fix out-of-date rows in ft_notification_status This can happen in the following scenario (primarily for letters): 1. A service has a mixture of "delivered" and "sending" letters, which the status task aggregates into two rows: sending \| 123 delivered \| 456 2. After the 7 day retention has passed, only the "delivered" letters will be archived [^1]. 3. The status task now looks at the history table [^2], which means it only sees the "delivered" letters. 4. The "sending" letters are eventually "delivered" and archived (before the 10 day aggregation cutoff). 5. But the status aggregation task doesn't run. This commit fixes (5). [^1]: https://github.com/alphagov/notifications-api/pull/3063 [^2]: `f87ebb094d/app/dao/fact_notification_status_dao.py (L51)`	2022-05-11 11:04:56 +01:00
David McDonald	177b860865	Merge pull request #3539 from alphagov/adjust-resting-rates Adjust resting priorities for SMS providers	2022-05-11 09:48:24 +01:00
Ben Thorner	1d157836ad	Remove redundant fields from service usage APIs These are no longer used since [^1] and [^2]. [^1]: https://github.com/alphagov/notifications-admin/pull/4225 [^2]: https://github.com/alphagov/notifications-admin/pull/4229	2022-05-10 15:50:22 +01:00
David McDonald	f000e5a52b	Adjust resting priorities for SMS providers	2022-05-10 15:07:25 +01:00
Katie Smith	dd213b8d55	Ensure UserUpdatePasswordSchema only loads _password We had the `only` defined in the Meta class, and this wasn't working - any extra fields were also being loaded. This moves it to the point where the class is instantiated, which now works.	2022-05-06 10:14:50 +01:00
Katie Smith	685959de00	Don't serialize broadcast_messages in TemplateHistorySchema In `0282a76` we excluded serializing `broadcast_messages` on some schemas. This also excludes it from the TemplateHistory schema.	2022-05-06 10:14:50 +01:00
Katie Smith	ab199b6b05	Remove unused excluded fields from Marshmallow schemas We have a lot of cases in the schemas where we're excluding a field that doesn't actually exist on the schema anyway. This is often because a model has been deleted, and the schema has not been updated. These excluded fields have no effect at the moment, but Marshmallow 3 does raise an error if you try and exclude non-existent fields. There should be no change to what gets (de)serialized after this change.	2022-05-06 10:14:50 +01:00
Katie Smith	bd42bded0a	Delete unused Marshmallow schema validation functions	2022-05-04 14:37:04 +01:00
Leo Hemsted	51646af92e	remove provider_rates table this was added five years ago but never used. if we want to bring back variable rates per client we might as well get a fresh start since a lot has changed since then.	2022-05-03 14:42:59 +01:00
Ben Thorner	bb62d22f25	Merge pull request #3526 from alphagov/remove-reach Remove support for Reach provider	2022-04-29 13:21:27 +01:00

1 2 3 4 5 ...

5070 Commits