notifications-api

mirror of https://github.com/GSA/notifications-api.git synced 2025-12-12 00:02:36 -05:00

Author	SHA1	Message	Date
Cliff Hill	ab7387acd8	All string "constants" in app.models converted to app.enums. Signed-off-by: Cliff Hill <Clifford.hill@gsa.gov>	2024-02-28 12:43:33 -05:00
Cliff Hill	985ad27b3e	Getting imports right to use app.enums Signed-off-by: Cliff Hill <Clifford.hill@gsa.gov>	2024-02-28 12:43:32 -05:00
Cliff Hill	3982f061b6	Made enums.py for all the enums to avoid cyclic imports. Signed-off-by: Cliff Hill <Clifford.hill@gsa.gov>	2024-02-28 12:43:31 -05:00
Cliff Hill	43f18eed6a	More changes for enums. Signed-off-by: Cliff Hill <Clifford.hill@gsa.gov>	2024-02-28 12:41:57 -05:00
Cliff Hill	820ee5a942	Cleaning up a lot of things, getting Enums used everywhere. Signed-off-by: Cliff Hill <Clifford.hill@gsa.gov>	2024-02-28 12:40:52 -05:00
Kenneth Kehl	1ecb747c6d	reformat	2023-08-29 14:54:30 -07:00
Kenneth Kehl	85604e5394	more tests	2023-08-11 11:47:57 -07:00
Kenneth Kehl	15a70460bc	code review feedback	2023-06-01 07:07:10 -07:00
Kenneth Kehl	3b0d38ea39	more fix	2023-05-30 12:16:49 -07:00
Kenneth Kehl	08c1ad75c8	notify-260 remove server-side timezone handling	2023-05-10 08:39:50 -07:00
Steven Reilly	ff4190a8eb	Remove letters-related code (#175) This deletes a big ol' chunk of code related to letters. It's not everything—there are still a few things that might be tied to sms/email—but it's the heart of letters function. SMS and email function should be untouched by this. Areas affected: - Things obviously about letters - PDF tasks, used for precompiling letters - Virus scanning, used for those PDFs - FTP, used to send letters to the printer - Postage stuff	2023-03-02 20:20:31 -05:00
stvnrlly	99de747a36	fix formatting	2022-11-21 11:29:38 -05:00
stvnrlly	c8533ae524	pull timezone from utils for other pytz instances	2022-11-16 16:53:55 -05:00
stvnrlly	e6d30394ba	london → local	2022-11-16 14:11:52 -05:00
stvnrlly	b50cb4712f	tz utility swap and many test updates	2022-11-10 12:33:25 -05:00
stvnrlly	637fbdb891	broadcast flake8 cleanup	2022-10-25 11:53:24 -04:00
stvnrlly	53204c307b	tests are, uh, mostly passing	2022-10-05 01:12:35 +00:00
stvnrlly	57f4df8ed1	remove broadcast-related code, except migrations	2022-10-04 15:28:27 +00:00
Ben Thorner	33645c7747	Use notification view for status / billing tasks This fixes a bug where (letter) notifications left in sending would temporarily get excluded from billing and status calculations once the service retention period had elapsed, and then get included once again when they finally get marked as delivered.* Status and billing tasks shouldn't need to have knowledge about which table their data is in and getting this wrong is the fundamental cause of the bug here. Adding a view across both tables abstracts this away while keeping the query complexity the same. Using a view also has the added benefit that we no longer need to care when the status / billing tasks run in comparison to the deletion task, since we will retrieve the same data irrespective (see below for a more detailed discussion on data integrity). Such a scenario is rare but has happened. A New View ========== I've included all the columns that are shared between the two tables, even though only a subset are actually needed. Having extra columns has no impact and may be useful in future. Although the view isn't actually a table, SQLAlchemy appears to wrap it without any issues, noting that the package doesn't have any direct support for "view models". Because we're never inserting data, we don't need most of the kwargs when defining columns. Note that the "default" kwarg doesn't affect data that's retrieved, only data that's written (if no value is set). Data Integrity ============== The (new) tests cover the main scenarios. We need to be careful with how the view interacts with the deletion / archiving task. There are two concerns here: - Duplicates. The deletion task inserts before it deletes [^1], so we could end up double counting. It turns out this isn't a problem because a Postgres UNION is an implicit "DISTINCT" [^2]. I've also verified this manually, just to be on the safe side. - No data. 
It's conceivable that the query will check the history table just before the insertion, then check the notifications table just after the deletion. It turns out this isn't a problem either because the whole query sees the same DB snapshot [^3][^4]. I can't think of a way to test this as it's a race condition, but I'm confident the Postgres docs are accurate. Performance =========== I copied the relevant (non-PII) columns from Production for data going back to 2022-04-01. I then ran several tests. Queries using the new view still make use of indices on a per-table basis, as the following query plan illustrates: QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ GroupAggregate (cost=1130820.02..1135353.89 rows=46502 width=97) (actual time=629.863..756.703 rows=72 loops=1) Group Key: notifications_all_time_view.template_id, notifications_all_time_view.sent_by, notifications_all_time_view.rate_multiplier, notifications_all_time_view.international -> Sort (cost=1130820.02..1131401.28 rows=232506 width=85) (actual time=629.756..708.914 rows=217563 loops=1) Sort Key: notifications_all_time_view.template_id, notifications_all_time_view.sent_by, notifications_all_time_view.rate_multiplier, notifications_all_time_view.international Sort Method: external merge Disk: 9320kB -> Subquery Scan on notifications_all_time_view (cost=1088506.43..1098969.20 rows=232506 width=85) (actual time=416.118..541.669 rows=217563 loops=1) -> Unique (cost=1088506.43..1096644.14 rows=232506 width=725) (actual time=416.115..513.065 rows=217563 loops=1) -> Sort 
(cost=1088506.43..1089087.70 rows=232506 width=725) (actual time=416.115..451.190 rows=217563 loops=1) Sort Key: notifications_no_pii.id, notifications_no_pii.job_id, notifications_no_pii.service_id, notifications_no_pii.template_id, notifications_no_pii.key_type, notifications_no_pii.billable_units, notifications_no_pii.notification_type, notifications_no_pii.created_at, notifications_no_pii.sent_by, notifications_no_pii.notification_status, notifications_no_pii.international, notifications_no_pii.rate_multiplier, notifications_no_pii.postage Sort Method: external merge Disk: 23936kB -> Append (cost=114.42..918374.12 rows=232506 width=725) (actual time=2.051..298.229 rows=217563 loops=1) -> Bitmap Heap Scan on notifications_no_pii (cost=114.42..8557.55 rows=2042 width=113) (actual time=1.405..1.442 rows=0 loops=1) Recheck Cond: ((service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'::uuid) AND (notification_type = 'sms'::notification_type) AND (notification_status = ANY ('{sending,sent,delivered,pending,temporary-failure,permanent-failure}'::text[])) AND (created_at >= '2022-05-01 23:00:00'::timestamp without time zone) AND (created_at < '2022-05-02 23:00:00'::timestamp without time zone)) Filter: ((key_type)::text = ANY ('{normal,team}'::text[])) -> Bitmap Index Scan on ix_notifications_no_piiservice_id_composite (cost=0.00..113.91 rows=2202 width=0) (actual time=1.402..1.439 rows=0 loops=1) Index Cond: ((service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'::uuid) AND (notification_type = 'sms'::notification_type) AND (notification_status = ANY ('{sending,sent,delivered,pending,temporary-failure,permanent-failure}'::text[])) AND (created_at >= '2022-05-01 23:00:00'::timestamp without time zone) AND (created_at < '2022-05-02 23:00:00'::timestamp without time zone)) -> Index Scan using ix_notifications_history_no_pii_service_id_composite on notifications_history_no_pii (cost=0.70..906328.97 rows=230464 width=113) (actual time=0.645..281.612 rows=217563 loops=1) Index 
Cond: ((service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'::uuid) AND ((key_type)::text = ANY ('{normal,team}'::text[])) AND (notification_type = 'sms'::notification_type) AND (created_at >= '2022-05-01 23:00:00'::timestamp without time zone) AND (created_at < '2022-05-02 23:00:00'::timestamp without time zone)) Filter: (notification_status = ANY ('{sending,sent,delivered,pending,temporary-failure,permanent-failure}'::text[])) Planning Time: 18.032 ms Execution Time: 759.001 ms (21 rows) Queries using the new view appear to be slower than without, but the differences I've seen are minimal: the original queries execute in seconds locally and in Production, so it's not a big issue. Notes: Performance ================== I downloaded a minimal set of columns for testing: \copy ( select id, notification_type, key_type, created_at, service_id, template_id, sent_by, rate_multiplier, international, billable_units, postage, job_id, notification_status from notifications ) to 'notifications.csv' delimiter ',' csv header; CREATE TABLE notifications_no_pii ( id uuid NOT NULL, notification_type public.notification_type NOT NULL, key_type character varying(255) NOT NULL, created_at timestamp without time zone NOT NULL, service_id uuid, template_id uuid, sent_by character varying, rate_multiplier numeric, international boolean, billable_units integer NOT NULL, postage character varying, job_id uuid, notification_status text ); copy notifications_no_pii from '/Users/ben.thorner/Desktop/notifications.csv' delimiter ',' csv header; CREATE INDEX ix_notifications_no_piicreated_at ON notifications_no_pii USING btree (created_at); CREATE INDEX ix_notifications_no_piijob_id ON notifications_no_pii USING btree (job_id); CREATE INDEX ix_notifications_no_piinotification_type_composite ON notifications_no_pii USING btree (notification_type, notification_status, created_at); CREATE INDEX ix_notifications_no_piiservice_created_at ON notifications_no_pii USING btree (service_id, created_at); 
CREATE INDEX ix_notifications_no_piiservice_id_composite ON notifications_no_pii USING btree (service_id, notification_type, notification_status, created_at); CREATE INDEX ix_notifications_no_piitemplate_id ON notifications_no_pii USING btree (template_id); And similarly for the history table. I then created a separate view across both of these temporary tables using just these columns. To test performance I created some queries that reflect what is run by the billing [^5] and status [^6] tasks e.g. explain analyze select template_id, sent_by, rate_multiplier, international, sum(billable_units), count(*) from notifications_all_time_view where notification_status in ('sending', 'sent', 'delivered', 'pending', 'temporary-failure', 'permanent-failure') and key_type in ('normal', 'team') and created_at >= '2022-05-01 23:00' and created_at < '2022-05-02 23:00' and notification_type = 'sms' and service_id = 'c5956607-20b1-48b4-8983-85d11404e61f' group by 1,2,3,4; explain analyze select template_id, job_id, key_type, notification_status, count(*) from notifications_all_time_view where created_at >= '2022-05-01 23:00' and created_at < '2022-05-02 23:00' and notification_type = 'sms' and service_id = 'c5956607-20b1-48b4-8983-85d11404e61f' and key_type in ('normal', 'team') group by 1,2,3,4; Between running queries I restarted my local database and also ran a command to purge disk caches [^7]. I tested on a few services: - c5956607-20b1-48b4-8983-85d11404e61f on 2022-05-02 (high volume) - 0cc696c6-b792-409d-99e9-64232f461b0f on 2022-04-06 (highest volume) - 01135db6-7819-4121-8b97-4aa2d741e372 on 2022-04-14 (very low volume) All execution results are of the same magnitude using the view compared to the worst case of either table on its own. 
[^1]: `00a04ebf54/app/dao/notifications_dao.py (L389)` [^2]: https://stackoverflow.com/questions/49925/what-is-the-difference-between-union-and-union-all [^3]: https://www.postgresql.org/docs/current/transaction-iso.html [^4]: https://dba.stackexchange.com/questions/210485/can-sub-selects-change-in-one-single-query-in-a-read-committed-transaction [^5]: `00a04ebf54/app/dao/fact_billing_dao.py (L471)` [^6]: `00a04ebf54/app/dao/fact_notification_status_dao.py (L58)` [^7]: https://stackoverflow.com/questions/28845524/echo-3-proc-sys-vm-drop-caches-on-mac-osx	2022-05-19 15:14:32 +01:00
Ben Thorner	6e8f121548	Standardise how we query midnight-to-midnight Partially addresses [1] (lots more detail to read in the comment). I've also added some tests for the status DAO function to confirm it behaves as expected across timezones. [1]: https://github.com/alphagov/notifications-api/pull/3437#discussion_r802634913	2022-02-10 10:51:27 +00:00
David McDonald	ec6ed3958c	Move `get_prev_next_pagination_links` to utils This will mean it can later be reused wherever we want	2021-12-10 12:26:57 +00:00
Pea Tyczynska	52c529ab3a	Use personalisation to set client_reference for letters which were sent through Notify interface only. This is done to avoid performance dip from additional operation for other notification types.	2021-03-24 14:55:10 +00:00
Ben Thorner	a91fde2fda	Run auto-correct on app/ and tests/	2021-03-12 11:45:45 +00:00
Katie Smith	6b8ebb3421	Fix linting errors	2021-02-16 09:03:38 +00:00
Chris Hill-Scott	78e87857e3	Don’t serialize nullable UUID columns to 'None' We should return a proper `None` instead, so it gets JSONified as `null` and returns what you’d expect when doing `bool(model.field)`	2021-01-15 13:15:00 +00:00
Pea Tyczynska	95deb5a52f	Move DATETIME_FORMAT from app to app.utils To avoid cyclical import issues	2020-12-18 17:39:35 +00:00
Pea Tyczynska	a186d2d296	Format sequential number into an 8 char long hex As per Vodafone spec for ibag format message number	2020-12-07 13:13:11 +00:00
Leo Hemsted	3dd15841a5	create get_dt_string_or_none string mostly to quash flake8 warnings	2020-07-28 12:10:18 +01:00
Leo Hemsted	7ecd7341b0	add broadcast to template_types and add broadcast_data had to go through the code and change a few places where we filter on template types. i specifically didn't worry about jobs or notifications. Also, add broadcast_data - a json column that might contain arbitrary broadcast data that we'll figure out as we go. We don't know what it'll look like, but it should be returned by the API	2020-07-06 15:47:13 +01:00
Katie Smith	13f7fecd5b	Move function to get archived email address value This function will be used when archiving services too, so it has been renamed and moved to `app/utils.py`.	2020-05-22 09:36:07 +01:00
Chris Hill-Scott	5ddb5a75da	Use new properties of utils Templates We’ve added some new properties to the templates in utils that we can use instead of doing weird things like `WithSubjectTemplate.__str__(another_instance)`	2020-04-15 16:40:42 +01:00
Leo Hemsted	d457db4164	make has_delete_task_run non-optional just to ensure people think about the value of it when using the function	2019-12-03 14:19:14 +00:00
Leo Hemsted	913cf5e12d	work out which table to get notification status data from previously we checked notifications table, and if the results were zero, checked the notification history table to see if there's data in there. When we know that data isn't in notifications, we're still checking. These queries take half a second per service, and we're doing at least ten for each of the five thousand services we have in notify. Most of these services have no data in either table for any given day, and we can reduce the amount of queries we do by only checking one table. Check the data retention for a service, and then if the date is older than the retention, get from history table. NOTE: This requires that the delete tasks haven't run yet for the day! If your retention is three days, this will look in the Notification table for data from three days ago - expecting that shortly after the task finishes, we'll delete that data.	2019-11-29 15:27:56 +00:00
Rebecca Law	6ccb242107	Merge pull request #2458 from alphagov/remove-unused-method Remove unused method.	2019-04-15 12:11:00 +01:00
Rebecca Law	1c68e0f565	Remove unused method. last_n_days was only being used in a test.	2019-04-12 10:26:46 +01:00
Chris Hill-Scott	5a7de22f55	Set default branding for NHS services The NHS is a special case because it’s not one organisation, but it does have one consistent brand. So anyone working for an NHS organisation should have their default branding set when they create a service, even if we know nothing about their specific organisation.	2019-04-08 10:27:26 +01:00
Chris Hill-Scott	f185dbecbe	Return rendered HTML when previewing a template If you’re trying to show what a Notify email will look like in your caseworking system all the API gives you at the moment is raw markdown (with the placeholders replaced). This isn’t that useful if your caseworkers have no idea what markdown is. If we also give teams the HTML then they can embed this in their systems, and the people using those systems will be able to see how headings, bulleted lists, etc. look.	2019-02-07 17:43:46 +00:00
Pea Tyczynska	ac3832a918	Remove old redis template cache	2019-01-15 14:46:40 +00:00
Pea Tyczynska	d36c4d8a78	Remove now unused methods that populated template usage redis cache	2019-01-15 14:38:45 +00:00
Katie Smith	ff06d120e8	Bump notifications-utils to 3.7.0 Bumped notifications-utils to 3.7.0. Version 3.7.0 includes the `convert_utc_to_bst` and `convert_bst_to_utc` functions and the `LETTER_PROCESSING_DEADLINE` constant, so these have been removed from this repo and anywhere using these has now been updated to get these from `notifications-utils`. Also bumped pytest by a patch version to bring in a bug fix.	2018-11-26 12:53:39 +00:00
Rebecca Law	d0cbdce6c4	Update the error message when a service does not have permission to send a notification type.	2018-11-20 11:01:48 +00:00
Leo Hemsted	267c4fc07b	bump requirements, fix pyflake8 things, unpin botocore/awscli	2018-11-07 13:39:08 +00:00
Pea Tyczynska	a69dee5e6d	Move code that escapes special chars to helper function and use it in query get_users_by_partial_email	2018-07-13 15:47:21 +01:00
Leo Hemsted	0efa223fb2	rename days_ago to midnight_n_days_ago also add some more timezone boundary tests and minor code cleanup	2018-04-30 11:50:56 +01:00
Leo Hemsted	85fd7c3869	add new tests for template statistics	2018-04-30 11:13:21 +01:00
Leo Hemsted	9e8b6fd00d	refactor template stats endpoint to read from new redis keys New redis keys are partitioned per service per day. New process is as follows: * require a count of days to filter by. Currently admin always gives 7. * for each day, check and see if there's anything in redis. There won't be if either a) redis is/was down or b) the service didn't send any notifications that day - if there isn't, go to the database and get a count out. * combine all these stats together * get the names/template types etc out of the DB at the end.	2018-04-30 11:13:21 +01:00
Leo Hemsted	5e702449cb	move days_ago to utils and make it tz aware it's used in a few places - it should definitely know what timezones are and return datetimes rather than dates, which are hard to work with in terms of figuring out how tz aware they are.	2018-04-30 11:13:21 +01:00
Chris Hill-Scott	c9882e2f9c	Bump utils to improve plain text email formatting Brings in: - [x] https://github.com/alphagov/notifications-utils/pull/438 - [x] https://github.com/alphagov/notifications-utils/pull/450 - [x] https://github.com/alphagov/notifications-utils/pull/454 Changes: - https://github.com/alphagov/notifications-utils/compare/25.3.0...26.2.0	2018-04-10 11:14:48 +01:00
Leo Hemsted	6e554188bd	add command to backfill template usage The command takes a service id and a day, grabs the historical data for that day (potentially out of notification_history), and pops it in redis (for eight days, same as if it were written to manually). also, prefix template usage key with "service" to make clear that it's a service id, and not an individual template id.	2018-04-03 16:14:47 +01:00
Leo Hemsted	8e73961f65	add new redis template usage per day key We've run into issues with redis expiring keys while we try and write to them - short lived redis TTLs aren't really sustainable for keys where we mutate the state. Template usage is a hash contained in redis where we increment a count keyed by template_id each time a message is sent for that template. But if the key expires, hincrby (redis command for incrementing a value in a hash) will re-create an empty hash. This is no good, as we need the hash to be populated with the last seven days worth of data, which we then increment further. We can't tell whether the hincrby created the key, so a different approach entirely was needed: * New redis key: <service_id>-template-usage-<YYYY-MM-DD>. Note: This YYYY-MM-DD is BST time so it lines up nicely with ft_billing table * Incremented to from process_notification - if it doesn't exist yet, it'll be created then. * Expiry set to 8 days every time it's incremented to. Then, at read time, we'll just read the last eight days of keys from Redis, and sum them up. This works because we're only ever incrementing from that one place - never setting wholesale, never recreating the data from scratch. So we know that if the data is in redis, then it is good and accurate data. One thing we don't know and cannot reason about is what no key in redis means. It could be either of: * This is the first message that the service has sent today. * The key was deleted from redis for some reason. Since we set the TTL to so long, we'll never be writing to a key that previously expired. But if there is a redis (or operator) error and the key is deleted, then we'll have bad data - after any data loss we'll have to rebuild the data.	2018-04-03 16:12:54 +01:00
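
The per-day template usage keys described in commit 8e73961f65 can be illustrated with a minimal sketch. This is not the repository's code: the function names (`usage_key`, `record_send`, `usage_for_last_n_days`) and the in-memory `store` are hypothetical stand-ins, a plain dictionary takes the place of Redis hashes, and the 8-day expiry is not modelled. Only the key layout `<service_id>-template-usage-<YYYY-MM-DD>` and the "increment on send, sum the last eight days on read" idea come from the commit message. Commit 6e554188bd later adds a "service" prefix to the key, which this sketch leaves out.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical in-memory stand-in for Redis: key -> {template_id: count}.
# A real deployment would use Redis hashes with a TTL; none is modelled here.
store = defaultdict(lambda: defaultdict(int))


def usage_key(service_id, day):
    # Key layout taken from the commit message:
    # <service_id>-template-usage-<YYYY-MM-DD>
    return f"{service_id}-template-usage-{day.isoformat()}"


def record_send(service_id, template_id, day):
    # Behaves like HINCRBY: the hash and field are created on first use,
    # then the counter for this template is incremented by one.
    store[usage_key(service_id, day)][template_id] += 1


def usage_for_last_n_days(service_id, today, days=8):
    # Read each per-day key in the window and sum the counts per template.
    # A missing key contributes nothing, whatever the reason it is missing.
    totals = defaultdict(int)
    for offset in range(days):
        key = usage_key(service_id, today - timedelta(days=offset))
        for template_id, count in store.get(key, {}).items():
            totals[template_id] += count
    return dict(totals)
```

Because every write touches only the key for its own day, a read never depends on a long-lived hash that may have expired between increments. As the commit message notes, a missing key remains ambiguous (no sends that day, or the key was lost), and the sketch treats both cases as zero.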

69 Commits