Commit Graph

86 Commits

Author SHA1 Message Date
ccostino
35fe2c3168 Remove SocketIO (#2019)
* Remove SocketIO

We have tried a few times to enable SocketIO in our app and it has been a challenge each step of the way, and now there is a dependency incompatibility it seems.  Given that we are in maintenance mode with the applicaiton currently and hve different plans for the interface, we are removing this dependency.

Signed-off-by: Carlo Costino <carlo.costino@gsa.gov>

* Remove more unused code and address Flake8 warnings

Signed-off-by: Carlo Costino <carlo.costino@gsa.gov>

* Revert changes to fix test

Signed-off-by: Carlo Costino <carlo.costino@gsa.gov>

---------

Signed-off-by: Carlo Costino <carlo.costino@gsa.gov>
2025-10-07 12:33:43 -04:00
Kenneth Kehl
f4ba3cd7bf fix get_job_from_s3 2025-09-30 12:19:47 -07:00
Kenneth Kehl
58a8b51f59 more input checking 2025-06-26 10:35:46 -07:00
alexjanousekGSA
f0fefbef21 Fixed another 30 tests 2025-05-27 19:40:26 -04:00
alexjanousekGSA
898607b40a Removed dependency that is not needed, fixed more tests 2025-05-16 16:23:59 -04:00
Carlo Costino
419d6cee69 Update the flask-socketio config to play more nicely:
* Reverts run commands to what they previously were
* Addresses some outstanding linting/formatting
* Accounts for proper config initialization (CORS, Redis)
* Updates dependencies and pulls in latest changes from main

Signed-off-by: Carlo Costino <carlo.costino@gsa.gov>
2025-04-16 17:43:10 -04:00
Beverly Nguyen
f8726ca6b7 socket io setup 2025-04-09 16:56:42 -07:00
Kenneth Kehl
c6f222695b add more debug to get e2e tests working 2024-09-04 09:33:41 -07:00
Kenneth Kehl
3efec5f7e8 delete commented out code 2024-07-22 14:27:41 -07:00
Kenneth Kehl
6769e7ae45 merge from main 2024-07-16 10:54:46 -07:00
Kenneth Kehl
ad84fc536b remove dead code 2024-06-19 13:00:54 -07:00
Kenneth Kehl
76c34ffae6 Need magic PII-free debugging method for API 2024-06-11 10:34:57 -07:00
Kenneth Kehl
920d2c4539 debug for notify-admin-1588 2024-06-05 12:42:35 -07:00
Kenneth Kehl
905df17f65 remove datetime.utcnow() 2024-05-23 13:59:51 -07:00
Kenneth Kehl
752a13fbd2 replace utcnow with timezone naive utc call 2024-05-23 10:18:02 -07:00
Carlo Costino
99edc88197 Localize notification_utils to the API
This changeset pulls in all of the notification_utils code directly into the API and removes it as an external dependency.  We are doing this to cut down on operational maintenance of the project and will begin removing parts of it no longer needed for the API.

Signed-off-by: Carlo Costino <carlo.costino@gsa.gov>
2024-05-16 10:17:45 -04:00
Kenneth Kehl
97ba070b0e debug staging 2024-03-28 11:19:02 -07:00
Cliff Hill
ab7387acd8 All string "constants" in app.models converted to app.enums.
Signed-off-by: Cliff Hill <Clifford.hill@gsa.gov>
2024-02-28 12:43:33 -05:00
Cliff Hill
985ad27b3e Getting imports right to use app.enums
Signed-off-by: Cliff Hill <Clifford.hill@gsa.gov>
2024-02-28 12:43:32 -05:00
Cliff Hill
3982f061b6 Made enums.py for all the enums to avoid cyclic imports.
Signed-off-by: Cliff Hill <Clifford.hill@gsa.gov>
2024-02-28 12:43:31 -05:00
Cliff Hill
43f18eed6a More changes for enums.
Signed-off-by: Cliff Hill <Clifford.hill@gsa.gov>
2024-02-28 12:41:57 -05:00
Cliff Hill
820ee5a942 Cleaning up a lot of things, getting Enums used everywhere.
Signed-off-by: Cliff Hill <Clifford.hill@gsa.gov>
2024-02-28 12:40:52 -05:00
Kenneth Kehl
1ecb747c6d reformat 2023-08-29 14:54:30 -07:00
Kenneth Kehl
85604e5394 more tests 2023-08-11 11:47:57 -07:00
Kenneth Kehl
15a70460bc code review feedback 2023-06-01 07:07:10 -07:00
Kenneth Kehl
3b0d38ea39 more fix 2023-05-30 12:16:49 -07:00
Kenneth Kehl
08c1ad75c8 notify-260 remove server-side timezone handling 2023-05-10 08:39:50 -07:00
Steven Reilly
ff4190a8eb Remove letters-related code (#175)
This deletes a big ol' chunk of code related to letters. It's not everything—there are still a few things that might be tied to sms/email—but it's the the heart of letters function. SMS and email function should be untouched by this.

Areas affected:

- Things obviously about letters
- PDF tasks, used for precompiling letters
- Virus scanning, used for those PDFs
- FTP, used to send letters to the printer
- Postage stuff
2023-03-02 20:20:31 -05:00
stvnrlly
99de747a36 fix formatting 2022-11-21 11:29:38 -05:00
stvnrlly
c8533ae524 pull timezone from utils for other pytz instances 2022-11-16 16:53:55 -05:00
stvnrlly
e6d30394ba london → local 2022-11-16 14:11:52 -05:00
stvnrlly
b50cb4712f tz utility swap and many test updates 2022-11-10 12:33:25 -05:00
stvnrlly
637fbdb891 broadcast flake8 cleanup 2022-10-25 11:53:24 -04:00
stvnrlly
53204c307b tests are, uh, mostly passing 2022-10-05 01:12:35 +00:00
stvnrlly
57f4df8ed1 remove broadcast-related code, except migrations 2022-10-04 15:28:27 +00:00
Ben Thorner
33645c7747 Use notification view for status / billing tasks
This fixes a bug where (letter) notifications left in sending would
temporarily get excluded from billing and status calculations once
the service retention period had elapsed, and then get included once
again when they finally get marked as delivered.*

Status and billing tasks shouldn't need to have knowledge about which
table their data is in and getting this wrong is the fundamental cause
of the bug here. Adding a view across both tables abstracts this away
while keeping the query complexity the same.

Using a view also has the added benefit that we no longer need to care
when the status / billing tasks run in comparison to the deletion task,
since we will retrieve the same data irrespective (see below for a more
detailed discussion on data integrity).

*Such a scenario is rare but has happened.

A New View
==========

I've included all the columns that are shared between the two tables,
even though only a subset are actually needed. Having extra columns
has no impact and may be useful in future.

Although the view isn't actually a table, SQLAlchemy appears to wrap
it without any issues, noting that the package doesn't have any direct
support for "view models". Because we're never inserting data, we don't
need most of the kwargs when defining columns.*

*Note that the "default" kwarg doesn't affect data that's retrieved,
only data that's written (if no value is set).

Data Integrity
==============

The (new) tests cover the main scenarios.

We need to be careful with how the view interacts with the deletion /
archiving task. There are two concerns here:

- Duplicates. The deletion task inserts before it deletes [^1], so we
could end up double counting. It turns out this isn't a problem because
a Postgres UNION is an implicit "DISTINCT" [^2]. I've also verified this
manually, just to be on the safe side.

- No data. It's conceivable that the query will check the history table
just before the insertion, then check the notifications table just after
the deletion. It turns out this isn't a problem either because the whole
query sees the same DB snapshot [^3][^4].*

*I can't think of a way to test this as it's a race condition, but I'm
confident the Postgres docs are accurate.

Performance
===========

I copied the relevant (non-PII) columns from Production for data going
back to 2022-04-01. I then ran several tests.

Queries using the new view still make use of indices on a per-table basis,
as the following query plan illustrates:

                                                                                          QUERY PLAN
      ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
       GroupAggregate  (cost=1130820.02..1135353.89 rows=46502 width=97) (actual time=629.863..756.703 rows=72 loops=1)
         Group Key: notifications_all_time_view.template_id, notifications_all_time_view.sent_by, notifications_all_time_view.rate_multiplier, notifications_all_time_view.international
         ->  Sort  (cost=1130820.02..1131401.28 rows=232506 width=85) (actual time=629.756..708.914 rows=217563 loops=1)
               Sort Key: notifications_all_time_view.template_id, notifications_all_time_view.sent_by, notifications_all_time_view.rate_multiplier, notifications_all_time_view.international
               Sort Method: external merge  Disk: 9320kB
               ->  Subquery Scan on notifications_all_time_view  (cost=1088506.43..1098969.20 rows=232506 width=85) (actual time=416.118..541.669 rows=217563 loops=1)
                     ->  Unique  (cost=1088506.43..1096644.14 rows=232506 width=725) (actual time=416.115..513.065 rows=217563 loops=1)
                           ->  Sort  (cost=1088506.43..1089087.70 rows=232506 width=725) (actual time=416.115..451.190 rows=217563 loops=1)
                                 Sort Key: notifications_no_pii.id, notifications_no_pii.job_id, notifications_no_pii.service_id, notifications_no_pii.template_id, notifications_no_pii.key_type, notifications_no_pii.billable_units, notifications_no_pii.notification_type, notifications_no_pii.created_at, notifications_no_pii.sent_by, notifications_no_pii.notification_status, notifications_no_pii.international, notifications_no_pii.rate_multiplier, notifications_no_pii.postage
                                 Sort Method: external merge  Disk: 23936kB
                                 ->  Append  (cost=114.42..918374.12 rows=232506 width=725) (actual time=2.051..298.229 rows=217563 loops=1)
                                       ->  Bitmap Heap Scan on notifications_no_pii  (cost=114.42..8557.55 rows=2042 width=113) (actual time=1.405..1.442 rows=0 loops=1)
                                             Recheck Cond: ((service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'::uuid) AND (notification_type = 'sms'::notification_type) AND (notification_status = ANY ('{sending,sent,delivered,pending,temporary-failure,permanent-failure}'::text[])) AND (created_at >= '2022-05-01 23:00:00'::timestamp without time zone) AND (created_at < '2022-05-02 23:00:00'::timestamp without time zone))
                                             Filter: ((key_type)::text = ANY ('{normal,team}'::text[]))
                                             ->  Bitmap Index Scan on ix_notifications_no_piiservice_id_composite  (cost=0.00..113.91 rows=2202 width=0) (actual time=1.402..1.439 rows=0 loops=1)
                                                   Index Cond: ((service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'::uuid) AND (notification_type = 'sms'::notification_type) AND (notification_status = ANY ('{sending,sent,delivered,pending,temporary-failure,permanent-failure}'::text[])) AND (created_at >= '2022-05-01 23:00:00'::timestamp without time zone) AND (created_at < '2022-05-02 23:00:00'::timestamp without time zone))
                                       ->  Index Scan using ix_notifications_history_no_pii_service_id_composite on notifications_history_no_pii  (cost=0.70..906328.97 rows=230464 width=113) (actual time=0.645..281.612 rows=217563 loops=1)
                                             Index Cond: ((service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'::uuid) AND ((key_type)::text = ANY ('{normal,team}'::text[])) AND (notification_type = 'sms'::notification_type) AND (created_at >= '2022-05-01 23:00:00'::timestamp without time zone) AND (created_at < '2022-05-02 23:00:00'::timestamp without time zone))
                                             Filter: (notification_status = ANY ('{sending,sent,delivered,pending,temporary-failure,permanent-failure}'::text[]))
       Planning Time: 18.032 ms
       Execution Time: 759.001 ms
      (21 rows)

Queries using the new view appear to be slower than without, but the
differences I've seen are minimal: the original queries execute in
seconds locally and in Production, so it's not a big issue.

Notes: Performance
==================

I downloaded a minimal set of columns for testing:

      \copy (
        select
          id, notification_type, key_type, created_at, service_id,
          template_id, sent_by, rate_multiplier, international,
          billable_units, postage, job_id, notification_status
        from notifications
      ) to 'notifications.csv' delimiter ',' csv header;

      CREATE TABLE notifications_no_pii (
          id uuid NOT NULL,
          notification_type public.notification_type NOT NULL,
          key_type character varying(255) NOT NULL,
          created_at timestamp without time zone NOT NULL,
          service_id uuid,
          template_id uuid,
          sent_by character varying,
          rate_multiplier numeric,
          international boolean,
          billable_units integer NOT NULL,
          postage character varying,
          job_id uuid,
          notification_status text
      );

      copy notifications_no_pii	 from '/Users/ben.thorner/Desktop/notifications.csv' delimiter ',' csv header;

      CREATE INDEX ix_notifications_no_piicreated_at ON notifications_no_pii USING btree (created_at);
      CREATE INDEX ix_notifications_no_piijob_id ON notifications_no_pii USING btree (job_id);
      CREATE INDEX ix_notifications_no_piinotification_type_composite ON notifications_no_pii USING btree (notification_type, notification_status, created_at);
      CREATE INDEX ix_notifications_no_piiservice_created_at ON notifications_no_pii USING btree (service_id, created_at);
      CREATE INDEX ix_notifications_no_piiservice_id_composite ON notifications_no_pii USING btree (service_id, notification_type, notification_status, created_at);
      CREATE INDEX ix_notifications_no_piitemplate_id ON notifications_no_pii USING btree (template_id);

And similarly for the history table. I then created a sepatate view
across both of these temporary tables using just these columns.

To test performance I created some queries that reflect what is run
by the billing [^5] and status [^6] tasks e.g.

      explain analyze select template_id, sent_by, rate_multiplier, international, sum(billable_units), count(*)
      from notifications_all_time_view
      where
      notification_status in ('sending', 'sent', 'delivered', 'pending', 'temporary-failure', 'permanent-failure')
      and key_type in ('normal', 'team')
      and created_at >= '2022-05-01 23:00'
      and created_at < '2022-05-02 23:00'
      and notification_type = 'sms'
      and service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'
      group by 1,2,3,4;

      explain analyze select template_id, job_id, key_type, notification_status, count(*)
      from notifications_all_time_view
      where created_at >= '2022-05-01 23:00'
      and created_at < '2022-05-02 23:00'
      and notification_type = 'sms'
      and service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'
      and key_type in ('normal', 'team')
      group by 1,2,3,4;

Between running queries I restarted my local database and also ran
a command to purge disk caches [^7].

I tested on a few services:

- c5956607-20b1-48b4-8983-85d11404e61f on 2022-05-02 (high volume)
- 0cc696c6-b792-409d-99e9-64232f461b0f on 2022-04-06 (highest volume)
- 01135db6-7819-4121-8b97-4aa2d741e372 on 2022-04-14 (very low volume)

All execution results are of the same magnitude using the view compared
to the worst case of either table on its own.

[^1]: 00a04ebf54/app/dao/notifications_dao.py (L389)
[^2]: https://stackoverflow.com/questions/49925/what-is-the-difference-between-union-and-union-all
[^3]: https://www.postgresql.org/docs/current/transaction-iso.html
[^4]: https://dba.stackexchange.com/questions/210485/can-sub-selects-change-in-one-single-query-in-a-read-committed-transaction
[^5]: 00a04ebf54/app/dao/fact_billing_dao.py (L471)
[^6]: 00a04ebf54/app/dao/fact_notification_status_dao.py (L58)
[^7]: https://stackoverflow.com/questions/28845524/echo-3-proc-sys-vm-drop-caches-on-mac-osx
2022-05-19 15:14:32 +01:00
Ben Thorner
6e8f121548 Standardise how we query midnight-to-midnight
Partially addresses [1] (lots more detail to read in the comment).
I've also added some tests for the status DAO function to confirm
it behaves as expected across timezones.

[1]: https://github.com/alphagov/notifications-api/pull/3437#discussion_r802634913
2022-02-10 10:51:27 +00:00
David McDonald
ec6ed3958c Move get_prev_next_pagination_links to utils
This will mean it can later be reused whereever we want
2021-12-10 12:26:57 +00:00
Pea Tyczynska
52c529ab3a Use personalisation to set client_reference for letters
which were sent through Notify interface only. This is done
to avoid performance dip from additional operation for
other notification types.
2021-03-24 14:55:10 +00:00
Ben Thorner
a91fde2fda Run auto-correct on app/ and tests/ 2021-03-12 11:45:45 +00:00
Katie Smith
6b8ebb3421 Fix linting errors 2021-02-16 09:03:38 +00:00
Chris Hill-Scott
78e87857e3 Don’t serialize nullable UUID columns to 'None'
We should return a proper `None` instead, so it gets JSONified as `null`
and returns what you’d expect when doing `bool(model.field)`
2021-01-15 13:15:00 +00:00
Pea Tyczynska
95deb5a52f Move DATETIME_FORMAT from app to app.utils
To avoid cyclical import issues
2020-12-18 17:39:35 +00:00
Pea Tyczynska
a186d2d296 Format sequential number into an 8 char long hex
As per Vodafone spec for ibag format message number
2020-12-07 13:13:11 +00:00
Leo Hemsted
3dd15841a5 create get_dt_string_or_none string
mostly to quash flakle8 warnings
2020-07-28 12:10:18 +01:00
Leo Hemsted
7ecd7341b0 add broadcast to template_types and add broadcast_data
had to go through the code and change a few places where we filter on
template types. i specifically didn't worry about jobs or notifications.

Also, add braodcast_data - a json column that might contain arbitrary
broadcast data that we'll figure out as we go. We don't know what it'll
look like, but it should be returned by the API
2020-07-06 15:47:13 +01:00
Katie Smith
13f7fecd5b Move function to get archived email address value
This function will be used when archiving services too, so it has been
renamed and moved to `app/utils.py`.
2020-05-22 09:36:07 +01:00
Chris Hill-Scott
5ddb5a75da Use new properties of utils Templates
We’ve added some new properties to the templates in utils that we can
use instead of doing weird things like
`WithSubjectTemplate.__str__(another_instance)`
2020-04-15 16:40:42 +01:00
Leo Hemsted
d457db4164 make has_delete_task_run non-optional
just to ensure people think about the value of it when using the function
2019-12-03 14:19:14 +00:00
Leo Hemsted
913cf5e12d work out which table to get notification status data from
previously we checked notifications table, and if the results were
zero, checked the notification history table to see if there's data
in there. When we know that data isn't in notifications, we're still
checking. These queries take half a second per service, and we're
doing at least ten for each of the five thousand services we have in
notify. Most of these services have no data in either table for any
given day, and we can reduce the amount of queries we do by only
checking one table.

Check the data retention for a service, and then if the date is older
than the retention, get from history table.

NOTE: This requires that the delete tasks haven't run yet for the day!
If your retention is three days, this will look in the Notification
table for data from three days ago - expecting that shortly after the
task finishes, we'll delete that data.
2019-11-29 15:27:56 +00:00