Commit Graph

4309 Commits

Author SHA1 Message Date
Ryan Ahearn
ac7ad61e10 Replace old notifications-api service prefix with notify-api 2022-11-07 12:10:49 -05:00
Ryan Ahearn
e576bbc7e2 Merge pull request #104 from GSA/fix-deployed-db-string
Fix deployed db string
2022-11-01 13:02:16 -04:00
stvnrlly
f2f70f42a6 Merge branch 'main' into stvnrlly-create-user-command 2022-11-01 12:04:01 -04:00
Ryan Ahearn
b4256d0a6c Properly set database connection string in cloud.gov 2022-11-01 11:34:00 -04:00
Ryan Ahearn
f7dce28546 Use safe env getter for test verification 2022-10-31 16:32:20 -04:00
Ryan Ahearn
41a52daca0 Clean up bucket settings 2022-10-31 15:37:12 -04:00
stvnrlly
11d123051a validate mobile number so that sms auth works 2022-10-28 14:42:25 -04:00
stvnrlly
19cdd9b052 tests & prompts for user creation command 2022-10-28 14:07:43 -04:00
stvnrlly
637fbdb891 broadcast flake8 cleanup 2022-10-25 11:53:24 -04:00
Steven Reilly
d37c2a53b8 Merge branch 'main' into stvnrlly-remove-broadcasts 2022-10-25 10:17:49 -04:00
stvnrlly
8e2b8dd7c4 keep on flakin the flake world 2022-10-21 13:29:52 +00:00
stvnrlly
9f37592b1e cleaner flake8 cleaning 2022-10-21 00:26:37 +00:00
stvnrlly
d4e156e8ae Merge branch 'main' into stvnrlly-remove-broadcasts 2022-10-20 19:44:20 -04:00
stvnrlly
5dfc26c1f5 pass pytest multiline preferences 2022-10-19 16:16:29 +00:00
stvnrlly
2d947c8d33 flake8 post-isort 2022-10-19 16:16:29 +00:00
stvnrlly
7fb471a10c test tweaks 2022-10-19 16:16:27 +00:00
stvnrlly
55adb3e035 more flake8 cleanup 2022-10-19 16:16:26 +00:00
stvnrlly
e9fdfd59f4 clean flake8 except provider code 2022-10-19 16:16:26 +00:00
Ryan Ahearn
b7e2dfa7e3 Remove unused scripts files 2022-10-18 11:54:54 -04:00
stvnrlly
b0ed88e7a3 update tests 2022-10-12 16:39:17 +00:00
stvnrlly
0186095920 swap out uk org types for us-specific org types 2022-10-11 20:27:49 +00:00
stvnrlly
53204c307b tests are, uh, mostly passing 2022-10-05 01:12:35 +00:00
jimmoffet
434b7b2d08 clean up and remove redundancy 2022-10-04 16:01:30 -07:00
stvnrlly
57f4df8ed1 remove broadcast-related code, except migrations 2022-10-04 15:28:27 +00:00
jimmoffet
fc9e4107c1 all tests passing 2022-10-03 20:07:42 -07:00
jimmoffet
c04d1df6b3 fixing tests 2022-10-03 17:16:59 -07:00
Jim Moffet
d0bba8a8bd Merge branch 'main' into jim/091422/deliverycallbacks 2022-09-30 11:21:46 -04:00
jimmoffet
48af6f7c23 fix tests 2022-09-30 10:59:48 -04:00
Ryan Ahearn
d37b8b841e Update redis url to use rediss protocol 2022-09-29 10:41:48 -04:00
Ryan Ahearn
e3ad01119d Replace celery[sqs] with celery[redis] 2022-09-29 08:59:17 -04:00
Ryan Ahearn
538d2cbe4c Proactively specify aws region for s3 operations 2022-09-26 10:56:59 -04:00
Ryan Ahearn
8ede076708 Use correct access credentials for each bucket 2022-09-22 12:14:25 -04:00
Ryan Ahearn
e9815a6f8e Create s3 buckets via terraform and bind to app 2022-09-21 11:22:55 -04:00
Ryan Ahearn
cb4036b1b0 Disable letter-based S3 buckets 2022-09-21 11:22:55 -04:00
jimmoffet
a03de0dd56 remove outdated validatesns library and replace with maintainable code 2022-09-20 20:11:09 -07:00
jimmoffet
f1aec54665 clean up comments and method dupes 2022-09-15 15:48:37 -07:00
jimmoffet
b0f819dbd9 canada UK ses callbacks monster mash 2022-09-15 14:59:13 -07:00
jimmoffet
6d0fd97b3e skip two failing redis tests 2022-08-02 16:55:21 -07:00
Christa Hartsock
c4cdaed683 Skip tests that fail because of timezone handling 2022-07-07 15:41:16 -07:00
Christa Hartsock
041a892e86 Pull admin base url from test config in tests 2022-07-07 15:41:16 -07:00
Christa Hartsock
af6495cd4c Get tests passing locally
When we cloned the repository and started making modifications, we
didn't initially keep tests in step. This commit tries to get us to a
clean test run by skipping tests that are failing and removing some
that we no longer expect to use (MMG, Firetext), with the intention that
we will come back in future and update or remove them as appropriate.

To find all tests skipped, search for `@pytest.mark.skip(reason="Needs
updating for TTS:`. There will be a brief description of the work that
needs to be done to get them passing, if known. Delete that line to make
them run in a standard test run (`make test`).
2022-07-07 15:41:15 -07:00
Jim Moffet
aa4ec532a4 implement SNS 2022-06-17 11:16:23 -07:00
Ben Thorner
ee8e86f409 Bump utils to version 56.0.0
The only impactful change is the major version itself, where I've
fixed the breaking changes due to the upgrade of PyPDF2 [^1] and
checked there are no deprecation warnings when I run the tests.

[^1]: https://github.com/alphagov/notifications-utils/pull/973
2022-06-01 14:27:25 +01:00
Ben Thorner
43dbc0891f Merge pull request #3546 from alphagov/notification-view-178125825
Use notification view for status / billing tasks
2022-05-26 11:03:38 +01:00
Ben Thorner
aa20064f3f Merge pull request #3545 from alphagov/remove-unused-function
Remove redundant DAO function / consolidate tests
2022-05-26 11:03:30 +01:00
Katie Smith
8e7f2615a9 Fix test assertion
This test was calling `.load` on model objects, when it should have
been calling `.dump`. This was not working as expected before the
marshmallow upgrade either - the objects returned were errors and not
template versions.
2022-05-25 11:35:44 +01:00
Katie Smith
21c943484d Change test that was failing due to new Marshmallow behaviour
Boolean fields in marshmallow have various values that get changed to
True or False. The value 'Yes' now gets changed to True,  which was
causing a test to start failing. We could change the schemas to stop
'Yes' from being changed to True, but the data for boolean fields comes
from admin, where it is only allowed to have certain values anyway so
this just fixes the test.
2022-05-25 11:35:44 +01:00
Katie Smith
8ae2b0bb31 Replace how .dump is called
As with `.load`, only data is now returned instead of a tuple.
2022-05-25 11:35:44 +01:00
Katie Smith
bd4f74b359 Replace how .load is called
https://marshmallow.readthedocs.io/en/stable/upgrading.html#schemas-are-always-strict

`.load` doesn't return a `(data, errors)` tuple any more - only data is
returned. A `ValidationError` is raised if validation fails. The code
now relies on the `marshmallow_validation_error` error handler to handle
errors instead of having to raise an `InvalidRequest`. This has no
effect on the response that is returned (a test has been modified to
check).

Also added a new `password` field to the `UserSchema` so that we don't
have to specially check for password errors in the `.create_user` endpoint
- we can let marshmallow handle them.
2022-05-25 11:35:44 +01:00
Ben Thorner
33645c7747 Use notification view for status / billing tasks
This fixes a bug where (letter) notifications left in sending would
temporarily get excluded from billing and status calculations once
the service retention period had elapsed, and then get included once
again when they finally get marked as delivered.*

Status and billing tasks shouldn't need to have knowledge about which
table their data is in and getting this wrong is the fundamental cause
of the bug here. Adding a view across both tables abstracts this away
while keeping the query complexity the same.

Using a view also has the added benefit that we no longer need to care
when the status / billing tasks run in comparison to the deletion task,
since we will retrieve the same data irrespective (see below for a more
detailed discussion on data integrity).

*Such a scenario is rare but has happened.

A New View
==========

I've included all the columns that are shared between the two tables,
even though only a subset are actually needed. Having extra columns
has no impact and may be useful in future.

Although the view isn't actually a table, SQLAlchemy appears to wrap
it without any issues, noting that the package doesn't have any direct
support for "view models". Because we're never inserting data, we don't
need most of the kwargs when defining columns.*

*Note that the "default" kwarg doesn't affect data that's retrieved,
only data that's written (if no value is set).

Data Integrity
==============

The (new) tests cover the main scenarios.

We need to be careful with how the view interacts with the deletion /
archiving task. There are two concerns here:

- Duplicates. The deletion task inserts before it deletes [^1], so we
could end up double counting. It turns out this isn't a problem because
a Postgres UNION is an implicit "DISTINCT" [^2]. I've also verified this
manually, just to be on the safe side.

- No data. It's conceivable that the query will check the history table
just before the insertion, then check the notifications table just after
the deletion. It turns out this isn't a problem either because the whole
query sees the same DB snapshot [^3][^4].*

*I can't think of a way to test this as it's a race condition, but I'm
confident the Postgres docs are accurate.

Performance
===========

I copied the relevant (non-PII) columns from Production for data going
back to 2022-04-01. I then ran several tests.

Queries using the new view still make use of indices on a per-table basis,
as the following query plan illustrates:

                                                                                          QUERY PLAN
      ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
       GroupAggregate  (cost=1130820.02..1135353.89 rows=46502 width=97) (actual time=629.863..756.703 rows=72 loops=1)
         Group Key: notifications_all_time_view.template_id, notifications_all_time_view.sent_by, notifications_all_time_view.rate_multiplier, notifications_all_time_view.international
         ->  Sort  (cost=1130820.02..1131401.28 rows=232506 width=85) (actual time=629.756..708.914 rows=217563 loops=1)
               Sort Key: notifications_all_time_view.template_id, notifications_all_time_view.sent_by, notifications_all_time_view.rate_multiplier, notifications_all_time_view.international
               Sort Method: external merge  Disk: 9320kB
               ->  Subquery Scan on notifications_all_time_view  (cost=1088506.43..1098969.20 rows=232506 width=85) (actual time=416.118..541.669 rows=217563 loops=1)
                     ->  Unique  (cost=1088506.43..1096644.14 rows=232506 width=725) (actual time=416.115..513.065 rows=217563 loops=1)
                           ->  Sort  (cost=1088506.43..1089087.70 rows=232506 width=725) (actual time=416.115..451.190 rows=217563 loops=1)
                                 Sort Key: notifications_no_pii.id, notifications_no_pii.job_id, notifications_no_pii.service_id, notifications_no_pii.template_id, notifications_no_pii.key_type, notifications_no_pii.billable_units, notifications_no_pii.notification_type, notifications_no_pii.created_at, notifications_no_pii.sent_by, notifications_no_pii.notification_status, notifications_no_pii.international, notifications_no_pii.rate_multiplier, notifications_no_pii.postage
                                 Sort Method: external merge  Disk: 23936kB
                                 ->  Append  (cost=114.42..918374.12 rows=232506 width=725) (actual time=2.051..298.229 rows=217563 loops=1)
                                       ->  Bitmap Heap Scan on notifications_no_pii  (cost=114.42..8557.55 rows=2042 width=113) (actual time=1.405..1.442 rows=0 loops=1)
                                             Recheck Cond: ((service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'::uuid) AND (notification_type = 'sms'::notification_type) AND (notification_status = ANY ('{sending,sent,delivered,pending,temporary-failure,permanent-failure}'::text[])) AND (created_at >= '2022-05-01 23:00:00'::timestamp without time zone) AND (created_at < '2022-05-02 23:00:00'::timestamp without time zone))
                                             Filter: ((key_type)::text = ANY ('{normal,team}'::text[]))
                                             ->  Bitmap Index Scan on ix_notifications_no_piiservice_id_composite  (cost=0.00..113.91 rows=2202 width=0) (actual time=1.402..1.439 rows=0 loops=1)
                                                   Index Cond: ((service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'::uuid) AND (notification_type = 'sms'::notification_type) AND (notification_status = ANY ('{sending,sent,delivered,pending,temporary-failure,permanent-failure}'::text[])) AND (created_at >= '2022-05-01 23:00:00'::timestamp without time zone) AND (created_at < '2022-05-02 23:00:00'::timestamp without time zone))
                                       ->  Index Scan using ix_notifications_history_no_pii_service_id_composite on notifications_history_no_pii  (cost=0.70..906328.97 rows=230464 width=113) (actual time=0.645..281.612 rows=217563 loops=1)
                                             Index Cond: ((service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'::uuid) AND ((key_type)::text = ANY ('{normal,team}'::text[])) AND (notification_type = 'sms'::notification_type) AND (created_at >= '2022-05-01 23:00:00'::timestamp without time zone) AND (created_at < '2022-05-02 23:00:00'::timestamp without time zone))
                                             Filter: (notification_status = ANY ('{sending,sent,delivered,pending,temporary-failure,permanent-failure}'::text[]))
       Planning Time: 18.032 ms
       Execution Time: 759.001 ms
      (21 rows)

Queries using the new view appear to be slower than without, but the
differences I've seen are minimal: the original queries execute in
seconds locally and in Production, so it's not a big issue.

Notes: Performance
==================

I downloaded a minimal set of columns for testing:

      \copy (
        select
          id, notification_type, key_type, created_at, service_id,
          template_id, sent_by, rate_multiplier, international,
          billable_units, postage, job_id, notification_status
        from notifications
      ) to 'notifications.csv' delimiter ',' csv header;

      CREATE TABLE notifications_no_pii (
          id uuid NOT NULL,
          notification_type public.notification_type NOT NULL,
          key_type character varying(255) NOT NULL,
          created_at timestamp without time zone NOT NULL,
          service_id uuid,
          template_id uuid,
          sent_by character varying,
          rate_multiplier numeric,
          international boolean,
          billable_units integer NOT NULL,
          postage character varying,
          job_id uuid,
          notification_status text
      );

      copy notifications_no_pii	 from '/Users/ben.thorner/Desktop/notifications.csv' delimiter ',' csv header;

      CREATE INDEX ix_notifications_no_piicreated_at ON notifications_no_pii USING btree (created_at);
      CREATE INDEX ix_notifications_no_piijob_id ON notifications_no_pii USING btree (job_id);
      CREATE INDEX ix_notifications_no_piinotification_type_composite ON notifications_no_pii USING btree (notification_type, notification_status, created_at);
      CREATE INDEX ix_notifications_no_piiservice_created_at ON notifications_no_pii USING btree (service_id, created_at);
      CREATE INDEX ix_notifications_no_piiservice_id_composite ON notifications_no_pii USING btree (service_id, notification_type, notification_status, created_at);
      CREATE INDEX ix_notifications_no_piitemplate_id ON notifications_no_pii USING btree (template_id);

And similarly for the history table. I then created a sepatate view
across both of these temporary tables using just these columns.

To test performance I created some queries that reflect what is run
by the billing [^5] and status [^6] tasks e.g.

      explain analyze select template_id, sent_by, rate_multiplier, international, sum(billable_units), count(*)
      from notifications_all_time_view
      where
      notification_status in ('sending', 'sent', 'delivered', 'pending', 'temporary-failure', 'permanent-failure')
      and key_type in ('normal', 'team')
      and created_at >= '2022-05-01 23:00'
      and created_at < '2022-05-02 23:00'
      and notification_type = 'sms'
      and service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'
      group by 1,2,3,4;

      explain analyze select template_id, job_id, key_type, notification_status, count(*)
      from notifications_all_time_view
      where created_at >= '2022-05-01 23:00'
      and created_at < '2022-05-02 23:00'
      and notification_type = 'sms'
      and service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'
      and key_type in ('normal', 'team')
      group by 1,2,3,4;

Between running queries I restarted my local database and also ran
a command to purge disk caches [^7].

I tested on a few services:

- c5956607-20b1-48b4-8983-85d11404e61f on 2022-05-02 (high volume)
- 0cc696c6-b792-409d-99e9-64232f461b0f on 2022-04-06 (highest volume)
- 01135db6-7819-4121-8b97-4aa2d741e372 on 2022-04-14 (very low volume)

All execution results are of the same magnitude using the view compared
to the worst case of either table on its own.

[^1]: 00a04ebf54/app/dao/notifications_dao.py (L389)
[^2]: https://stackoverflow.com/questions/49925/what-is-the-difference-between-union-and-union-all
[^3]: https://www.postgresql.org/docs/current/transaction-iso.html
[^4]: https://dba.stackexchange.com/questions/210485/can-sub-selects-change-in-one-single-query-in-a-read-committed-transaction
[^5]: 00a04ebf54/app/dao/fact_billing_dao.py (L471)
[^6]: 00a04ebf54/app/dao/fact_notification_status_dao.py (L58)
[^7]: https://stackoverflow.com/questions/28845524/echo-3-proc-sys-vm-drop-caches-on-mac-osx
2022-05-19 15:14:32 +01:00