Commit Graph

1536 Commits

Author SHA1 Message Date
Kenneth Kehl
8f5f9f8f59 Merge branch 'main' of https://github.com/GSA/notifications-api into notify-233b 2023-05-26 13:13:13 -07:00
Kenneth Kehl
6f6061455c notify-162 delete incomplete s3 uploads (#276)
Co-authored-by: Kenneth Kehl <@kkehl@flexion.us>
2023-05-23 11:31:30 -04:00
Kenneth Kehl
359ac9d967 merge from main 2023-05-10 09:58:03 -07:00
Kenneth Kehl
08c1ad75c8 notify-260 remove server-side timezone handling 2023-05-10 08:39:50 -07:00
Kenneth Kehl
3fb113a83e notify-152 sms delivery receipts 2023-05-04 07:56:24 -07:00
Kenneth Kehl
6e3d3f325d notify-233: delete notifications from notifications table after they are successfully sent 2023-04-18 12:42:23 -07:00
Ryan Ahearn
e07b596857 Remove contact list db, dao, and s3 code 2023-04-12 15:01:24 -04:00
Kenneth Kehl
27d86c949a #224 remove crown (#228)
Co-authored-by: Kenneth Kehl <@kkehl@flexion.us>
2023-04-11 16:29:37 -04:00
Steven Reilly
ff4190a8eb Remove letters-related code (#175)
This deletes a big ol' chunk of code related to letters. It's not everything (a few remaining pieces may still be tied to SMS/email), but it is the heart of the letters functionality. SMS and email functionality should be untouched by this.

Areas affected:

- Things obviously about letters
- PDF tasks, used for precompiling letters
- Virus scanning, used for those PDFs
- FTP, used to send letters to the printer
- Postage stuff
2023-03-02 20:20:31 -05:00
Ryan Ahearn
71010e78d8 Fix formatting for secret code to ensure 0 padding no matter the passed length 2023-02-22 10:48:15 -05:00
Ryan Ahearn
e26bc5095c Use cryptographically secure random number for sms codes
Also, increase token length to 6 digits
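
A minimal sketch of the idea, assuming Python's standard `secrets` module (the helper name and exact format are hypothetical, not necessarily what this commit implements):

      import secrets

      def create_secret_code(length: int = 6) -> str:
          # Hypothetical helper: secrets.randbelow gives a cryptographically
          # secure value, and zfill keeps leading zeros so the code always has
          # the requested number of digits (e.g. "042917").
          return str(secrets.randbelow(10 ** length)).zfill(length)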
2023-02-17 11:54:17 -05:00
Ryan Ahearn
041cd08097 Clean up more mmg and firetext references 2022-12-22 09:31:12 -05:00
Ryan Ahearn
45c3e3c277 Remove unused is_delivery_slow_for_providers method 2022-11-30 13:50:49 -05:00
stvnrlly
9e7ee1c0f8 migrate bst_date to local_date 2022-11-21 11:49:59 -05:00
stvnrlly
99de747a36 fix formatting 2022-11-21 11:29:38 -05:00
stvnrlly
c8533ae524 pull timezone from utils for other pytz instances 2022-11-16 16:53:55 -05:00
stvnrlly
e6d30394ba london → local 2022-11-16 14:11:52 -05:00
stvnrlly
213f699c99 time adjustments in tests 2022-11-14 14:23:54 -05:00
stvnrlly
b50cb4712f tz utility swap and many test updates 2022-11-10 12:33:25 -05:00
Steven Reilly
d37c2a53b8 Merge branch 'main' into stvnrlly-remove-broadcasts 2022-10-25 10:17:49 -04:00
stvnrlly
9f37592b1e cleaner flake8 cleaning 2022-10-21 00:26:37 +00:00
stvnrlly
d4e156e8ae Merge branch 'main' into stvnrlly-remove-broadcasts 2022-10-20 19:44:20 -04:00
stvnrlly
f5b5ecb661 flake8 fixes for rebased commits 2022-10-19 16:23:34 +00:00
stvnrlly
e9fdfd59f4 clean flake8 except provider code 2022-10-19 16:16:26 +00:00
stvnrlly
0186095920 swap out uk org types for us-specific org types 2022-10-11 20:27:49 +00:00
stvnrlly
57f4df8ed1 remove broadcast-related code, except migrations 2022-10-04 15:28:27 +00:00
jimmoffet
f1aec54665 clean up comments and method dupes 2022-09-15 15:48:37 -07:00
jimmoffet
b0f819dbd9 canada UK ses callbacks monster mash 2022-09-15 14:59:13 -07:00
Christa Hartsock
64b30feb08 Remove pytest from non-test file 2022-07-07 16:22:21 -07:00
Christa Hartsock
af6495cd4c Get tests passing locally
When we cloned the repository and started making modifications, we
didn't initially keep tests in step. This commit tries to get us to a
clean test run by skipping tests that are failing and removing some
that we no longer expect to use (MMG, Firetext), with the intention that
we will come back in future and update or remove them as appropriate.

To find all tests skipped, search for `@pytest.mark.skip(reason="Needs
updating for TTS:`. There will be a brief description of the work that
needs to be done to get them passing, if known. Delete that line to make
them run in a standard test run (`make test`).
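
For illustration, a skipped test in that style might look like the following (the test name and the remainder of the reason string are hypothetical):

      import pytest

      @pytest.mark.skip(reason="Needs updating for TTS: replace MMG/Firetext fixtures")
      def test_deliver_sms_uses_configured_provider():
          ...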
2022-07-07 15:41:15 -07:00
Jim Moffet
aa4ec532a4 implement SNS 2022-06-17 11:16:23 -07:00
Ben Thorner
43dbc0891f Merge pull request #3546 from alphagov/notification-view-178125825
Use notification view for status / billing tasks
2022-05-26 11:03:38 +01:00
Ben Thorner
aa20064f3f Merge pull request #3545 from alphagov/remove-unused-function
Remove redundant DAO function / consolidate tests
2022-05-26 11:03:30 +01:00
Ben Thorner
8e837cf681 Use table class directly instead of "table" var
In response to [^1].

[^1]: https://github.com/alphagov/notifications-api/pull/3546#discussion_r879541366
2022-05-24 10:16:28 +01:00
Ben Thorner
33645c7747 Use notification view for status / billing tasks
This fixes a bug where (letter) notifications left in sending would
temporarily get excluded from billing and status calculations once
the service retention period had elapsed, and then get included again
once they were finally marked as delivered.*

Status and billing tasks shouldn't need to know which table their
data is in; getting this wrong is the fundamental cause of the bug
here. Adding a view across both tables abstracts this away while
keeping the query complexity the same.

Using a view also has the added benefit that we no longer need to care
when the status / billing tasks run relative to the deletion task,
since we will retrieve the same data either way (see below for a more
detailed discussion of data integrity).

*Such a scenario is rare but has happened.

A New View
==========

I've included all the columns that are shared between the two tables,
even though only a subset are actually needed. Having extra columns
has no impact and may be useful in future.

Although the view isn't actually a table, SQLAlchemy appears to wrap
it without any issues, even though the package has no direct support
for "view models". Because we're never inserting data, we don't
need most of the kwargs when defining columns.*

*Note that the "default" kwarg doesn't affect data that's retrieved,
only data that's written (if no value is set).
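
As a rough, illustrative sketch of the "view model" idea (the class name, column list and types below are assumptions, not the actual definition):

      # Sketch only: a read-only SQLAlchemy model mapped onto the view.
      from sqlalchemy import Column, DateTime, Integer, Text
      from sqlalchemy.dialects.postgresql import UUID
      from sqlalchemy.orm import declarative_base

      Base = declarative_base()

      class NotificationAllTimeView(Base):
          # __tablename__ points at the view rather than a real table; rows are
          # only ever read, so insert-oriented kwargs (default, onupdate, ...)
          # aren't needed on the columns.
          __tablename__ = "notifications_all_time_view"

          id = Column(UUID(as_uuid=True), primary_key=True)
          service_id = Column(UUID(as_uuid=True))
          template_id = Column(UUID(as_uuid=True))
          key_type = Column(Text)
          notification_type = Column(Text)
          notification_status = Column(Text)
          billable_units = Column(Integer)
          created_at = Column(DateTime)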

Data Integrity
==============

The (new) tests cover the main scenarios.

We need to be careful with how the view interacts with the deletion /
archiving task. There are two concerns here:

- Duplicates. The deletion task inserts into the history table before it
deletes from the notifications table [^1], so we could end up double
counting. It turns out this isn't a problem because a Postgres UNION
(unlike UNION ALL) applies an implicit "DISTINCT" [^2]. I've also
verified this manually, just to be on the safe side.

- No data. It's conceivable that the query will check the history table
just before the insertion, then check the notifications table just after
the deletion. It turns out this isn't a problem either because the whole
query sees the same DB snapshot [^3][^4].*

*I can't think of a way to test this as it's a race condition, but I'm
confident the Postgres docs are accurate.
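
For reference, the kind of migration that could create such a view; this is a sketch with an abridged column list, and the real definition may differ:

      # Sketch: create the view as a plain UNION of the two tables.
      from alembic import op

      def upgrade():
          op.execute(
              """
              CREATE VIEW notifications_all_time_view AS
                  SELECT id, service_id, template_id, key_type, notification_type,
                         notification_status, billable_units, created_at
                  FROM notifications
                  UNION  -- plain UNION (not UNION ALL) removes duplicates, so a row
                         -- briefly present in both tables is only counted once
                  SELECT id, service_id, template_id, key_type, notification_type,
                         notification_status, billable_units, created_at
                  FROM notification_history
              """
          )

      def downgrade():
          op.execute("DROP VIEW notifications_all_time_view")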

Performance
===========

I copied the relevant (non-PII) columns from Production for data going
back to 2022-04-01. I then ran several tests.

Queries using the new view still make use of indices on a per-table basis,
as the following query plan illustrates:

                                                                                          QUERY PLAN
      ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
       GroupAggregate  (cost=1130820.02..1135353.89 rows=46502 width=97) (actual time=629.863..756.703 rows=72 loops=1)
         Group Key: notifications_all_time_view.template_id, notifications_all_time_view.sent_by, notifications_all_time_view.rate_multiplier, notifications_all_time_view.international
         ->  Sort  (cost=1130820.02..1131401.28 rows=232506 width=85) (actual time=629.756..708.914 rows=217563 loops=1)
               Sort Key: notifications_all_time_view.template_id, notifications_all_time_view.sent_by, notifications_all_time_view.rate_multiplier, notifications_all_time_view.international
               Sort Method: external merge  Disk: 9320kB
               ->  Subquery Scan on notifications_all_time_view  (cost=1088506.43..1098969.20 rows=232506 width=85) (actual time=416.118..541.669 rows=217563 loops=1)
                     ->  Unique  (cost=1088506.43..1096644.14 rows=232506 width=725) (actual time=416.115..513.065 rows=217563 loops=1)
                           ->  Sort  (cost=1088506.43..1089087.70 rows=232506 width=725) (actual time=416.115..451.190 rows=217563 loops=1)
                                 Sort Key: notifications_no_pii.id, notifications_no_pii.job_id, notifications_no_pii.service_id, notifications_no_pii.template_id, notifications_no_pii.key_type, notifications_no_pii.billable_units, notifications_no_pii.notification_type, notifications_no_pii.created_at, notifications_no_pii.sent_by, notifications_no_pii.notification_status, notifications_no_pii.international, notifications_no_pii.rate_multiplier, notifications_no_pii.postage
                                 Sort Method: external merge  Disk: 23936kB
                                 ->  Append  (cost=114.42..918374.12 rows=232506 width=725) (actual time=2.051..298.229 rows=217563 loops=1)
                                       ->  Bitmap Heap Scan on notifications_no_pii  (cost=114.42..8557.55 rows=2042 width=113) (actual time=1.405..1.442 rows=0 loops=1)
                                             Recheck Cond: ((service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'::uuid) AND (notification_type = 'sms'::notification_type) AND (notification_status = ANY ('{sending,sent,delivered,pending,temporary-failure,permanent-failure}'::text[])) AND (created_at >= '2022-05-01 23:00:00'::timestamp without time zone) AND (created_at < '2022-05-02 23:00:00'::timestamp without time zone))
                                             Filter: ((key_type)::text = ANY ('{normal,team}'::text[]))
                                             ->  Bitmap Index Scan on ix_notifications_no_piiservice_id_composite  (cost=0.00..113.91 rows=2202 width=0) (actual time=1.402..1.439 rows=0 loops=1)
                                                   Index Cond: ((service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'::uuid) AND (notification_type = 'sms'::notification_type) AND (notification_status = ANY ('{sending,sent,delivered,pending,temporary-failure,permanent-failure}'::text[])) AND (created_at >= '2022-05-01 23:00:00'::timestamp without time zone) AND (created_at < '2022-05-02 23:00:00'::timestamp without time zone))
                                       ->  Index Scan using ix_notifications_history_no_pii_service_id_composite on notifications_history_no_pii  (cost=0.70..906328.97 rows=230464 width=113) (actual time=0.645..281.612 rows=217563 loops=1)
                                             Index Cond: ((service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'::uuid) AND ((key_type)::text = ANY ('{normal,team}'::text[])) AND (notification_type = 'sms'::notification_type) AND (created_at >= '2022-05-01 23:00:00'::timestamp without time zone) AND (created_at < '2022-05-02 23:00:00'::timestamp without time zone))
                                             Filter: (notification_status = ANY ('{sending,sent,delivered,pending,temporary-failure,permanent-failure}'::text[]))
       Planning Time: 18.032 ms
       Execution Time: 759.001 ms
      (21 rows)

Queries using the new view appear to be slower than querying the tables
directly, but the differences I've seen are minimal: the original queries
already execute in seconds locally and in Production, so it's not a big issue.

Notes: Performance
==================

I downloaded a minimal set of columns for testing:

      \copy (
        select
          id, notification_type, key_type, created_at, service_id,
          template_id, sent_by, rate_multiplier, international,
          billable_units, postage, job_id, notification_status
        from notifications
      ) to 'notifications.csv' delimiter ',' csv header;

      CREATE TABLE notifications_no_pii (
          id uuid NOT NULL,
          notification_type public.notification_type NOT NULL,
          key_type character varying(255) NOT NULL,
          created_at timestamp without time zone NOT NULL,
          service_id uuid,
          template_id uuid,
          sent_by character varying,
          rate_multiplier numeric,
          international boolean,
          billable_units integer NOT NULL,
          postage character varying,
          job_id uuid,
          notification_status text
      );

      copy notifications_no_pii from '/Users/ben.thorner/Desktop/notifications.csv' delimiter ',' csv header;

      CREATE INDEX ix_notifications_no_piicreated_at ON notifications_no_pii USING btree (created_at);
      CREATE INDEX ix_notifications_no_piijob_id ON notifications_no_pii USING btree (job_id);
      CREATE INDEX ix_notifications_no_piinotification_type_composite ON notifications_no_pii USING btree (notification_type, notification_status, created_at);
      CREATE INDEX ix_notifications_no_piiservice_created_at ON notifications_no_pii USING btree (service_id, created_at);
      CREATE INDEX ix_notifications_no_piiservice_id_composite ON notifications_no_pii USING btree (service_id, notification_type, notification_status, created_at);
      CREATE INDEX ix_notifications_no_piitemplate_id ON notifications_no_pii USING btree (template_id);

And similarly for the history table. I then created a separate view
across both of these temporary tables using just these columns.

To test performance I created some queries that reflect what is run
by the billing [^5] and status [^6] tasks, e.g.

      explain analyze select template_id, sent_by, rate_multiplier, international, sum(billable_units), count(*)
      from notifications_all_time_view
      where
      notification_status in ('sending', 'sent', 'delivered', 'pending', 'temporary-failure', 'permanent-failure')
      and key_type in ('normal', 'team')
      and created_at >= '2022-05-01 23:00'
      and created_at < '2022-05-02 23:00'
      and notification_type = 'sms'
      and service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'
      group by 1,2,3,4;

      explain analyze select template_id, job_id, key_type, notification_status, count(*)
      from notifications_all_time_view
      where created_at >= '2022-05-01 23:00'
      and created_at < '2022-05-02 23:00'
      and notification_type = 'sms'
      and service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'
      and key_type in ('normal', 'team')
      group by 1,2,3,4;

Between running queries I restarted my local database and also ran
a command to purge disk caches [^7].

I tested on a few services:

- c5956607-20b1-48b4-8983-85d11404e61f on 2022-05-02 (high volume)
- 0cc696c6-b792-409d-99e9-64232f461b0f on 2022-04-06 (highest volume)
- 01135db6-7819-4121-8b97-4aa2d741e372 on 2022-04-14 (very low volume)

In all cases, execution times using the view were of the same order of
magnitude as the worst case of querying either table on its own.

[^1]: 00a04ebf54/app/dao/notifications_dao.py (L389)
[^2]: https://stackoverflow.com/questions/49925/what-is-the-difference-between-union-and-union-all
[^3]: https://www.postgresql.org/docs/current/transaction-iso.html
[^4]: https://dba.stackexchange.com/questions/210485/can-sub-selects-change-in-one-single-query-in-a-read-committed-transaction
[^5]: 00a04ebf54/app/dao/fact_billing_dao.py (L471)
[^6]: 00a04ebf54/app/dao/fact_notification_status_dao.py (L58)
[^7]: https://stackoverflow.com/questions/28845524/echo-3-proc-sys-vm-drop-caches-on-mac-osx
2022-05-19 15:14:32 +01:00
Ben Thorner
d153603c5c Remove redundant DAO function / consolidate tests
The tests previously covered a shared function that's now only used
once, so I've inlined it and merged the tests together under a common
naming scheme that's consistent with the code under test.
2022-05-19 14:12:28 +01:00
Pea Tyczynska
c4162748de Rename variable names for consistency between similar functions
Co-authored-by: Leo Hemsted <leo.hemsted@digital.cabinet-office.gov.uk>
2022-05-18 12:30:35 +01:00
Pea Tyczynska
112c2ddf72 Use the new subquery in fetch_sms_billing_for_organisation()
This gives us granular data about billable units and costs, so that
we can handle multiple SMS rates within one financial year.

We also cast chargeable_units_used_so_far in that subquery
to integer so we don't get a type mismatch.
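
In SQLAlchemy terms the cast is a small wrapper, roughly as below (the column name is a stand-in for the real subquery column):

      from sqlalchemy import Integer, cast, column

      # Sketch: cast the windowed running total back to integer so it matches
      # the integer columns it is combined with in the outer query.
      chargeable_units_used_so_far = cast(
          column("cumulative_chargeable_units"), Integer
      ).label("chargeable_units_used_so_far")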

Co-authored-by: Leo Hemsted <leo.hemsted@digital.cabinet-office.gov.uk>
2022-05-18 12:30:24 +01:00
Pea Tyczynska
150eaf019b Add query_organisation_sms_usage_for_year to help fetch sms totals for org
This is functionally very similar to query_service_sms_usage_for_year,
except this query filters by organisation and returns rows for all live
services within that organisation.
To ensure that the cumulative free allowance counter resets properly for
each service, we use the `partition_by` flag to group the window
function by service[^1]. This magically handles the free allowance
independently for each service.

[^1]: https://www.postgresql.org/docs/current/tutorial-window.html
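
A minimal sketch of the shape of that window expression (the table and column names below are stand-ins for the real billing columns, not the actual query):

      from sqlalchemy import column, func, table

      # Lightweight stand-in for the billing table used by the real query.
      ft_billing = table(
          "ft_billing",
          column("service_id"),
          column("bst_date"),
          column("billable_units"),
          column("rate_multiplier"),
      )

      # partition_by restarts the running total for each service, so every
      # service's free allowance is consumed independently.
      cumulative_chargeable_units = (
          func.sum(ft_billing.c.billable_units * ft_billing.c.rate_multiplier)
          .over(
              partition_by=ft_billing.c.service_id,
              order_by=ft_billing.c.bst_date,
          )
          .label("cumulative_chargeable_units")
      )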

Co-authored-by: Leo Hemsted <leo.hemsted@digital.cabinet-office.gov.uk>
2022-05-18 12:30:24 +01:00
Ben Thorner
7e536d1c2b Merge pull request #3542 from alphagov/optimise-historic-billing-query-182116071
Optimise billing query for notification history
2022-05-18 10:27:43 +01:00
Ben Thorner
4a520bce78 Optimise billing query for notification history
This follows the same pattern as for status aggregations [^1]. We
haven't seen this problem for a long time because of [^2], but now
that we're trying to re-run the aggregation for some incorrect rows,
it's becoming apparent we need to fix it.

The following query currently fails in Production after the 30 min
SQLAlchemy timeout:

      select template_id, rate_multiplier, international, sum(billable_units), count(*)
      from notification_history
      where notification_status in ('delivered', 'sending')
      and key_type != 'test'
      and notification_type = 'sms'
      and service_id = '539d63a1-701d-400d-ab11-f3ee2319d4d4'
      and created_at >= '2021-07-07 23:00'
      and created_at < '2021-07-08 23:00'
      group by 1,2,3;

Running a quick "explain analyze" with this change applied returns
near immediately, but hangs without it. This is enough evidence for
me that this change will fix the issue.

[^1]: https://github.com/alphagov/notifications-api/pull/3417
[^2]: e5c76ffda7
2022-05-17 17:29:50 +01:00
Ben Thorner
e4a45047b3 Merge pull request #3538 from alphagov/fix-out-of-date-status-182116071
Fix out-of-date rows in ft_notification_status
2022-05-17 10:26:22 +01:00
Ben Thorner
ed379a3724 Fix out-of-date rows in ft_notification_status
This can happen in the following scenario (primarily for letters):

1. A service has a mixture of "delivered" and "sending" letters,
which the status task aggregates into two rows:

  sending | 123
  delivered | 456

2. After the 7 day retention has passed, only the "delivered" letters
will be archived [^1].

3. The status task now looks at the history table [^2], which means
it only sees the "delivered" letters.

4. The "sending" letters are eventually "delivered" and archived (before
the 10 day aggregation cutoff).

5. But the status aggregation task doesn't run again for that data, so the rows stay out of date.

This commit fixes (5).

[^1]: https://github.com/alphagov/notifications-api/pull/3063
[^2]: f87ebb094d/app/dao/fact_notification_status_dao.py (L51)
2022-05-11 11:04:56 +01:00
Ben Thorner
1d157836ad Remove redundant fields from service usage APIs
These are no longer used since [^1] and [^2].

[^1]: https://github.com/alphagov/notifications-admin/pull/4225
[^2]: https://github.com/alphagov/notifications-admin/pull/4229
2022-05-10 15:50:22 +01:00
Leo Hemsted
51646af92e remove provider_rates table
this was added five years ago but never used. if we want to bring back
variable rates per client we might as well get a fresh start since a lot
has changed since then.
2022-05-03 14:42:59 +01:00
Ben Thorner
ebaef4b57b Add "charged_units" to service usage APIs
This can be calculated from the "free_allowance_used" field and the
"chargeable_units" field, but having it included separately is more
convenient as it can be used directly in Admin [^1].

[^1]: 417e7370bb/app/templates/views/usage.html (L38-L39)
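
In other words, assuming the relationship is the obvious subtraction (numbers purely illustrative):

      chargeable_units = 1200        # total units we could charge for
      free_allowance_used = 250      # portion covered by the remaining free allowance
      charged_units = chargeable_units - free_allowance_used   # 950 units actually billed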
2022-04-27 15:57:35 +01:00
Ben Thorner
555868c442 Add "free_allowance_units" to service usage APIs
This represents the number of chargeable_units that were actually
free due to the free allowance - they won't be included in "cost".

Although the existing calculations in Admin [^1][^2] will still be
correct when SMS rates change - it's cost that's the problem
- it makes sense to keep all the knowledge about calculating usage
in these two APIs, for consistency.

Note that the Integer casting is covered by the API-level tests in
test_rest.

[^1]: 474d7dfda8/app/main/views/dashboard.py (L490)
[^2]: c63660d56d/app/main/views/dashboard.py (L350)
2022-04-27 15:57:34 +01:00
Ben Thorner
cd84928a1e Add costs to each row in yearly usage API
This will replace the manual calculations in Admin [^1][^2] for SMS
and also in API [^3] for annual letter costs.

Doing the calculation here also means we correctly attribute free
allowance to the earliest rows in the billing table - Admin doesn't
know when a given rate was applied so can't do this without making
assumptions about when we change our rates.

Since the calculation now depends on annual billing, we need to
change all the tests to make sure a suitable row exists. I've also
adjusted the test data to match the assumption that there can only
be one SMS rate per bst_date.

Note about "OVER" clause
========================

Using "rows=" ("ROWS BETWEEN") makes more sense than "range=" as
we want the remainder to be incremental within each group in a
"GROUP BY" clause, as well as between groups i.e

  # ROWS BETWEEN (arbitrary numbers to illustrate)
  date=2021-04-03, units=3, cost=3.29
  date=2021-04-03, units=2, cost=4.17
  date=2021-04-04, units=2, cost=5.10

  vs.

  # RANGE BETWEEN
  date=2021-04-03, units=3, cost=4.17
  date=2021-04-03, units=2, cost=4.17
  date=2021-04-04, units=2, cost=5.10

See [^4] for more details and examples.
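
In SQLAlchemy the difference comes down to the `rows=` vs `range_=` argument to `over()`; a rough sketch with stand-in column names (not the actual query):

      from sqlalchemy import column, func

      chargeable = column("billable_units") * column("rate_multiplier")

      # ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: the running total
      # advances row by row, so two rows sharing a bst_date get different values.
      rows_total = func.sum(chargeable).over(order_by=column("bst_date"), rows=(None, 0))

      # RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: peer rows (same
      # bst_date) all see the same total, which is what we want to avoid here.
      range_total = func.sum(chargeable).over(order_by=column("bst_date"), range_=(None, 0))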

[^1]: https://github.com/alphagov/notifications-admin/blob/master/app/templates/views/usage.html#L60
[^2]: 072c3b2079/app/billing/billing_schemas.py (L37)
[^3]: 474d7dfda8/app/templates/views/usage.html (L98)
[^4]: https://learnsql.com/blog/difference-between-rows-range-window-functions/
2022-04-27 15:57:33 +01:00
Ben Thorner
fc378fed96 Prepare to replace "billing_units" in usage APIs
There is no such thing as a "billing unit". The data this field
contained was also a confusing mixture of two types:

- For emails and letters, it was just "notifications_sent".

- For SMS, it was the "chargeable_units" (billable * multiplier).

This replaces the single, ambiguous "billing_units" field with
"chargeable_units" and "notifications_sent" in both usage APIs.
Once Admin is using them we can remove the old field.
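
A hypothetical illustration of the split described above (not the actual serialisation code):

      def usage_fields(notification_type, notifications_sent, billable_units, rate_multiplier):
          # For SMS a "chargeable unit" is billable fragments times the rate
          # multiplier; for email and letters it is simply the notification count.
          if notification_type == "sms":
              chargeable_units = billable_units * rate_multiplier
          else:
              chargeable_units = notifications_sent
          return {"notifications_sent": notifications_sent, "chargeable_units": chargeable_units}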
2022-04-27 15:57:30 +01:00
Ben Thorner
80efdd2ec6 Refactor usage API queries into functions per type
This makes it easier to extend each function with costs and free
allowances - especially for SMS.

I've chosen to duplicate the "WHERE" clause in each subquery rather
than only in the top-level query. This will make more sense in later
commits, where we start adding free allowance calculations, which need
to be done on a yearly basis - knowledge the subqueries should have.
2022-04-27 15:17:18 +01:00