Commit Graph

8969 Commits

Author SHA1 Message Date
Jim Moffet
2bcae20441 initial sms provider cleanup 2022-06-25 13:05:10 -07:00
Christa Hartsock
b0f8a51f99 Move config into manifest, update for SNS provider
Keeps secrets in .env files
2022-06-23 13:45:58 -07:00
Christa Hartsock
7319708172 Use statsd local host instead of GDS 2022-06-23 13:39:05 -07:00
Christa Hartsock
3cd15e97e2 Update manifest to use proper templates for vars-file 2022-06-23 13:39:05 -07:00
Christa Hartsock
e773f937ed WIP: local deployment 2022-06-23 13:39:05 -07:00
Jim Moffet
5ff11b001d Merge pull request #5 from 18F/jim/061522/sns-integration
Jim/061522/sns integration
2022-06-23 13:17:31 -07:00
Jim Moffet
4030f9c8d7 config buckets 2022-06-23 13:05:09 -07:00
Jim Moffet
70143301ce Merge pull request #2 from 18F/local-dev-upgrades
add devcontainer configs and docker network orchestration
2022-06-17 11:23:11 -07:00
Jim Moffet
aa4ec532a4 implement SNS 2022-06-17 11:16:23 -07:00
Jim Moffet
79ba6cc1d1 disable cache persistence & env updates 2022-06-13 21:42:36 -07:00
Jim Moffet
e88e36712f sample env 2022-06-13 14:50:22 -07:00
Jim Moffet
f5beb23a1d rm deprecated config 2022-06-13 13:48:30 -07:00
Jim Moffet
60262d6031 config formatting 2022-06-13 13:45:07 -07:00
Jim Moffet
59b72f4853 add devcontainer configs and docker network orchestration 2022-06-13 13:16:32 -07:00
Ben Thorner
e604385e0c Merge pull request #3551 from alphagov/bump-utils-56
Bump utils to version 56.0.0
2022-06-01 16:02:17 +01:00
Ben Thorner
ee8e86f409 Bump utils to version 56.0.0
The only impactful change is the major version itself, where I've
fixed the breaking changes due to the upgrade of PyPDF2 [^1] and
checked there are no deprecation warnings when I run the tests.

[^1]: https://github.com/alphagov/notifications-utils/pull/973
2022-06-01 14:27:25 +01:00
Katie Smith
6495b192a6 Merge pull request #3550 from alphagov/rename-migration-file
Rename migration file
2022-05-27 15:25:05 +01:00
Katie Smith
4961c7cefc Rename migration file
This renames the latest migration file to match the Revision ID in the
file. When these names are different, our deployment pipeline tries to
run migrations on the notify-api-db-migration app and runs the
functional tests twice.
2022-05-27 14:58:01 +01:00
Katie Smith
67be0eed1e Merge pull request #3549 from alphagov/bump-dependencies
Upgrade marshmallow-sqlalchemy plus the unpinned dependencies
2022-05-27 14:19:27 +01:00
Katie Smith
d17be0354a Pin Werkzeug
We can't use version later than the one currently in requirements.in
because the version of flask-sqlalchemy that we are using won't work
with version 2.1.0 and above.
2022-05-27 09:57:10 +01:00
Katie Smith
8dd6d073d3 Run pip-compile --upgrade
This upgrades the sub-dependencies which we don’t pin in `requirements.txt`.
2022-05-27 09:57:10 +01:00
Katie Smith
4404f9eb12 Upgrade marshmallow-sqlalchemy from 0.23.1 to 0.28.0
This was blocked before due to being on marshmallow 2, but now that we
are on marshmallow 3 we can upgrade this package.
2022-05-26 14:18:51 +01:00
Ben Thorner
43dbc0891f Merge pull request #3546 from alphagov/notification-view-178125825
Use notification view for status / billing tasks
2022-05-26 11:03:38 +01:00
Ben Thorner
aa20064f3f Merge pull request #3545 from alphagov/remove-unused-function
Remove redundant DAO function / consolidate tests
2022-05-26 11:03:30 +01:00
Katie Smith
a6d25154d9 Merge pull request #3541 from alphagov/v3-marshmallow
Upgrade Marshmallow to version 3
2022-05-26 09:40:06 +01:00
Katie Smith
82862112fa Remove unused attributes on ServiceSchema
This removes the `override_flag` attribute, which we stopped using in #98cd838510979d467191685aac38ff2e287a92c5
2022-05-25 11:35:44 +01:00
Katie Smith
4ffdd32054 Keep the marshmallow v2 way of serializing DateTimes
Marshmallow v3 has changed the way that DateTimes get serialized
(https://marshmallow.readthedocs.io/en/stable/upgrading.html#datetime-leaves-timezone-information-untouched-during-serialization).

In order to avoid breaking anything, we want to keep the existing way of
handling DateTimes for now - this could be changed later. We can't just
pass a `format` argument to a DateTime field with the old format, which
looked like this `2017-09-19T00:00:00+00:00`. When we tried that,
Marshmallow then expected data that we are loading to also have that
format, which it doesn't.

This adds a new field, which serializes data in the old format but which
doesn't require data that is being deserialized to have such a precise
format.
2022-05-25 11:35:44 +01:00
Katie Smith
053397d8d4 Add deserialize method for ServiceSchema permissions field
Due to a difference in marshmallow 3, when calling
`service_schema.load()` with permissions data the permissions data was
being dropped. We could fix this by allowing the ServiceSchema to
include any unknown keys using but that have unexpected consequences.

Instead, this change adds a method so that the schema knows how to
deserialize permissions. This uses the same code that the `pre_load`
method uses, but there were errors when it wasn't included in both places.
2022-05-25 11:35:44 +01:00
Katie Smith
8e7f2615a9 Fix test assertion
This test was calling `.load` on model objects, when it should have
been calling `.dump`. This was not working as expected before the
marshmallow upgrade either - the objects returned were errors and not
template versions.
2022-05-25 11:35:44 +01:00
Katie Smith
21c943484d Change test that was failing due to new Marshmallow behaviour
Boolean fields in marshmallow have various values that get changed to
True or False. The value 'Yes' now gets changed to True,  which was
causing a test to start failing. We could change the schemas to stop
'Yes' from being changed to True, but the data for boolean fields comes
from admin, where it is only allowed to have certain values anyway so
this just fixes the test.
2022-05-25 11:35:44 +01:00
Katie Smith
57788e5da1 Add new schema, TemplateSchemaNested
When we nest the `TemplateSchema` as a field on the
`NotificationWithTemplateSchema`, we want to include the
`is_precompiled_letter` field. However, we don't want the
`is_precompiled_letter` in any of the other places that we use the
`TemplateSchema`.

The way we had the code was giving errors in version 3 of Marshmallow
since `is_precompiled_letter` was not defined on the `TemplateSchema`.
Instead of writing complicated logic around when the field should be
included or excluded, this adds a new schema which we can use in the one
place where we do want to include `is_precompiled_letter`.
2022-05-25 11:35:44 +01:00
Katie Smith
cfb6c9abc0 Make pre/post processors return modified data
https://marshmallow.readthedocs.io/en/stable/upgrading.html#pre-post-processors-must-return-modified-data

We had a few cases where the pre/post processors weren't returning
anything. They now need to return the modified data.
2022-05-25 11:35:44 +01:00
Katie Smith
ab2c33f1a3 Replace dump_to with data_key
https://marshmallow.readthedocs.io/en/stable/upgrading.html#load-from-and-dump-to-are-merged-into-data-key

No change to how things work here, just a kwarg that has been renamed.
2022-05-25 11:35:44 +01:00
Katie Smith
28e08b429d Keep behaviour of deserializing unknown keys
Keep the marshmallow 2 behaviour of dropping unknown keys
https://marshmallow.readthedocs.io/en/stable/upgrading.html#schemas-raise-validationerror-when-deserializing-data-with-unknown-keys

The new default is to raise an error for unknown keys.

This also adds the optional fields that the `EmailDataSchema` can
have to the schema - `next` and `admin_base_url`.
2022-05-25 11:35:44 +01:00
Katie Smith
fc9b3bea1d Make the post and pre decorators take kwargs
https://marshmallow.readthedocs.io/en/stable/upgrading.html#decorated-methods-and-handle-error-receive-many-and-partial
2022-05-25 11:35:44 +01:00
Katie Smith
8ae2b0bb31 Replace how .dump is called
As with `.load`, only data is now returned instead of a tuple.
2022-05-25 11:35:44 +01:00
Katie Smith
bd4f74b359 Replace how .load is called
https://marshmallow.readthedocs.io/en/stable/upgrading.html#schemas-are-always-strict

`.load` doesn't return a `(data, errors)` tuple any more - only data is
returned. A `ValidationError` is raised if validation fails. The code
now relies on the `marshmallow_validation_error` error handler to handle
errors instead of having to raise an `InvalidRequest`. This has no
effect on the response that is returned (a test has been modified to
check).

Also added a new `password` field to the `UserSchema` so that we don't
have to specially check for password errors in the `.create_user` endpoint
- we can let marshmallow handle them.
2022-05-25 11:35:44 +01:00
Ben Thorner
8e837cf681 Use table class directly instead of "table" var
In response to [^1].

[^1]: https://github.com/alphagov/notifications-api/pull/3546#discussion_r879541366
2022-05-24 10:16:28 +01:00
Ben Thorner
33645c7747 Use notification view for status / billing tasks
This fixes a bug where (letter) notifications left in sending would
temporarily get excluded from billing and status calculations once
the service retention period had elapsed, and then get included once
again when they finally get marked as delivered.*

Status and billing tasks shouldn't need to have knowledge about which
table their data is in and getting this wrong is the fundamental cause
of the bug here. Adding a view across both tables abstracts this away
while keeping the query complexity the same.

Using a view also has the added benefit that we no longer need to care
when the status / billing tasks run in comparison to the deletion task,
since we will retrieve the same data irrespective (see below for a more
detailed discussion on data integrity).

*Such a scenario is rare but has happened.

A New View
==========

I've included all the columns that are shared between the two tables,
even though only a subset are actually needed. Having extra columns
has no impact and may be useful in future.

Although the view isn't actually a table, SQLAlchemy appears to wrap
it without any issues, noting that the package doesn't have any direct
support for "view models". Because we're never inserting data, we don't
need most of the kwargs when defining columns.*

*Note that the "default" kwarg doesn't affect data that's retrieved,
only data that's written (if no value is set).

Data Integrity
==============

The (new) tests cover the main scenarios.

We need to be careful with how the view interacts with the deletion /
archiving task. There are two concerns here:

- Duplicates. The deletion task inserts before it deletes [^1], so we
could end up double counting. It turns out this isn't a problem because
a Postgres UNION is an implicit "DISTINCT" [^2]. I've also verified this
manually, just to be on the safe side.

- No data. It's conceivable that the query will check the history table
just before the insertion, then check the notifications table just after
the deletion. It turns out this isn't a problem either because the whole
query sees the same DB snapshot [^3][^4].*

*I can't think of a way to test this as it's a race condition, but I'm
confident the Postgres docs are accurate.

Performance
===========

I copied the relevant (non-PII) columns from Production for data going
back to 2022-04-01. I then ran several tests.

Queries using the new view still make use of indices on a per-table basis,
as the following query plan illustrates:

                                                                                          QUERY PLAN
      ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
       GroupAggregate  (cost=1130820.02..1135353.89 rows=46502 width=97) (actual time=629.863..756.703 rows=72 loops=1)
         Group Key: notifications_all_time_view.template_id, notifications_all_time_view.sent_by, notifications_all_time_view.rate_multiplier, notifications_all_time_view.international
         ->  Sort  (cost=1130820.02..1131401.28 rows=232506 width=85) (actual time=629.756..708.914 rows=217563 loops=1)
               Sort Key: notifications_all_time_view.template_id, notifications_all_time_view.sent_by, notifications_all_time_view.rate_multiplier, notifications_all_time_view.international
               Sort Method: external merge  Disk: 9320kB
               ->  Subquery Scan on notifications_all_time_view  (cost=1088506.43..1098969.20 rows=232506 width=85) (actual time=416.118..541.669 rows=217563 loops=1)
                     ->  Unique  (cost=1088506.43..1096644.14 rows=232506 width=725) (actual time=416.115..513.065 rows=217563 loops=1)
                           ->  Sort  (cost=1088506.43..1089087.70 rows=232506 width=725) (actual time=416.115..451.190 rows=217563 loops=1)
                                 Sort Key: notifications_no_pii.id, notifications_no_pii.job_id, notifications_no_pii.service_id, notifications_no_pii.template_id, notifications_no_pii.key_type, notifications_no_pii.billable_units, notifications_no_pii.notification_type, notifications_no_pii.created_at, notifications_no_pii.sent_by, notifications_no_pii.notification_status, notifications_no_pii.international, notifications_no_pii.rate_multiplier, notifications_no_pii.postage
                                 Sort Method: external merge  Disk: 23936kB
                                 ->  Append  (cost=114.42..918374.12 rows=232506 width=725) (actual time=2.051..298.229 rows=217563 loops=1)
                                       ->  Bitmap Heap Scan on notifications_no_pii  (cost=114.42..8557.55 rows=2042 width=113) (actual time=1.405..1.442 rows=0 loops=1)
                                             Recheck Cond: ((service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'::uuid) AND (notification_type = 'sms'::notification_type) AND (notification_status = ANY ('{sending,sent,delivered,pending,temporary-failure,permanent-failure}'::text[])) AND (created_at >= '2022-05-01 23:00:00'::timestamp without time zone) AND (created_at < '2022-05-02 23:00:00'::timestamp without time zone))
                                             Filter: ((key_type)::text = ANY ('{normal,team}'::text[]))
                                             ->  Bitmap Index Scan on ix_notifications_no_piiservice_id_composite  (cost=0.00..113.91 rows=2202 width=0) (actual time=1.402..1.439 rows=0 loops=1)
                                                   Index Cond: ((service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'::uuid) AND (notification_type = 'sms'::notification_type) AND (notification_status = ANY ('{sending,sent,delivered,pending,temporary-failure,permanent-failure}'::text[])) AND (created_at >= '2022-05-01 23:00:00'::timestamp without time zone) AND (created_at < '2022-05-02 23:00:00'::timestamp without time zone))
                                       ->  Index Scan using ix_notifications_history_no_pii_service_id_composite on notifications_history_no_pii  (cost=0.70..906328.97 rows=230464 width=113) (actual time=0.645..281.612 rows=217563 loops=1)
                                             Index Cond: ((service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'::uuid) AND ((key_type)::text = ANY ('{normal,team}'::text[])) AND (notification_type = 'sms'::notification_type) AND (created_at >= '2022-05-01 23:00:00'::timestamp without time zone) AND (created_at < '2022-05-02 23:00:00'::timestamp without time zone))
                                             Filter: (notification_status = ANY ('{sending,sent,delivered,pending,temporary-failure,permanent-failure}'::text[]))
       Planning Time: 18.032 ms
       Execution Time: 759.001 ms
      (21 rows)

Queries using the new view appear to be slower than without, but the
differences I've seen are minimal: the original queries execute in
seconds locally and in Production, so it's not a big issue.

Notes: Performance
==================

I downloaded a minimal set of columns for testing:

      \copy (
        select
          id, notification_type, key_type, created_at, service_id,
          template_id, sent_by, rate_multiplier, international,
          billable_units, postage, job_id, notification_status
        from notifications
      ) to 'notifications.csv' delimiter ',' csv header;

      CREATE TABLE notifications_no_pii (
          id uuid NOT NULL,
          notification_type public.notification_type NOT NULL,
          key_type character varying(255) NOT NULL,
          created_at timestamp without time zone NOT NULL,
          service_id uuid,
          template_id uuid,
          sent_by character varying,
          rate_multiplier numeric,
          international boolean,
          billable_units integer NOT NULL,
          postage character varying,
          job_id uuid,
          notification_status text
      );

      copy notifications_no_pii	 from '/Users/ben.thorner/Desktop/notifications.csv' delimiter ',' csv header;

      CREATE INDEX ix_notifications_no_piicreated_at ON notifications_no_pii USING btree (created_at);
      CREATE INDEX ix_notifications_no_piijob_id ON notifications_no_pii USING btree (job_id);
      CREATE INDEX ix_notifications_no_piinotification_type_composite ON notifications_no_pii USING btree (notification_type, notification_status, created_at);
      CREATE INDEX ix_notifications_no_piiservice_created_at ON notifications_no_pii USING btree (service_id, created_at);
      CREATE INDEX ix_notifications_no_piiservice_id_composite ON notifications_no_pii USING btree (service_id, notification_type, notification_status, created_at);
      CREATE INDEX ix_notifications_no_piitemplate_id ON notifications_no_pii USING btree (template_id);

And similarly for the history table. I then created a sepatate view
across both of these temporary tables using just these columns.

To test performance I created some queries that reflect what is run
by the billing [^5] and status [^6] tasks e.g.

      explain analyze select template_id, sent_by, rate_multiplier, international, sum(billable_units), count(*)
      from notifications_all_time_view
      where
      notification_status in ('sending', 'sent', 'delivered', 'pending', 'temporary-failure', 'permanent-failure')
      and key_type in ('normal', 'team')
      and created_at >= '2022-05-01 23:00'
      and created_at < '2022-05-02 23:00'
      and notification_type = 'sms'
      and service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'
      group by 1,2,3,4;

      explain analyze select template_id, job_id, key_type, notification_status, count(*)
      from notifications_all_time_view
      where created_at >= '2022-05-01 23:00'
      and created_at < '2022-05-02 23:00'
      and notification_type = 'sms'
      and service_id = 'c5956607-20b1-48b4-8983-85d11404e61f'
      and key_type in ('normal', 'team')
      group by 1,2,3,4;

Between running queries I restarted my local database and also ran
a command to purge disk caches [^7].

I tested on a few services:

- c5956607-20b1-48b4-8983-85d11404e61f on 2022-05-02 (high volume)
- 0cc696c6-b792-409d-99e9-64232f461b0f on 2022-04-06 (highest volume)
- 01135db6-7819-4121-8b97-4aa2d741e372 on 2022-04-14 (very low volume)

All execution results are of the same magnitude using the view compared
to the worst case of either table on its own.

[^1]: 00a04ebf54/app/dao/notifications_dao.py (L389)
[^2]: https://stackoverflow.com/questions/49925/what-is-the-difference-between-union-and-union-all
[^3]: https://www.postgresql.org/docs/current/transaction-iso.html
[^4]: https://dba.stackexchange.com/questions/210485/can-sub-selects-change-in-one-single-query-in-a-read-committed-transaction
[^5]: 00a04ebf54/app/dao/fact_billing_dao.py (L471)
[^6]: 00a04ebf54/app/dao/fact_notification_status_dao.py (L58)
[^7]: https://stackoverflow.com/questions/28845524/echo-3-proc-sys-vm-drop-caches-on-mac-osx
2022-05-19 15:14:32 +01:00
Ben Thorner
d153603c5c Remove redundant DAO function / consolidate tests
The tests were previously covering a shared function that's now
only used once, so I've inlined it and merged the tests together
with a common naming that's consistent with the code under test.
2022-05-19 14:12:28 +01:00
Katie Smith
906165eeb5 Remove strict = True from Marshmallow schemas
Since Marshmallow 3, schemas are now strict by default, and the
strict parameter has been removed. There were some schemas where
we had not specified `strict = True`, but we weren't relying on strict
being False, so this does not cause any issues.

https://marshmallow.readthedocs.io/en/stable/upgrading.html#schemas-are-always-strict
2022-05-19 13:46:49 +01:00
Katie Smith
53aae6f6cb Upgrade marshmallow from 2.21.0 to 3.15.0 2022-05-19 13:46:49 +01:00
Ben Thorner
648952eb38 Standardise tests for nightly billing sub-task 2022-05-19 13:40:00 +01:00
Ben Thorner
a9104a2ec3 Simplify test for status aggregation
We don't benefit from testing a third service and simplifying the
test will make it easier to incorporate anothere edge case later.
2022-05-19 13:39:46 +01:00
Pea Tyczynska
00a04ebf54 Merge pull request #3529 from alphagov/multiple_rates_sql_magic_for_org_view
Show correct sms cost in org view if sms rates change within financial year
2022-05-19 12:09:42 +01:00
Ben Thorner
2ad8c9d353 Merge pull request #3543 from alphagov/run-billing-for-longer-178125825
Recalculate billing rows for 10 days (prev. 4)
2022-05-19 11:27:28 +01:00
Pea Tyczynska
c4162748de Rename variable names for consistency between similar functions
Co-authored-by: Leo Hemsted <leo.hemsted@digital.cabinet-office.gov.uk>
2022-05-18 12:30:35 +01:00
Leo Hemsted
1c3793c5ac Ensure org billing tests have AnnualBilling for all services
we add a row in AnnualBilling table whenever a new service is created,
and our billing code assumes this is done. Yet when we were writing
some of the tests, this was not a thing yet, so now we are updating
those tests so they reflect our system well.

Co-authored-by: Pea Tyczynska <pea.tyczynska@digital.cabinet-office.gov.uk>
2022-05-18 12:30:24 +01:00
Pea Tyczynska
112c2ddf72 Use the new subquery in fetch_sms_billing_for_organisation()
This is so we have granular data about billable units and costs
so that we can handle multiple sms rates within one financial
year.

We also cast chargeable_units_used_so_far in that subquery
to integer so we don't have type mismatch.

Co-authored-by: Leo Hemsted <leo.hemsted@digital.cabinet-office.gov.uk>
2022-05-18 12:30:24 +01:00
Pea Tyczynska
150eaf019b Add query_organisation_sms_usage_for_year to help fetch sms totals for org
This is functionally very similar to query_service_sms_usage_for_year,
except this query filters by organisation and returns for all live services
within that organisation.
To ensure that the cumulative free allowance counter resets properly for
each service, we use the `partition_by` flag to group up the window
function[^1]. This magically handles all the free allowances
independently for each service.

[^1]: https://www.postgresql.org/docs/current/tutorial-window.html

Co-authored-by: Leo Hemsted <leo.hemsted@digital.cabinet-office.gov.uk>
2022-05-18 12:30:24 +01:00