notifications-api

mirror of https://github.com/GSA/notifications-api.git synced 2025-12-30 12:23:04 -05:00

Author	SHA1	Message	Date
Chris Hill-Scott	b3c69087d8	Serialise less stuff from the service object By default Marshmallow includes unknown properties. This means every time a new property is added to the service model it gets included in the JSON-serialised response sent to the admin app. This is particularly bad because it means that, for returned letters, the response includes the ID of every returned letter. So the JSON stored in Redis for the Check Your State Pension service is 86kb. Similarly the JSON stored in Redis for a big user of inbound text messaging is 458kb(!!!) because it has the ID of every received text message. That’s ~8,500 UUIDs. Luckily the admin app tells us exactly which keys it’s using here: `5952d9c26d/app/models/service.py (L31-L52)` - `active` - `contact_link` - `email_branding` - `email_from` - `id` - `inbound_api` - `letter_branding` - `letter_contact_block` - `message_limit` - `name` - `prefix_sms` - `research_mode` - `service_callback_api` - `volume_email` - `volume_sms` - `volume_letter` - `consent_to_research` - `count_as_live` - `go_live_user` - `go_live_at` Plus these which it does not get automatically: - `email_branding` - `letter_branding` - `organisation` - `organisation_type` - `permissions` - `restricted` The API is returning all of these: - `active` - `all_template_folders` - `annual_billing` - `consent_to_research` - `contact_link` - `contact_list` - `count_as_live` - `created_by` - `crown` - `email_branding` - `email_from` - `go_live_at` - `go_live_user` - `id` - `inbound_api` - `inbound_number` - `inbound_sms` - `letter_branding` - `letter_contact_block` - `letter_logo_filename` - `message_limit` - `name` - `organisation` - `organisation_type` - `permissions` - `prefix_sms` - `rate_limit` - `research_mode` - `restricted` - `returned_letters` - `service_callback_api` - `users` - `version` - `volume_email` - `volume_letter` - `volume_sms` - `whitelist` So the ones that the admin is getting but not expecting are: - `all_template_folders` - `annual_billing` - `contact_list` - `created_by` - `crown` - `inbound_number` - `inbound_sms` - `letter_logo_filename` - `rate_limit` - `returned_letters` - `users` - `version` - `whitelist` Which is what this PR adds to the exclude list, except for `created_by` which it keeps because it’s needed to validate the JSON provided when creating a service.	2020-06-16 16:47:56 +01:00
Leo Hemsted	58ab99d74b	add more prometheus metrics Two new metrics: auth_db_connection_duration_seconds (histogram) wraps the first DB call of post notifications. This includes waiting to get a connection from the pool, and also making the actual request to the db to retrieve the service and api keys. (i'm not sure there's an easy way to separate these two things) post_notification_json_parse_duration_seconds wraps the v2 post notifications json parsing and schema validation. Shouldn't include any async code	2020-06-15 16:26:56 +01:00
David McDonald	8b4a424df1	Tidy up	2020-06-12 16:51:44 +01:00
Leo Hemsted	15ce9fe3f9	add metrics for redis timings	2020-06-12 14:52:22 +01:00
Leo Hemsted	cd9b80f415	set test_errors app fixture to session scope we have one global metrics variable `metrics = GDSMetrics()`, and we then call `metrics.init_app` from within the flask application set up. The v2/test_errors.py app_for_test fixture calls create_app, which would call metrics.init_app multiple times for the same metrics instance. This causes errors, so change the fixture to session level so it only calls once per test run.	2020-06-12 14:52:22 +01:00
Leo Hemsted	c4dc0f64c5	add sqlalchemy connection metrics for celery tasks grab the worker app name and task name rather than the web host and endpoint. also add a fallback for if we're not in a web request or a celery task. I think that'll probably happen when we use alembic, or if we do things from within flask shell	2020-06-12 14:52:22 +01:00
Leo Hemsted	6e32ca5996	add prometheus metrics for connection pools (both web and sql) add the following prometheus metrics to keep track of general app instance health. # sqs_apply_async_duration how long does the actual SQS call (a standard web request to AWS) take. a histogram with default bucket sizes, split up per task that was created. # concurrent_web_request_count how many web requests is this app currently serving. this is split up per process, so we'd expect multiple responses per app instance # db_connection_total_connected how many connections does this app (process) have open to the database. They might be idle. # db_connection_total_checked_out how many connections does this app (process) have open that are currently in use by a web worker # db_connection_open_duration_seconds a histogram per endpoint of how long the db connection was taken from the pool for. won't have any data if a connection was never opened.	2020-06-12 14:52:22 +01:00
Leo Hemsted	bd433ad24f	bump utils to fix statsd timing bug statsd timing should always be in seconds, not milliseconds	2020-06-11 15:01:15 +01:00
Pea Tyczynska	bb3672100f	Turn off email stub on staging To check if it was our congestion point for the soak test. (Email stub is a bit slower to respond than Amazon SES, and the difference could lead to us having too many connections to db open as we wait for response from the stub)	2020-06-09 11:21:48 +01:00
Pea Tyczynska	2aa6470817	Point API for staging at email and sms stubs for the soak tests. This is done to avoid sending real email and sms and incurring unnecessary charges while we run the soak tests.	2020-06-08 17:14:16 +01:00
David McDonald	7bc02ac26e	Add better logging to understand callback delivery cases In particular, we want to know how often callbacks arrive before the notification being persisted	2020-06-04 16:00:43 +01:00
Pea Tyczynska	8b7a0b88cb	Ensure that aws ses stub client is not run in production	2020-06-04 15:43:47 +01:00
Pea Tyczynska	6422a88c8c	Stub SES email client to avoid hitting SES during load testing If we set an environment variable, we can stub out calls to SES and send them to our own stub app. If the environment variable is not set, things work as normal. To be used alongside https://github.com/alphagov/notifications-email-provider-stub	2020-06-03 11:11:43 +01:00
Pea M. Tyczynska	db040c40b9	Merge pull request #2860 from alphagov/put-status-codes-in-logs Put status codes in logs	2020-06-02 16:06:05 +01:00
Pea Tyczynska	b81c7dd5ee	Put status codes in logs to see if lack of detailed status is us not recognising a code or provider not having sent the detailed status. It seems like Firetext is sometimes sending us permanent-failure without detailed status. It could be due to: - them really not sending any detailed status - them sending a status code we don't recognise - them sending 000 code that means 'no errors', which we ignore To see which one it is, and to debug such issues quicker in the future, this PR adds status and detailed status codes to the logs.	2020-06-02 13:26:41 +01:00
Katie Smith	66dc6b046f	Merge pull request #2861 from alphagov/provider-split Moving SMS supplier resting point to 50/50	2020-06-02 12:41:26 +01:00
Pete Herlihy	8a67e14e2b	Moving SMS supplier resting point to 50/50	2020-06-02 12:27:14 +01:00
Pea M. Tyczynska	9f816ad5f5	Merge pull request #2856 from alphagov/mmg_detailed_response Capture detailed delivery receipt status from MMG	2020-06-01 15:23:53 +01:00
Pea Tyczynska	c96142ba5e	Change function and variable names for readability and consistency	2020-06-01 12:44:49 +01:00
Pea Tyczynska	a4b942cf6c	Log detailed sms delivery status for mmg from process_sms_client_response task. Also log detailed delivery status for firetext in the same place in addition to it being logged from notifications_dao. Logging detailed delivery statuses will help us see why messages fail to deliver. In the future we could persist detailed delivery status in the database.	2020-06-01 12:44:49 +01:00
Leo Hemsted	f2f2509c9b	add raw request timings to provider send functions we're using statsd to monitor how long provider requests are taking. However, there's lots of busy work that happens inside our statsd metrics timing window. Things like json dumping and loading, building headers, exception handling, etc. for firetext/mmg, the response object from requests has an elapsed property [1], which captures from sending raw data to parsing the response headers. for ses, it's a bit trickier, but boto3 exposes a few event hooks [2]. it's hard to find them without stepping through the code, but the interesting ones are before-call, after-call, after-call-error, request-created, and response-received. The before-call and after-call involve some marshalling, built-in retrying, etc, while request-created and response-received are much lower level. They might be called more than once per ses request, if boto3 itself retries the request on 5xx, 429 and low level socket errors [3]. Add these as new `raw-request-time` metrics rather than overwriting to avoid changing the meaning of an existing metric, and to let us compare the metrics to see if there's a noticeable difference at all [1] https://requests.readthedocs.io/en/master/api/#requests.Response.elapsed [2] https://boto3.amazonaws.com/v1/documentation/api/latest/guide/events.html [3] https://boto3.amazonaws.com/v1/documentation/api/latest/guide/retries.html#legacy-retry-mode	2020-05-29 14:04:46 +01:00
Katie Smith	29caa362a3	Merge pull request #2851 from alphagov/archive-service Allow service to be archived if service with the same name has already been archived	2020-05-22 11:12:32 +01:00
Katie Smith	f22483a1ab	Delete unused Marshmallow schemas	2020-05-22 10:49:59 +01:00
Katie Smith	64cd8f39c2	Add the date to the service name and email_reply_to when archiving This copies what we do to a user's email address when archiving the user by prefixing it with `_archived_{date}`. We already prefixed the service name and email_reply_to with `_archived`, but this didn't allow a service with the same name to be archived more than once.	2020-05-22 09:37:45 +01:00
Katie Smith	13f7fecd5b	Move function to get archived email address value This function will be used when archiving services too, so it has been renamed and moved to `app/utils.py`.	2020-05-22 09:36:07 +01:00
Katie Smith	0b28766442	Reverts the new postage constraints Reverts https://github.com/alphagov/notifications-api/pull/2843 and https://github.com/alphagov/notifications-api/pull/2848	2020-05-20 18:31:25 +01:00
Katie Smith	4116affe7f	Merge pull request #2843 from alphagov/update-postage-constraint-take-2 Update postage constraint (take 2)	2020-05-20 14:41:44 +01:00
Chris Hill-Scott	95a779c649	Merge pull request #2841 from alphagov/jobs-by-contact-list Allow jobs to be grouped by contact list	2020-05-20 11:49:25 +01:00
Katie Smith	6d89b01f1e	Update JSON schema postage validation for new values	2020-05-19 16:04:36 +01:00
Katie Smith	7fd52017d0	Update postage db constraints for international letters The `notifications`, `notification_history`, `templates` and `templates_history` tables all had a check constraint on the postage column which specified that the postage had to be `first` or `second` if the notification or template was a letter. We now have two more options for postage - `europe` and `rest-of-world`. It's not possible to alter a check constraint, so the constraints have to be dropped then recreated. We are not recreating the constraint on the `notification_history` table since values here are always copied from the `notifications` table. The constraints get added as `NOT VALID` at first - this stage will lock the tables, so updating the `notifications` table and `templates` and `templates_history` are done in separate migrations so that we don't lock all tables at the same time. In a third migration we then run `VALIDATE CONSTRAINT` for all tables - this will lock a row at a time, not the whole table.	2020-05-19 16:04:36 +01:00
Chris Hill-Scott	c7f914122a	Merge pull request #2839 from alphagov/group-letter-uploads Group letter uploads by printing day	2020-05-19 11:05:14 +01:00
David McDonald	2f0b3a9636	Fix edge case in func test data purging for created_by_id When running the purge command I found about 4 users who could not be deleted because their user id was still referenced in the services table as they had created the service yet they were not a member of that service anymore. I have fixed this by checking that if they are not a member but created the service then we also delete the service for them. Note, I've followed the previous convention of no tests for this function. I've run it locally and executed the code path so there should be no major flaws in the code. There is a small chance I wasn't able to exactly replicate the state that existed in preview on my local but hopefully it was close enough to be accurate.	2020-05-18 10:30:28 +01:00
David McDonald	df5ccae4c5	Add in positive logging case for purge command This is useful so we can see that it's doing things, which case is being hit and know that an empty terminal for an hour isn't a bad thing	2020-05-15 17:34:30 +01:00
David McDonald	dbb2dfa502	Merge pull request #2836 from alixedi/add-csv-support Add support for CSV files	2020-05-15 12:16:16 +01:00
Rebecca Law	aecf17fef1	This morning we raised a ParseError for a bad date format in the DateReceived attribute on the /notifications/receive/mmg request. This PR tries to parse the date; if that throws an error, it uses the current time as the DateReceived. This will at least allow the message to be persisted. Typically the DateReceived, provider_date, and the created_at date in the inbound_sms table are within a second of each other.	2020-05-13 10:37:41 +01:00
Chris Hill-Scott	cce153eee8	Return 400 for bad date argument It’s more of a bad request, because the input is bad, rather than something on our side being not found.	2020-05-13 08:56:54 +01:00
Chris Hill-Scott	c61f7e70c2	Add comments to explain time intervals	2020-05-13 08:53:40 +01:00
Chris Hill-Scott	c8cd3c2b70	Return template name in job response If you’ve sent a bunch of jobs from the same contact list then a handy way to differentiate between them will be date sent, but also template name (in effect the message you sent). This commit extends the job response to include template name, using the same pattern as for template type.	2020-05-12 13:10:39 +01:00
Chris Hill-Scott	3ed1700231	Count how many times a contact list has been used Because we’ll be grouping jobs under their parent contact lists it will be useful to have a way of showing how many times a contact list has been used. This will give the right information scent to indicate that clicking into a contact list is where you go to see its jobs. This means that the API needs to return a count of jobs for each contact list. Putting this code here feels very non-idiomatic for our API. So suggestions about how to better architect it welcome…	2020-05-12 13:00:54 +01:00
Chris Hill-Scott	18ffccf8c9	Allow jobs to be filtered by contact list Rather than showing all jobs that have been ‘copied’ from a contact list I think it makes more sense to group them under the contact list. This way it’s easier to see what messages have been sent to a given group of contacts over time. Part of this work means the API needs to return only jobs that have been created from a given contact list, when asked.	2020-05-12 12:58:39 +01:00
Chris Hill-Scott	27a0ba1a65	Reformat arguments for readability We want to add another argument here, and doing so would make the line length too long with all the arguments on one line. Also uses the * operator to enforce keyword-only arguments.	2020-05-12 12:57:54 +01:00
Pea Tyczynska	5d6f2da155	Rename task from create_letters_pdf to get_pdf_for_templated_letter In a separate PR we will have to delete the vestigial create_letters_pdf task that now only redirects to get_pdf_for_templated_letter.	2020-05-11 13:33:05 +01:00
Pea Tyczynska	879d15b736	Test logging and error message	2020-05-11 13:33:04 +01:00
Pea Tyczynska	3a00c19390	Polish and test the small task that updates billable units for letter	2020-05-11 13:32:09 +01:00
Pea Tyczynska	24a89c1c19	Modify tasks for getting letter pdf and updating billable units So that they talk with new template preview task for pdf creation	2020-05-11 13:30:59 +01:00
Chris Hill-Scott	864c6772b3	Add an endpoint to get uploaded letters for a day Because we won’t be showing uploaded letters individually on the uploads page any more we need a way of listing them. This should be by printing day, to match how we’re grouping them on the uploads page. The response matches a normal `get_notifications` response so we can reuse the same code in the admin app.	2020-05-11 10:51:40 +01:00
Chris Hill-Scott	421c1aac96	Group uploaded letters by day of printing Some teams have started uploading quite a lot of letters (in the hundreds per week). They’re also uploading CSVs of emails. This means the uploads page ends up quite jumbled. This is because: - there’s just a lot of items to scan through - conceptually it’s a bit odd to have batches of things displayed alongside individual things on the same page So instead this commit starts grouping together uploaded letters. It does this by the date on which we ‘start’ printing them, or in other words the time at which they can no longer be cancelled. This feels like a natural grouping, and it matches what we know about people’s mental models of ‘batches’ and ‘runs’ when talking about printing. The code for this is a bit gnarly because: - timezones - the print cutoff doesn’t align with the end of a day - we have to do this in SQL because it wouldn’t be efficient to query thousands of letters and then do the timezone calculations on them in Python	2020-05-11 10:51:33 +01:00
Ali Zaidi	642ab1ad1e	Add support for CSV files	2020-05-06 14:11:50 +01:00
Chris Hill-Scott	80fc5e6600	Paginate search results for notifications The standard way that we indicate that there are more results than can be returned is by paginating. So even though we don’t intend to paginate the search results in the admin app, it can still use the presence or absence of a ‘next’ link to determine whether or not to show a message about only showing the first 50 results.	2020-05-06 12:13:00 +01:00
Chris Hill-Scott	625aad97c9	Limit search by recipient to 50 results Things could get ugly if you use a short search string on a service with lots of notifications…	2020-05-06 11:44:35 +01:00
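
The exclude-list idea described in commit b3c69087d8 can be sketched in plain Python. This is only an illustration of the filtering step, not the Marshmallow schema change the commit actually makes: the function name `serialise_service` and the use of a plain dict are assumptions, while the key names are taken from the commit message.

```python
# Keys the commit message lists as returned by the API but not used by the
# admin app. `created_by` is left out of this set on purpose: the commit keeps
# it because it is needed to validate the JSON provided when creating a service.
EXCLUDED_KEYS = frozenset({
    "all_template_folders",
    "annual_billing",
    "contact_list",
    "crown",
    "inbound_number",
    "inbound_sms",
    "letter_logo_filename",
    "rate_limit",
    "returned_letters",
    "users",
    "version",
    "whitelist",
})


def serialise_service(service: dict) -> dict:
    """Hypothetical helper: drop excluded keys before the dict is cached or sent."""
    return {key: value for key, value in service.items() if key not in EXCLUDED_KEYS}


if __name__ == "__main__":
    example = {
        "name": "Example service",
        "created_by": "user-id",
        "returned_letters": ["letter-id-1", "letter-id-2"],
    }
    print(serialise_service(example))
```

Filtering on an explicit set keeps large relationship fields such as `returned_letters` out of the cached JSON, but any newly added property still appears by default until it is added to the set.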

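
The fallback described in commit aecf17fef1 (try to parse the provider's date, otherwise use the current time so the message can still be persisted) might look roughly like the sketch below. The function name, the date format string and the choice of UTC are all assumptions made for illustration; the commit message does not state them.

```python
from datetime import datetime, timezone

# Assumed format for illustration only; the real format is not given in the commit.
ASSUMED_DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_date_received(raw_value, now=None):
    """Hypothetical helper: parse the DateReceived value, falling back to 'now'.

    `now` can be passed in to make the fallback deterministic in tests.
    """
    try:
        parsed = datetime.strptime(raw_value, ASSUMED_DATE_FORMAT)
        # Treating the parsed value as UTC is an assumption.
        return parsed.replace(tzinfo=timezone.utc)
    except (TypeError, ValueError):
        # A missing or malformed date should not stop the message being stored.
        return now if now is not None else datetime.now(timezone.utc)


if __name__ == "__main__":
    print(parse_date_received("2020-05-13 09:00:00"))
    print(parse_date_received("not a date"))
```

Catching only `TypeError` and `ValueError` keeps the fallback narrow: unrelated failures still surface instead of being silently replaced with the current time.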

4075 Commits