notifications-api

mirror of https://github.com/GSA/notifications-api.git synced 2026-03-16 08:10:13 -04:00

Author	SHA1	Message	Date
David McDonald	20627d96ea	Put all broadcast tasks on the broadcast worker	2021-01-13 17:21:40 +00:00
David McDonald	78db0f9c2b	Add broadcasts worker and queue This worker will be responsible for handling all broadcast tasks. It is based on the internal worker which is currently handling broadcast tasks. Concurrency of 2 has been chosen fairly arbitrarily. Gunicorn will be running 4 worker processes so we will end up with the ability to process 8 tasks per app instance given this.	2021-01-13 16:35:27 +00:00
David McDonald	fb5b05a983	Merge pull request #3089 from alphagov/everyday-2nd-class-alert Alert on 2nd class letters still in sending everyday	2021-01-13 12:16:55 +00:00
Leo Hemsted	5ade5ba13f	Merge pull request #3087 from alphagov/migrate-broadcast add content to old broadcast messages with no content	2021-01-13 11:39:30 +00:00
David McDonald	c3ef23c771	Alert on 2nd class letters still in sending everyday In `8285ef5f89` we turned off alerting on 2nd class letters still being in sending on certain days of the week because we were only sending letters out on Mon, Wed, Fri. Now we have swapped back to sending out 2nd class letters on all workdays so this change can be reverted. Note, I haven't reverted the commit exactly but more so the behaviour, whilst leaving in some tests to explicitly test 2nd class letters for the alert in case we change this again.	2021-01-13 11:21:27 +00:00
Rebecca Law	4529b92e23	Merge pull request #3088 from alphagov/update-org-query Change the sort order for the organisation usage page	2021-01-13 10:28:57 +00:00
Leo Hemsted	54495b4e14	add content to old broadcast messages with no content new broadcast messages will have content filled whether they have a template or not, but old ones won't so populate. Stole the session constructor from 0044_jos_to_notification_hist.py	2021-01-13 10:09:16 +00:00
Rebecca Law	e05e9bb5e0	Change the sort order for the organisation usage page. Ensure the archived services are at the bottom of the list. The organisation trial mode page already sorts the archived services to the bottom.	2021-01-12 09:44:35 +00:00
Leo Hemsted	4980c3e0fa	Merge pull request #3085 from alphagov/fix-broadcast-migration Fix broadcast migration	2021-01-11 16:13:52 +00:00
Leo Hemsted	400dfe0217	allow broadcasts to have a template and no content ensures code remains backwards compatible during the deploy. this commit should be reverted once all broadcast_message.content fields have been back-filled.	2021-01-11 15:56:40 +00:00
Leo Hemsted	91abe6d55f	allow null content in migration because existing rows won't have any content populated yet.	2021-01-11 15:56:11 +00:00
Leo Hemsted	a3184c53e9	Merge pull request #3084 from alphagov/broadcast-job-content add content to broadcast_message and make template fields nullable	2021-01-11 14:44:09 +00:00
Leo Hemsted	2e929754ff	add content to broadcast_message and make template fields nullable we want to be able to create broadcast messages without templates. To start with, these will come from the API, but in future we may want to let people create via the admin interface without creating a template too. populate a non-nullable content field with the values supplied via the template (or supplied directly if via api).	2021-01-08 18:58:17 +00:00
Sakis	88a6b7729e	Merge pull request #3082 from alphagov/fix-sender-logging Add disk space check for sender worker	2021-01-06 10:58:30 +02:00
sakisv	9bb9070ba0	Add disk space check for sender worker Reused the existing `ensure_celery_is_running` function to terminate the script	2021-01-04 14:01:19 +02:00
Chris Hill-Scott	d55b66a6d8	Merge pull request #3075 from alphagov/cache-provider-lookup Cache provider lookups for 10 seconds	2021-01-04 10:00:25 +00:00
Leo Hemsted	386c3671bb	Merge pull request #3073 from alphagov/pyup-scheduled-update-2020-12-23 Scheduled weekly dependency update for week 51	2020-12-31 14:37:36 +00:00
Leo Hemsted	4814c66c1d	fix schema metaclasses marshmallow v0.22.0 added load_instance and include_relationship options, which we need to keep old ModelSchema code working	2020-12-31 14:13:05 +00:00
Leo Hemsted	a33ec5c7f1	remove deprecated ModelSchema class	2020-12-31 13:56:20 +00:00
Leo Hemsted	ee2bec2f72	pin marshmallow-sqlalchemy to keep marshmallow <=3.0 dep	2020-12-31 13:56:18 +00:00
Leo Hemsted	156c7aa32a	bump python client brings in jwt2.0 compat	2020-12-31 13:56:04 +00:00
Leo Hemsted	1da16eda23	freeze reqs	2020-12-31 13:55:37 +00:00
pyup-bot	b298440f00	Update sqlalchemy from 1.3.20 to 1.3.22	2020-12-31 13:55:37 +00:00
pyup-bot	97d35b86b5	Update pyjwt from 1.7.1 to 2.0.0	2020-12-31 13:55:37 +00:00
pyup-bot	659a43e435	Update cachetools from 4.1.1 to 4.2.0	2020-12-31 13:55:37 +00:00
pyup-bot	e4c5633150	Update eventlet from 0.29.1 to 0.30.0	2020-12-31 13:55:37 +00:00
pyup-bot	0c0821b9f9	Update prometheus-client from 0.8.0 to 0.9.0	2020-12-31 13:55:37 +00:00
pyup-bot	39877e1e40	Update marshmallow-sqlalchemy from 0.23.1 to 0.24.1	2020-12-31 13:55:37 +00:00
pyup-bot	e560b4a972	Update flask-marshmallow from 0.11.0 to 0.14.0	2020-12-31 13:55:37 +00:00
pyup-bot	20994c2d5d	Update cffi from 1.14.3 to 1.14.4	2020-12-31 13:55:37 +00:00
David McDonald	57f5bd76de	Merge pull request #3081 from alphagov/ses-error-logs SES error logs	2020-12-31 13:13:20 +00:00
Leo Hemsted	d470c928cd	Merge pull request #3072 from alphagov/doc-dl-exc handle doc dl connection errors correctly	2020-12-31 11:24:00 +00:00
David McDonald	56879d0d22	Make sure error message is logged as part of the exception	2020-12-31 11:08:09 +00:00
Chris Hill-Scott	8834377a5d	Merge pull request #3074 from alphagov/serialise-process-type Serialise process_type for template history	2020-12-31 09:54:00 +00:00
Chris Hill-Scott	624bd1d12e	Make function-level setup fixture clear cache This means that anyone adding a new test to this file doesn’t have to remember to clear the cache in their test, or forget to and have a hard-to-debug test failure. Using `setup_function` means we don’t have to convert this module into using class-based tests.	2020-12-31 09:37:07 +00:00
Chris Hill-Scott	55afc9a401	Increase provider lookup cache TTL to 10 seconds Tested locally with TTL values of: - 2 seconds - 5 seconds - 10 seconds The benefit really started showing at 10 seconds, where >50% of lookups hit the cache rather than the database. For graphs see https://github.com/alphagov/notifications-api/pull/3075#issuecomment-750836404	2020-12-31 09:36:55 +00:00
David McDonald	977554781f	Add better logging message for tech failure So we can easily identify which notification ID failed	2020-12-30 17:28:21 +00:00
David McDonald	2480f91667	Raise better exception on InvalidParameterValue error There are several reasons why we might get an `InvalidParameterValue` from the SES API. One, as correctly identified before in https://github.com/alphagov/notifications-api/pull/713/files is if we allow an email address on our side that SES rejects. However, there are other types of errors that could cause an `InvalidParameterValue`. One example is a `Header too long: 'Subject'` error that we have seen happen in production. This shouldn't raise an `InvalidEmailError` as that is not appropriate. Therefore, we introduce a new exception `EmailClientNonRetryableException`, that represents any exception back from an email client that we can use whenever we get an `InvalidParameterValue` error. Note, I chose `EmailClientNonRetryableException` rather than `SESClientNonRetryableException` as our code needs to catch this exception and it shouldn't be aware of what email client is being used, it just needs to know that it came from one of the email clients (if in time we have more than one). In time, we may wish to extend the approach of having generic `EmailClient` exceptions and `SMSClient` exceptions as this should be the most extendable pattern and a good abstraction.	2020-12-30 17:18:16 +00:00
David McDonald	2079202160	Stop logging email addresses for SES errors We shouldn't be logging PII so we should not log email addresses. We remove the email address and just log the normal exception message. Note, this meant before that you could see the email address and more easily track down the notification ID in the database. Now instead, you will need to search in the DB for notifications that have gone into technical failure at the time of the log message (as we still don't log the notification ID alongside the failure).	2020-12-30 17:18:15 +00:00
David McDonald	6a95925897	Merge pull request #3078 from alphagov/up-memory Add more memory for the sender and letter workers	2020-12-29 15:50:49 +00:00
Sakis	5e08cc7bc6	Merge pull request #3076 from alphagov/fix-app-logging Fix app logging	2020-12-24 18:55:58 +02:00
sakisv	1bfdac8417	Temporarily remove disk space check from multi_worker script There seems to be some kind of complication in this script that doesn't allow it to terminate properly. This is being removed for now to allow deploying the rest of the fixes in time for the holiday period.	2020-12-24 18:44:26 +02:00
Chris Hill-Scott	c64e935168	Merge pull request #3079 from alphagov/pass-language-to-lambda Pass language through to lambda	2020-12-24 16:06:40 +00:00
Chris Hill-Scott	9825469613	Make language attributes abstract properties This will make it impossible to create a new client without at least having to define these properties. Which should get someone thinking about language support…	2020-12-24 15:19:46 +00:00
Chris Hill-Scott	c3a1d5c506	Pass language through to lambda If we’re sending non-GSM characters, we need to mark the language in the XML as Welsh (`cy-GB` in CAP, `Welsh` in IBAG). Currently, the CBC proxy checks the content we’re sending, and then uses an approximation based on ASCII to determine whether we’re sending any non-GSM characters, and if so, sets the language appropriately. Instead, we can use functionality from the notifications-utils repo to determine the language. If any non-GSM characters are used, then we can set the language to Welsh. We’ll need to update the proxy to look at this new language flag.	2020-12-24 15:15:32 +00:00
David McDonald	1ac3ca250c	Add more memory for the sender and letter workers On Monday, we had a build-up of emails in the email queue that weren't getting picked up by the sender worker and causing delays. After further investigation with Andy from the PaaS, we believe the following happened. We received a bunch of traffic at 8:30ish which consisted of some very large emails in terms of their length and complexity. The amount of memory used by the app instances got very high and a few apps crashed due to OOM (recorded by 5 cf app event crashes). When new app instances tried to spin up, they weren't able to as they potentially also ran out of memory immediately. This left us in the position of having fewer app instances than we needed, on top of which they were all using a very large amount of CPU and may have been limited in how quickly an individual app instance would process tasks. This meant that we were overall processing fewer tasks than we needed to and our queue of emails started to build up. So it appears our sender workers did not have the memory available that they needed. By looking at a graph for the past 30 days of memory usage on the sender workers, we see that it on several days breached 90% memory usage for long periods of time. This in combination with the hypothesis above of what happened leads us to decide that we want to give the app instances a bigger memory quota so it has been upped from 3GB to 4GB. Whilst doing this, I also looked at long term memory usage graphs for our other workers and saw that the letters worker was similarly close to around 90% of memory used so have taken the opportunity to bump that too.	2020-12-24 15:03:39 +00:00
David McDonald	701c7ba80d	Merge pull request #3077 from alphagov/freezetime Fix test that fails after 5:30pm	2020-12-24 10:35:53 +00:00
David McDonald	9aba3d758b	Fix test that fails after 5:30pm Was failing when run after 5:30pm as this would cause the letters to be in a different subfolder (for one day later). Solved by freezetiming it. Example build that failed: https://cd.gds-reliability.engineering/builds/1876957	2020-12-24 09:57:52 +00:00
sakisv	a6ecfd66b6	Terminate instance if it's running out of disk space	2020-12-23 19:40:04 +02:00
sakisv	2108498eb1	Send worker-sender celery logs to /dev/null We are using our custom logger to log to `NOTIFY_LOG_PATH`, so this logging from celery is neither needed nor desired. We also need to define the location of the pidfiles, because of what appears to be a bug in celery where it uses the location of logs to infer the location of the pidfiles if it is not defined, i.e. in this case it was trying to find the pidfiles in `/dev/null/%N.pid`.	2020-12-23 19:39:56 +02:00


7797 Commits