Commit Graph

7804 Commits

Author SHA1 Message Date
David McDonald
f3ee2cdd48 Merge pull request #3086 from alphagov/lambda-errors
Failover to second lambda on error
2021-01-14 11:15:34 +00:00
David McDonald
3216a74fbd Don't try failover for canary
No point trying, it's the same lambda. As `_invoke_lambda` currently
takes a bytes payload, rather than a JSON payload, we had to decide
between encoding the payload in the canary or moving the encoding into
the `_invoke_lambda` function. We decided to go for the former as the
lesser of two evils. We may end up doing the encoding twice in the case
of a failover but this avoids us having to put the encoding in our code
in several places (for example the canary and also soon to be the link
tests).
2021-01-14 11:00:38 +00:00
Pea Tyczynska
b5a33ded98 Retry with failover lambda for FunctionError and status > 299
For all FunctionErrors, and for invoke errors (status > 299) we
want to retry with failover lambda.

We are doing this because, if there is a connection or other error
with one lambda, the failover lambda may still work and it's
worth trying.

With time, we will probably have a more complex retry flow, depending
on the error and even maybe differing for each MNO (broadcast provider).
2021-01-14 10:45:29 +00:00
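
The retry rule in b5a33ded98 can be sketched as a small predicate. This is a minimal illustration, not the repository's code: the name `should_try_failover` is hypothetical, and the response is assumed to be a dict with a `StatusCode` integer and, when the function itself errored, a `FunctionError` key.

```python
def should_try_failover(response: dict) -> bool:
    """Return True when an invoke response warrants retrying on the failover lambda.

    Hypothetical helper. Assumes `response` carries a `StatusCode` integer
    and, if the function itself raised, a `FunctionError` key.
    """
    # Any FunctionError triggers a retry, whatever the status code.
    if "FunctionError" in response:
        return True
    # Otherwise retry only when the invoke call itself failed.
    return response["StatusCode"] > 299
```

Keeping the rule in one predicate would leave a single place to change if the retry flow later differs by error type or by MNO.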
David McDonald
fb5b05a983 Merge pull request #3089 from alphagov/everyday-2nd-class-alert
Alert on 2nd class letters still in sending every day
2021-01-13 12:16:55 +00:00
Leo Hemsted
5ade5ba13f Merge pull request #3087 from alphagov/migrate-broadcast
add content to old broadcast messages with no content
2021-01-13 11:39:30 +00:00
David McDonald
c3ef23c771 Alert on 2nd class letters still in sending every day
In 8285ef5f89 we turned off alerting on 2nd class letters still being in sending on
certain days of the week because we were only sending letters out on
Mon, Wed, Fri.

Now we have swapped back to sending out 2nd class letters on all
workdays so this change can be reverted. Note, I haven't reverted the
commit exactly but more so the behaviour, whilst leaving in some tests
to explicitly test 2nd class letters for the alert in case we change
this again.
2021-01-13 11:21:27 +00:00
Rebecca Law
4529b92e23 Merge pull request #3088 from alphagov/update-org-query
Change the sort order for the organisation usage page
2021-01-13 10:28:57 +00:00
Leo Hemsted
54495b4e14 add content to old broadcast messages with no content
new broadcast messages will have content filled whether they have a
template or not, but old ones won't, so populate them.

Stole the session constructor from 0044_jos_to_notification_hist.py
2021-01-13 10:09:16 +00:00
Pea Tyczynska
1aff854afd Create logs for invoking and finishing lambda, and for retry.
Those logs will give us extra visibility into lambda invocation
process.
2021-01-12 15:34:48 +00:00
Pea Tyczynska
d7661abe81 Move variable used in tests to top of file for more DRY code
2021-01-12 15:34:47 +00:00
David McDonald
24f52721f3 Retry with second lambda if connection error
Note, we assume whenever there is a `FunctionError` that there will be a
payload that contains an `errorMessage` key. It's implied by
the docs but not stated explicitly.

https://docs.aws.amazon.com/lambda/latest/dg/API_Invoke.html#API_Invoke_ResponseSyntax
2021-01-12 15:34:47 +00:00
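
Because the `errorMessage` key is only implied by the docs, a reader might want to see what a defensive read of the payload could look like. The sketch below is an assumption-laden illustration: `extract_function_error` is a hypothetical name, and `io.BytesIO` stands in for the streaming payload body a real client would return.

```python
import io
import json


def extract_function_error(response: dict):
    """Return the lambda's error message, or None if the function did not error.

    Hypothetical helper. Assumes a response carrying `FunctionError` has a
    JSON payload, and uses `.get` so that a payload without `errorMessage`
    yields a placeholder rather than a KeyError.
    """
    if "FunctionError" not in response:
        return None
    payload = json.loads(response["Payload"].read())
    return payload.get("errorMessage", "<no errorMessage in payload>")


# io.BytesIO stands in for the streaming body of a real invoke response.
failed = {
    "StatusCode": 200,
    "FunctionError": "Unhandled",
    "Payload": io.BytesIO(b'{"errorMessage": "timed out"}'),
}
print(extract_function_error(failed))  # prints: timed out
```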
David McDonald
9da8e54d69 Define failover lambdas
We will need a lambda to fail over to if the first lambda fails. This
isn't so much in case the lambda itself fails, as a lambda is
automatically a cross availability zone resource. It's more in case
something in the networking goes down in our AZ and therefore the lambda
can't call out to the CBC. In this case, we will be able to swap to using
the second AZ by calling the second lambda.
2021-01-12 15:34:47 +00:00
David McDonald
1e537d507b Make lambda_name abstract property
As we require all instances to have it
2021-01-12 15:34:46 +00:00
David McDonald
5a46662c28 Abstract invoking of lambda
This is to prepare us for the case where, when we try to send/cancel a
broadcast, we may need to invoke more than one lambda. This might happen
if we invoke the first lambda, get an error and therefore try to
invoke a failover/second lambda. Then `_invoke_lambda` will be
responsible for the call to AWS whereas `_invoke_lambda_with_failover`
will be responsible more for picking the lambda and deciding on retry
behaviour in failure cases.
2021-01-12 15:34:46 +00:00
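
The split of responsibilities described in 5a46662c28 can be sketched as below. The signatures are assumptions for illustration only: `invoke` plays the role of `_invoke_lambda` (one call, reports success) and `invoke_with_failover` plays the role of `_invoke_lambda_with_failover` (picks the lambda and decides whether to move on). The real methods may take different arguments and return different values.

```python
from typing import Callable, Sequence


def invoke_with_failover(
    invoke: Callable[[str, bytes], bool],
    lambda_names: Sequence[str],
    payload: bytes,
) -> str:
    """Try each lambda in order and return the name of the first that succeeds.

    `invoke(name, payload)` makes a single call and returns True on success.
    This function contains no AWS-specific code; it only chooses which
    lambda to call next and gives up when none are left.
    """
    for name in lambda_names:
        if invoke(name, payload):
            return name
    raise RuntimeError(f"all lambdas failed: {', '.join(lambda_names)}")
```

Passing the single-call function in as an argument keeps the retry logic testable with a fake, without any network access.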
Rebecca Law
e05e9bb5e0 Change the sort order for the organisation usage page.
Ensure the archived services are at the bottom of the list. The organisation trial mode page already sorts the archived services to the bottom.
2021-01-12 09:44:35 +00:00
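
The ordering described in e05e9bb5e0 (archived services at the bottom) can be illustrated with a two-part sort key. This is only a sketch: the real change is presumably made in the database query, the `active` and `name` fields are assumed, and the alphabetical secondary order is an assumption not stated in the commit.

```python
def sort_for_usage_page(services):
    """Return services with active ones first and archived ones last.

    Hypothetical helper. Each service is assumed to be a dict with a
    boolean `active` flag and a `name`. False sorts before True, so
    `not active` pushes archived services to the bottom.
    """
    return sorted(services, key=lambda s: (not s["active"], s["name"].lower()))
```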
Leo Hemsted
4980c3e0fa Merge pull request #3085 from alphagov/fix-broadcast-migration
Fix broadcast migration
2021-01-11 16:13:52 +00:00
Leo Hemsted
400dfe0217 allow broadcasts to have a template and no content
ensures code remains backwards compatible during the deploy. this commit
should be reverted once all broadcast_message.content fields have been
back-filled.
2021-01-11 15:56:40 +00:00
Leo Hemsted
91abe6d55f allow null content in migration
because existing rows won't have any content populated yet.
2021-01-11 15:56:11 +00:00
Leo Hemsted
a3184c53e9 Merge pull request #3084 from alphagov/broadcast-job-content
add content to broadcast_message and make template fields nullable
2021-01-11 14:44:09 +00:00
Leo Hemsted
2e929754ff add content to broadcast_message and make template fields nullable
we want to be able to create broadcast messages without templates. To
start with, these will come from the API, but in future we may want to
let people create via the admin interface without creating a template
too.

populate a non-nullable content field with the values supplied via the
template (or supplied directly if via api).
2021-01-08 18:58:17 +00:00
Sakis
88a6b7729e Merge pull request #3082 from alphagov/fix-sender-logging
Add disk space check for sender worker
2021-01-06 10:58:30 +02:00
sakisv
9bb9070ba0 Add disk space check for sender worker
Reused the existing `ensure_celery_is_running` function to terminate the
script
2021-01-04 14:01:19 +02:00
Chris Hill-Scott
d55b66a6d8 Merge pull request #3075 from alphagov/cache-provider-lookup
Cache provider lookups for 10 seconds
2021-01-04 10:00:25 +00:00
Leo Hemsted
386c3671bb Merge pull request #3073 from alphagov/pyup-scheduled-update-2020-12-23
Scheduled weekly dependency update for week 51
2020-12-31 14:37:36 +00:00
Leo Hemsted
4814c66c1d fix schema metaclasses
marshmallow v0.22.0 added load_instance and include_relationship
options, which we need to keep old ModelSchema code working
2020-12-31 14:13:05 +00:00
Leo Hemsted
a33ec5c7f1 remove deprecated ModelSchema class
2020-12-31 13:56:20 +00:00
Leo Hemsted
ee2bec2f72 pin marshmallow-sqlalchemy
to keep marshmallow <=3.0 dep
2020-12-31 13:56:18 +00:00
Leo Hemsted
156c7aa32a bump python client
brings in jwt2.0 compat
2020-12-31 13:56:04 +00:00
Leo Hemsted
1da16eda23 freeze reqs
2020-12-31 13:55:37 +00:00
pyup-bot
b298440f00 Update sqlalchemy from 1.3.20 to 1.3.22
2020-12-31 13:55:37 +00:00
pyup-bot
97d35b86b5 Update pyjwt from 1.7.1 to 2.0.0
2020-12-31 13:55:37 +00:00
pyup-bot
659a43e435 Update cachetools from 4.1.1 to 4.2.0
2020-12-31 13:55:37 +00:00
pyup-bot
e4c5633150 Update eventlet from 0.29.1 to 0.30.0
2020-12-31 13:55:37 +00:00
pyup-bot
0c0821b9f9 Update prometheus-client from 0.8.0 to 0.9.0
2020-12-31 13:55:37 +00:00
pyup-bot
39877e1e40 Update marshmallow-sqlalchemy from 0.23.1 to 0.24.1
2020-12-31 13:55:37 +00:00
pyup-bot
e560b4a972 Update flask-marshmallow from 0.11.0 to 0.14.0
2020-12-31 13:55:37 +00:00
pyup-bot
20994c2d5d Update cffi from 1.14.3 to 1.14.4
2020-12-31 13:55:37 +00:00
David McDonald
57f5bd76de Merge pull request #3081 from alphagov/ses-error-logs
SES error logs
2020-12-31 13:13:20 +00:00
Leo Hemsted
d470c928cd Merge pull request #3072 from alphagov/doc-dl-exc
handle doc dl connection errors correctly
2020-12-31 11:24:00 +00:00
David McDonald
56879d0d22 Make sure error message is logged as part of the exception
2020-12-31 11:08:09 +00:00
Chris Hill-Scott
8834377a5d Merge pull request #3074 from alphagov/serialise-process-type
Serialise process_type for template history
2020-12-31 09:54:00 +00:00
Chris Hill-Scott
624bd1d12e Make function-level setup fixture clear cache
This means that anyone adding a new test to this file doesn’t have to
remember to clear the cache in their test, or forget to and have a
hard-to-debug test failure.

Using `setup_function` means we don’t have to convert this module into
using class-based tests.
2020-12-31 09:37:07 +00:00
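
The `setup_function` approach from 624bd1d12e can be shown with a toy module. Everything here is hypothetical (the `_cache` dict, `lookup`, and the test names); only `setup_function` itself is the pytest hook, which pytest runs before each test function in a module.

```python
# A hypothetical test module with a module-level cache.
_cache = {}


def lookup(key):
    """Toy cached lookup: return (value, was_cached)."""
    was_cached = key in _cache
    if not was_cached:
        _cache[key] = key.upper()
    return _cache[key], was_cached


def setup_function(function):
    """pytest calls this before every test function in this module."""
    _cache.clear()


def test_lookup_returns_value():
    assert lookup("a") == ("A", False)


def test_first_lookup_is_never_cached():
    # Without the cache being cleared, this would fail whenever it ran
    # after the test above, because "a" would already be cached.
    assert lookup("a") == ("A", False)
```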
Chris Hill-Scott
55afc9a401 Increase provider lookup cache TTL to 10 seconds
Tested locally with TTL values of:
- 2 seconds
- 5 seconds
- 10 seconds

The benefit really started showing at 10 seconds, where >50% of lookups
hit the cache rather than the database.

For graphs see https://github.com/alphagov/notifications-api/pull/3075#issuecomment-750836404
2020-12-31 09:36:55 +00:00
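
To make the TTL trade-off in 55afc9a401 concrete, here is a minimal standard-library sketch of a time-based cache. It is not the repository's implementation, which may rely on a caching library; `TTLCache`, `get_or_load`, and the injectable `clock` are all names assumed for illustration.

```python
import time


class TTLCache:
    """Cache values for a fixed number of seconds (illustrative sketch)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests need not sleep
        self._store = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        """Return the cached value, calling `loader()` on a miss or after expiry."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no database hit
        value = loader()
        self._store[key] = (now + self._ttl, value)
        return value
```

With a longer TTL, more lookups fall inside the freshness window, which matches the observation above that the benefit only became clear at 10 seconds. The cost is that a change to provider data can take up to the TTL to be seen.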
David McDonald
977554781f Add better logging message for tech failure
So we can easily identify which notification ID failed
2020-12-30 17:28:21 +00:00
David McDonald
2480f91667 Raise better exception on InvalidParameterValue error
There are several reasons why we might get an `InvalidParameterValue`
from the SES API. One, as correctly identified before in
https://github.com/alphagov/notifications-api/pull/713/files
is if we allow an email address on our side that SES rejects.

However, there are other types of errors that could cause an
`InvalidParameterValue`. One example is a `Header too long: 'Subject'`
error that we have seen happen in production. This shouldn't raise an
`InvalidEmailError` as that is not appropriate.

Therefore, we introduce a new exception,
`EmailClientNonRetryableException`, which represents any non-retryable
exception returned by an email client and which we can raise whenever we
get an `InvalidParameterValue` error.

Note, I chose `EmailClientNonRetryableException` rather than
`SESClientNonRetryableException` as our code needs to catch this
exception and it shouldn't be aware of what email client is being used,
it just needs to know that it came from one of the email clients (if in
time we have more than one).

In time, we may wish to extend the approach of having generic
`EmailClient` exceptions and `SMSClient` exceptions as this should be
the most extendable pattern and a good abstraction.
2020-12-30 17:18:16 +00:00
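
The idea of a provider-agnostic exception in 2480f91667 can be sketched as follows. Only `EmailClientNonRetryableException` is named in the commit; the base class `EmailClientException`, the function `translate_provider_error`, and the outcome strings returned by `handle_send` are hypothetical and exist only to show how calling code can stay unaware of which email client raised the error.

```python
class EmailClientException(Exception):
    """Base class for errors from any email client (hypothetical name)."""


class EmailClientNonRetryableException(EmailClientException):
    """The provider rejected the request; retrying the same call will not help."""


def translate_provider_error(code: str, message: str) -> EmailClientException:
    """Map a provider error code to a generic email client exception."""
    if code == "InvalidParameterValue":
        return EmailClientNonRetryableException(message)
    return EmailClientException(message)


def handle_send(send) -> str:
    """Call `send()` and report an outcome without knowing the provider."""
    try:
        send()
    except EmailClientNonRetryableException:
        return "do-not-retry"
    except EmailClientException:
        return "retry"
    return "sent"
```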
David McDonald
2079202160 Stop logging email addresses for SES errors
We shouldn't be logging PII so we should not log email addresses. We
remove the email address and just log the normal exception message.

Note, before this change you could see the email address and more
easily track down the notification ID in the database. Now instead, you
will need to search in the DB for notifications that have gone into
technical failure at the time of the log message (as we still don't
log the notification ID alongside the failure).
2020-12-30 17:18:15 +00:00
David McDonald
6a95925897 Merge pull request #3078 from alphagov/up-memory
Add more memory for the sender and letter workers
2020-12-29 15:50:49 +00:00
Sakis
5e08cc7bc6 Merge pull request #3076 from alphagov/fix-app-logging
Fix app logging
2020-12-24 18:55:58 +02:00
sakisv
1bfdac8417 Temporarily remove disk space check from multi_worker script
There seems to be some kind of complication in this script that doesn't
allow it to terminate properly.

This is being removed for now to allow deploying the rest of the fixes
in time for the holiday period.
2020-12-24 18:44:26 +02:00
Chris Hill-Scott
c64e935168 Merge pull request #3079 from alphagov/pass-language-to-lambda
Pass language through to lambda
2020-12-24 16:06:40 +00:00