boto returns a `StreamingBody`[1] response rather than a JSON struct.
We're currently just logging things like "Error calling lambda
o2-1-proxy with function error <botocore.response.StreamingBody object
at 0x7f74cd6e02e8>" which is obviously less than ideal. Also make the
tests properly reflect this - annoyingly, it appears that we can't use
moto to reliably test this interface, as the moto `mock_lambda` decorator
needs you to be running inside a Docker container?
[1] https://botocore.amazonaws.com/v1/documentation/api/latest/reference/response.html#botocore.response.StreamingBody
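A minimal sketch of reading that payload, assuming boto3; the function name and payload shape here are illustrative, not the real config:

```python
import json

import boto3

lambda_client = boto3.client('lambda', region_name='eu-west-2')

response = lambda_client.invoke(
    FunctionName='o2-1-proxy',  # illustrative
    Payload=json.dumps({'message_format': 'cap'}).encode('utf-8'),
)

# response['Payload'] is a botocore StreamingBody, not a dict, so
# logging it directly only prints the object repr. Read and decode it
# first so the log line contains the actual error detail.
error_payload = json.loads(response['Payload'].read())

if response.get('FunctionError'):
    print(f'Error calling lambda with function error: {error_payload}')
```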
These will happen if, for example, you have problems connecting to AWS or
permission issues.
Still fail over if we get one of these exceptions, as I think it might be
possible to have a problem only related to one of the lambdas.
Retry tasks if they fail to send a broadcast event. Note that each task
tries the regular proxy and the failover proxy for that provider. This
runs a bit differently from our other retries:
Retry with exponential backoff. Our other tasks retry with a fixed delay
of 5 minutes between tries. If we can't send a broadcast, we want to try
immediately. So instead, implement an exponential backoff (1, 2, 4, 8,
... seconds delay). We can't delay for longer than 310 seconds due to
visibility timeout settings in SQS, so cap the delay at that amount.
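A hedged sketch of that backoff, assuming a Celery task; the task name and body are illustrative:

```python
from celery import Celery

celery_app = Celery('notifications')

MAX_RETRY_DELAY = 310  # capped by the SQS visibility timeout


@celery_app.task(bind=True, max_retries=None)  # max_retries=None: never stop retrying
def send_broadcast_event(self, broadcast_event_id):
    try:
        ...  # try the regular proxy, then the failover proxy
    except Exception as e:
        # 1, 2, 4, 8, ... seconds delay, doubling on each retry up to the cap
        delay = min(2 ** self.request.retries, MAX_RETRY_DELAY)
        self.retry(countdown=delay, exc=e)
```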
Normally we give up retrying after a set number of retries (often 4
hours). As broadcast content is much more important than normal
notifications, we don't ever want to give up on sending them to phones...
...UNLESS WE DO!
Sometimes we do want to give up sending a broadcast though! Broadcasts
have an expiry time, after which they stop showing up on people's devices,
so if that has passed then we don't need to send the broadcast out.
Broadcast events can also be superseded by updates or cancels. Check
that the event is the most recent event for that broadcast message; if
it isn't, give up, as we don't want to accidentally send out two
conflicting events for the same message.
This moves the hardcoded use of test channels one step up, to where we
call `create_and_send_broadcast`.
After this, we can then start to decide whether to give it the 'test' or
'severe' channel based on the service's channel setting.
This used to be hardcoded in the CBC proxy but now we will hardcode it
in the cbc_proxy_client.
In a future PR we can start choosing which channel a broadcast will go
to based on the channel configured for that broadcast service.
We want to rename the `bt-ee-1-proxy` lambda function to `ee-1-proxy`.
This change will need to be deployed at the same time that we change
the name of the lambda function in the Terraform.
O2 uses One-2-many CBC, so we can use the O2M/CAP client.
Once differences between CBCs have been worked out we can consolidate O2M clients to reduce duplication.
Signed-off-by: Richard Baker <richard.baker@digital.cabinet-office.gov.uk>
For all FunctionErrors, and for invoke errors (status > 299), we
want to retry with the failover lambda.
We are doing this because, if there is a connection or other error
with one lambda, the failover lambda may still work and is
worth trying.
In time, we will probably have a more complex retry flow, depending
on the error, and maybe even differing for each MNO (broadcast provider).
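A hedged sketch of that flow, assuming boto3; the helper name is illustrative:

```python
def invoke_with_failover(lambda_client, primary_name, failover_name, payload):
    for function_name in (primary_name, failover_name):
        response = lambda_client.invoke(
            FunctionName=function_name,
            Payload=payload,
        )
        # FunctionErrors and invoke errors (status > 299) both fall
        # through to the failover lambda
        if response.get('FunctionError') or response['StatusCode'] > 299:
            continue
        return response
    raise RuntimeError('both primary and failover lambda invocations failed')
```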
There are several reasons why we might get an `InvalidParameterValue`
from the SES API. One, as correctly identified before in
https://github.com/alphagov/notifications-api/pull/713/files,
is that we allow an email address on our side that SES rejects.
However, there are other types of errors that could cause an
`InvalidParameterValue`. One example is a `Header too long: 'Subject'`
error that we have seen happen in production. This shouldn't raise an
`InvalidEmailError` as that is not appropriate.
Therefore, we introduce a new exception,
`EmailClientNonRetryableException`, which represents any non-retryable
error from an email client, and which we can use whenever we get an
`InvalidParameterValue` error.
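A hedged sketch of raising it, assuming botocore's ClientError; the surrounding send function is illustrative:

```python
from botocore.exceptions import ClientError


class EmailClientNonRetryableException(Exception):
    '''An email-client error that retrying will not fix.'''


def send_email(ses_client, **kwargs):
    try:
        return ses_client.send_raw_email(**kwargs)
    except ClientError as e:
        if e.response['Error']['Code'] == 'InvalidParameterValue':
            # covers rejected addresses as well as errors like
            # "Header too long: 'Subject'"
            raise EmailClientNonRetryableException(e.response['Error']['Message']) from e
        raise
```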
Note, I chose `EmailClientNonRetryableException` rather than
`SESClientNonRetryableException` as our code needs to catch this
exception and shouldn't be aware of what email client is being used;
it just needs to know that it came from one of the email clients (if in
time we have more than one).
In time, we may wish to extend the approach of having generic
`EmailClient` exceptions and `SMSClient` exceptions, as this should be
the most extensible pattern and a good abstraction.
We shouldn't be logging PII, so we should not log email addresses. We
remove the email address and just log the normal exception message.
Note, before this change you could see the email address and so more
easily track down the notification ID in the database. Now, instead, you
will need to search the DB for notifications that went into
technical failure at the time of the log message (as we still don't
log the notification ID alongside the failure).
If we’re sending non-GSM characters, we need to mark the language in the
XML as Welsh (`cy-GB` in CAP, `Welsh` in IBAG).
Currently, the CBC proxy checks the content we’re sending, and then uses
an approximation based on ASCII to determine whether we’re sending any
non-GSM characters, and if so, sets the language appropriately.
Instead, we can use functionality from the notifications-utils repo
to determine the language. If any non-GSM characters are used, then
we can set the language to Welsh.
We’ll need to update the proxy to look at this new language flag.
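A minimal sketch of the check; the `non_gsm_characters` helper and its import path from notifications-utils are assumptions, as is defaulting to English:

```python
from notifications_utils.template import non_gsm_characters  # assumed helper


def cap_language(content):
    # CAP uses IETF language tags; IBAG uses plain 'Welsh'/'English'
    return 'cy-GB' if non_gsm_characters(content) else 'en-GB'
```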
Previously we'd see an error message in the logs:
`AttributeError: 'NoneType' object has no attribute 'status_code'`
because we were assuming the requests exception would always have a
response - it won't have a response if it wasn't able to create a
connection at all.
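A hedged sketch of the guard, assuming the requests library; the function name and logging call are illustrative:

```python
import logging

import requests

logger = logging.getLogger(__name__)


def post_to_cbc(url, body):
    try:
        response = requests.post(url, json=body)
        response.raise_for_status()
        return response
    except requests.RequestException as e:
        # e.response is None if no connection could be made at all,
        # so guard before touching status_code
        status_code = e.response.status_code if e.response is not None else None
        logger.warning('request failed with status code %s', status_code)
        raise
```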
In the IBAG format for broadcasts, we need to give the sequential number
of the previous message, and it needs to be formatted as hex, padded
with zeroes to be 8 characters long.
This commit adds the necessary formatting.
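A minimal sketch of that formatting; the function name is illustrative:

```python
def format_sequential_number(sequential_number):
    # e.g. 42 -> '0000002a': hexadecimal, zero-padded to 8 characters
    return format(sequential_number, '08x')
```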
Previously we made some incorrect assumptions about the set-up on staging
and prod - they currently don't have any cbc_proxy AWS creds at all.
We shouldn't be attempting canaries or link tests when there's no AWS
infrastructure to connect to.
We also shouldn't bother writing a row into the database at all for the
broadcast_provider_message since we're not even attempting to send, and
we shouldn't get confused between messages that failed and messages we
never wanted to send at all.
This is a pretty big and convoluted refactor, unfortunately.
Previously:
There was one global `cbc_proxy_client` object in the app. This class had
the information about how to invoke the bt-ee lambda, and handled all
calls to lambda. This included calls to the canary too (which is a
separate lambda).
The future:
There's one global `cbc_proxy_client`. This knows about the different
provider functions and lambdas, and you'll need to ask this client for a
proxy for your chosen provider: call `cbc_proxy_client.get_proxy('ee')`
and it'll return you a proxy that knows what ee's lambda function is,
how to transform any content in a way that is exclusive to ee, and, in
future, how to parse any response from ee.
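A hedged sketch of that shape; the class names, attributes, and lambda function names here are illustrative, not the real implementation:

```python
class CBCProxyEE:
    lambda_name = 'ee-1-proxy'

    def __init__(self, lambda_client):
        self._lambda_client = lambda_client


class CBCProxyO2:
    lambda_name = 'o2-1-proxy'

    def __init__(self, lambda_client):
        self._lambda_client = lambda_client


class CBCProxyClient:
    # despite the name, this behaves like a factory: you ask it for
    # the proxy of your chosen provider
    def init_app(self, lambda_client):
        self._lambda_client = lambda_client

    def get_proxy(self, provider):
        return {'ee': CBCProxyEE, 'o2': CBCProxyO2}[provider](self._lambda_client)
```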
The present:
I also cleaned up some duplicate tests.
I'm really not sure about the names of some of these variables - in
particular `cbc_proxy_client` isn't a client - it's more of a Java-style
factory, where you call a function on it to get the client of your
choice.
Moved the lambda invocation to a separate function to keep things DRY.
Asserts on exception types need to be outside of `with` blocks, or they
won't trip (as the exception will stop execution of the inner `with`
block). The asserts were also the wrong way round, so fixed that.
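A minimal pytest sketch of the pattern:

```python
import pytest


def test_invoke_raises_on_function_error():
    with pytest.raises(ValueError) as exc_info:  # exception type is illustrative
        raise ValueError('lambda failed')
    # this assert sits outside the with block, so it actually runs
    assert str(exc_info.value) == 'lambda failed'
```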
A BroadcastEvent knows when an event was sent and when it should expire.
We pass these values through directly to the CBC Proxy, because
BroadcastEvent knows how they should be formatted.
Signed-off-by: Toby Lorne <toby.lornewelch-richards@digital.cabinet-office.gov.uk>
When we ask the CBC Proxy to send a message, we should specify whether
we want to send a real message.
We will do this by specifying a message_type, which can have 4 values, 3
of which represent a real message:
| Name | Effect |
| ------ | ------------------------ |
| alert | Create an alert |
| update | Update an existing alert |
| cancel | Cancel an existing alert |
| test | Send a link test |
We will use `message_type` to represent the table above.
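A hedged sketch of those values as an enum; the values come from the table above, the enum itself is illustrative:

```python
from enum import Enum


class MessageType(str, Enum):
    ALERT = 'alert'    # create an alert
    UPDATE = 'update'  # update an existing alert
    CANCEL = 'cancel'  # cancel an existing alert
    TEST = 'test'      # send a link test
```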
Signed-off-by: Toby Lorne <toby.lornewelch-richards@digital.cabinet-office.gov.uk>
Co-authored-by: Richard <richard.baker@digital.cabinet-office.gov.uk>
Co-authored-by: Pea <pea.tyczynska@digital.cabinet-office.gov.uk>
The CBC Proxy is essentially a lambda function which we invoke with
various arguments.
A way in which this can fail is that the notifications-api app may no
longer be able to invoke the function.
This could be caused by, for example:
* an egress restriction preventing access to eu-west-2.lambda.amazonaws.com
* a network partition preventing access to eu-west-2.lambda.amazonaws.com
* the app's credentials have been rotated or revoked
If we invoke a simple "canary" lambda function which the app should
have access to invoke, and check it for failures, we will know quickly
if something is likely to be broken.
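A hedged sketch of such a check, assuming boto3; the canary function name and error handling are illustrative:

```python
def check_canary(lambda_client):
    response = lambda_client.invoke(FunctionName='cbc-proxy-canary')
    if response['StatusCode'] > 299 or response.get('FunctionError'):
        raise RuntimeError('canary lambda invocation failed')
```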
This is especially important for cell broadcasts compared to email/SMS
because we always have a baseline of traffic for email/SMS, and so any
failure is observed almost immediately. This is not true for CB where we
may expect to only see one CB message every week/month/quarter/year, as
opposed to every minute or second for email/SMS.
Signed-off-by: Toby Lorne <toby.lornewelch-richards@digital.cabinet-office.gov.uk>
Co-authored-by: Pea <pea.tyczynska@digital.cabinet-office.gov.uk>
We have hit throttling limits from SES approximately once a week during
a spike of traffic from GOV.UK. The rate limiting usually only lasts a
couple of minutes, but generates enough exceptions to cause a P1 with
no potential action for the responder.
Therefore we downgrade the exception for this case to a warning and assume
traffic will level back out such that the problem resolves itself.
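A hedged sketch of the downgrade, assuming botocore's ClientError; the error-message match and log call are illustrative:

```python
import logging

from botocore.exceptions import ClientError

logger = logging.getLogger(__name__)


def send_via_ses(ses_client, **kwargs):
    try:
        return ses_client.send_raw_email(**kwargs)
    except ClientError as e:
        error = e.response['Error']
        if error['Code'] == 'Throttling' and 'sending rate' in error['Message']:
            # transient per-second rate limiting: warn, don't page anyone
            logger.warning('SES send rate throttled, will retry')
        raise
```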
Note, we will still get exceptions if we go over our daily limit, rather
than our per minute sending limit, which does require immediate action
by someone responding.
If we were to continually go over our per-second sending rate for a long
continuous period of time, then there is a chance we may not be aware,
but given that the risk of this happening is low, I think it's an
acceptable risk for the moment.
Also log the detailed delivery status for firetext in the same place, in
addition to it being logged from notifications_dao.
Logging detailed delivery statuses will help us see why messages
fail to deliver. In the future we could persist detailed delivery
status in the database.
If doc download returns a 403, that's a screw-up on our side. It's not
helpful to a Notify user for that to be passed on. The only thing they
should care about is if it's a 400, because they uploaded a filetype we
don't allow.
Everything else should return a 500 Internal Server Error.
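A hedged sketch of that mapping, assuming a Flask-style abort; the handler name is illustrative:

```python
from flask import abort


def handle_document_download_response(response):
    if response.status_code == 400:
        # the user uploaded a filetype we don't allow: pass it on
        abort(400, response.json())
    else:
        # a 403 (or anything else unexpected) is our screw-up, not
        # the user's, so return a generic 500
        abort(500)
```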
Since pytest 5, `ExceptionInfo` objects (returned by `pytest.raises`)
have the same `str` representation as `repr`. This means that `str(e)`
now needs to be changed to `str(e.value)`.
https://github.com/pytest-dev/pytest/issues/5412
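A minimal sketch of the change:

```python
import pytest

with pytest.raises(ValueError) as e:
    raise ValueError('bad value')

str(e)        # pytest 5+: same as repr(e), includes file/line info
str(e.value)  # the exception message itself: 'bad value'
```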