notifications-api

mirror of https://github.com/GSA/notifications-api.git synced 2025-12-10 23:32:27 -05:00

Author	SHA1	Message	Date
Leo Hemsted	ac34fb9c05	retry sending broadcasts Retry tasks if they fail to send a broadcast event. Note that each task tries the regular proxy and the failover proxy for that provider. This runs a bit differently than our other retries: Retry with exponential backoff. Our other tasks retry with a fixed delay of 5 minutes between tries. If we can't send a broadcast, we want to try immediately. So instead, implement an exponential backoff (1, 2, 4, 8, ... seconds delay). We can't delay for longer than 310 seconds due to visibility timeout settings in SQS, so cap the delay at that amount. Normally we give up retrying after a set amount of retries (often 4 hours). As broadcast content is much more important than normal notifications, we don't ever want to give up on sending them to phones... ...UNLESS WE DO! Sometimes we do want to give up sending a broadcast though! Broadcasts have an expiry time, when they stop showing up on people's devices, so if that has passed then we don't need to send the broadcast out. Broadcast events can also be superseded by updates or cancels. Check that the event is the most recent event for that broadcast message, if not, give up, as we don't want to accidentally send out two conflicting events for the same message.	2021-02-03 16:43:01 +00:00
David McDonald	a46b8c3bba	Remove redundant comment	2021-02-01 14:10:39 +00:00
David McDonald	2aad3163e6	Allow CBC proxy client to take channel This moves the hardcoding to test channels one step up to where we call `create_and_send_broadcast`. After this, we can start to vary whether we give it the 'test' or 'severe' channel based on the service's channel setting.	2021-02-01 14:10:38 +00:00
David McDonald	a3d966056a	Merge pull request #3110 from alphagov/test-channel Set the default broadcast channel to test	2021-02-01 10:18:35 +00:00
David McDonald	86ea89cf76	Merge pull request #3098 from alphagov/downgrade-to-warning Downgrade SMS provider request exceptions to warnings	2021-01-29 11:52:10 +00:00
David McDonald	f9b1d3d573	Set the default broadcast channel to test This used to be hardcoded in the CBC proxy but now we will hardcode it in the cbc_proxy_client. In a future PR we can start choosing which channel a broadcast will go to based on the channel configured for that broadcast service.	2021-01-27 15:27:11 +00:00
Katie Smith	2681752f15	Rename bt-ee-proxy to ee-proxy We want to rename the `bt-ee-1-proxy` lambda function to `ee-1-proxy`. This change will need to be deployed at the same time that we change the name of the lambda function in the Terraform.	2021-01-26 14:36:20 +00:00
Richard Baker	6256cdf792	Add proxy client for o2 cell broadcasting o2 use One-2-many CBC so we can use the O2M/CAP client. Once differences between CBCs have been worked out we can consolidate O2M clients to reduce duplication. Signed-off-by: Richard Baker <richard.baker@digital.cabinet-office.gov.uk>	2021-01-26 11:11:44 +00:00
David McDonald	ac6837cde5	Downgrade exception to warning for provider API call When we send an HTTP request to our SMS providers, there is a chance we get a 5xx status code back from them. Currently we log this as two different exception level logs. If a provider has a funny few minutes, we could end up with hundreds of exceptions thrown and pagerduty waking someone up in the middle of the night. These problems tend to pretty quickly fix themselves as we balance traffic from one SMS to the other SMS provider within 5 minutes. By downgrading both exceptions to warning in the case of a `SmsClientResponseException`, we will reduce the chance of waking us up in the middle of the night for no reason. If the error is not a `SmsClientResponseException`, then we will still log at the exception level as before as this is more unexpected and we may want to be alerted sooner. What we still want to happen though is that let's say both SMS providers went down at the same time for 1 hour. We don't want our tasks to just sit there, retrying every 5 minutes for the whole time without us being aware (so we can at least raise a statuspage update). Luckily we will still be alerted because our smoke tests will fail after 10 minutes and raise a p1: https://github.com/alphagov/notifications-functional-tests/blob/master/tests/functional/staging_and_prod/notify_api/test_notify_api_sms.py#L21	2021-01-18 17:00:21 +00:00
David McDonald	b9ec70acc2	Fix incorrect log line Should have been `lambda_name` not `self.lambda_name`.	2021-01-15 16:48:04 +00:00
David McDonald	ff193387d1	Add proxy client for Three Three uses the One 2 Many technology so should work in the same way as our proxy for EE	2021-01-14 11:44:46 +00:00
David McDonald	3216a74fbd	Don't try failover for canary No point trying, it's the same lambda. As `_invoke_lambda` currently takes a bytes payload, rather than a json payload, it meant the decision between encoding the payload in the canary or moving the encoding into the `_invoke_lambda` function. We decided to go for the former as the lesser of two evils. We may end up doing the encoding twice in the case of a failover but this avoids us having to put the encoding in our code in several places (for example the canary and also soon to be the link tests).	2021-01-14 11:00:38 +00:00
Pea Tyczynska	b5a33ded98	Retry with failover lambda for FunctionError and status > 299 For all FunctionErrors, and for invoke errors (status > 299) we want to retry with failover lambda. We are doing this, because if there is a connection or other error with one lambda, the failover lambda may still work and it's worth trying. With time, we will probably have more complex retry flow, depending on the error and even maybe differing for each MNO (broadcast provider).	2021-01-14 10:45:29 +00:00
Pea Tyczynska	1aff854afd	Create logs for invoking and finishing lambda, and for retry. Those logs will give us extra visibility into lambda invocation process.	2021-01-12 15:34:48 +00:00
David McDonald	24f52721f3	Retry with second lambda if connection error Note, we assume whenever there is a `FunctionError` that there will be a payload that contains an `errorMessage` key. It's implied in the docs but not stated explicitly. https://docs.aws.amazon.com/lambda/latest/dg/API_Invoke.html#API_Invoke_ResponseSyntax	2021-01-12 15:34:47 +00:00
David McDonald	9da8e54d69	Define failover lambdas We will need a lambda to failover to if the first lambda fails. This isn't so much a case of the lambda itself failing, as it is a cross availability zone resource automatically, it's more in case something in the networking goes down in our AZ and therefore the lambda can't call out to the CBC. In this case, we will be able to swap to using the second AZ by calling the second lambda.	2021-01-12 15:34:47 +00:00
David McDonald	1e537d507b	Make lambda_name abstract property As we require all instances to have it	2021-01-12 15:34:46 +00:00
David McDonald	5a46662c28	Abstract invoking of lambda This is to prepare us for when we try to send/cancel a broadcast and may need to invoke more than one lambda. This might happen if we invoke the first lambda, get an error, and therefore try to invoke a failover/second lambda. Then `_invoke_lambda` will be responsible for the call to AWS whereas `_invoke_lambda_with_failover` will be responsible more for picking the lambda and deciding on retry behaviour in failure cases.	2021-01-12 15:34:46 +00:00
David McDonald	57f5bd76de	Merge pull request #3081 from alphagov/ses-error-logs SES error logs	2020-12-31 13:13:20 +00:00
Leo Hemsted	d470c928cd	Merge pull request #3072 from alphagov/doc-dl-exc handle doc dl connection errors correctly	2020-12-31 11:24:00 +00:00
David McDonald	56879d0d22	Make sure error message is logged as part of the exception	2020-12-31 11:08:09 +00:00
David McDonald	2480f91667	Raise better exception on InvalidParameterValue error There are several reasons why we might get an `InvalidParameterValue` from the SES API. One, as correctly identified before in https://github.com/alphagov/notifications-api/pull/713/files is if we allow an email address on our side that SES rejects. However, there are other types of errors that could cause an `InvalidParameterValue`. One example is a `Header too long: 'Subject'` error that we have seen happen in production. This shouldn't raise an `InvalidEmailError` as that is not appropriate. Therefore, we introduce a new exception `EmailClientNonRetryableException`, that represents any exception back from an email client that we can use whenever we get a `InvalidParameterValue` error. Note, I chose `EmailClientNonRetryableException` rather than `SESClientNonRetryableException` as our code needs to catch this exception and it shouldn't be aware of what email client is being used, it just needs to know that it came from one of the email clients (if in time we have more than one). In time, we may wish to extend the approach of having generic `EmailClient` exceptions and `SMSClient` exceptions as this should be the most extendable pattern and a good abstraction.	2020-12-30 17:18:16 +00:00
David McDonald	2079202160	Stop logging email addresses for SES errors We shouldn't be logging PII so we should not log email addresses. We remove the email address and just log the normal exception message. Note, this meant before that you could see the email address and more easily track down the notification ID in the database. Now instead, you will need to search in the DB for notifications that have gone into technical failure at the time of the log message (as we still don't log the notification ID alongside the failure).	2020-12-30 17:18:15 +00:00
Chris Hill-Scott	9825469613	Make language attributes abstract properties This will make it impossible to create a new client without at least having to define these properties. Which should get someone thinking about language support…	2020-12-24 15:19:46 +00:00
Chris Hill-Scott	c3a1d5c506	Pass language through to lambda If we’re sending non-GSM characters, we need to mark the language in the XML as Welsh (`cy-GB` in CAP, `Welsh` in IBAG). Currently, the CBC proxy checks the content we’re sending, and then uses an approximation based on ASCII to determine whether we’re sending any non-GSM characters, and if so, sets the language appropriately. Instead, we can use functionality from the notifications-utils repo to determine the language. If any non-GSM characters are used, then we can set the language to Welsh. We’ll need to update the proxy to look at this new language flag.	2020-12-24 15:15:32 +00:00
Leo Hemsted	325f271e25	handle doc dl connection errors correctly previously we'd see an error message in the logs: `AttributeError: 'NoneType' object has no attribute 'status_code'` because we were assuming the requests exception would always have a response - it won't have a response if it wasn't able to create a connection at all.	2020-12-23 12:21:24 +00:00
Pea Tyczynska	95deb5a52f	Move DATETIME_FORMAT from app to app.utils To avoid cyclical import issues	2020-12-18 17:39:35 +00:00
Pea Tyczynska	ee833bd65b	Fix cancel broadcast by converting reference date to string Datetime object is not json serializable, we have to convert it to string for the created_at field of previous broadcast provider messages.	2020-12-18 17:22:21 +00:00
Pea Tyczynska	4758d8c4cb	Format message_number for references In IBAG format for broadcasts, we need to give the sequential number of the previous message, and it needs to be formatted as a hex padded with zeroes to be 8 characters long. This commit adds the necessary formatting.	2020-12-14 18:21:28 +00:00
Pea Tyczynska	35a212d907	Add cancel routes to cbc proxy clients Also clean the code up a bit.	2020-12-11 18:52:54 +00:00
Richard Baker	4dd37acecb	Set cbc proxy message_format to cap The CBC proxy lambda expects the message_format parameter to be one of `cap` or `ibag`. Signed-off-by: Richard Baker <richard.baker@digital.cabinet-office.gov.uk>	2020-12-09 17:10:40 +00:00
Pea M. Tyczynska	a70b7c521e	Merge pull request #3053 from alphagov/ibag-message-number Add sequential message number to broadcast provider messages	2020-12-09 13:02:25 +00:00
Pea Tyczynska	8af4b27fd6	Separate functions for cbc clients Also move message_format to the clients.	2020-12-09 11:13:50 +00:00
Pea Tyczynska	553565bc91	Send message format to CBC Either cap or ibag	2020-12-08 11:15:26 +00:00
Leo Hemsted	9502f17d84	flake8 fixes a stricter flake8 bump. mostly things around f strings and format strings, but a couple of bad placeholder names in loops	2020-12-07 15:24:02 +00:00
Pea Tyczynska	932a09fe5b	Pass message_number to proxy clients	2020-12-07 13:13:12 +00:00
Pea Tyczynska	b34bffaae6	Sends sequential number to Vodafone as link test	2020-12-07 13:13:11 +00:00
Leo Hemsted	e2fa0116a0	add CBC_PROXY_ENABLED config flag to control if tasks are triggered previously we made some incorrect assumptions about set-up on staging and prod - they currently don't have any cbc_proxy aws creds at all. We shouldn't be attempting canaries or link tests when there's no AWS infrastructure to connect to. We also shouldn't bother writing a row into the database at all for the broadcast_provider_message since we're not even attempting to send, and we shouldn't get confused between messages that failed and messages we never wanted to send at all.	2020-11-26 10:16:22 +00:00
Leo Hemsted	087cc5053d	separate cbc proxy into separate clients this is a pretty big and convoluted refactor unfortunately. Previously: There was one global `cbc_proxy_client` object in apps. This class has the information about how to invoke the bt-ee lambda, and handles all calls to lambda. This includes calls to the canary too (which is a separate lambda). The future: There's one global `cbc_proxy_client`. This knows about the different provider functions and lambdas, and you'll need to ask this client for a proxy for your chosen provider. call `cbc_proxy_client.get_proxy('ee')` and it'll return you a proxy that knows what ee's lambda function is, how to transform any content in a way that is exclusive to ee, and in future how to parse any response from ee. The present: I also cleaned up some duplicate tests. I'm really not sure about the names of some of these variables - in particular `cbc_proxy_client` isn't a client - it's more of a java style factory, where you call a function on it to get the client of your choice.	2020-11-19 15:50:37 +00:00
Leo Hemsted	0257774cfa	add get_earlier_provider_message fn to broadcast_event replacing get_earlier_provider_messages. The old function returned the previous references for earlier events for a broadcast_message. However, these depend on the message sent to a specific provider, so the function needs to change. It now takes in a provider, and only returns broadcast_provider_messages sent to that provider. If there are earlier broadcast_events without a provider_message for the chosen provider, it raises an exception - you cannot cancel a message if all the previous events have not been created properly (as we wouldn't know what references to cancel).	2020-11-19 15:50:37 +00:00
Leo Hemsted	7cc83e04eb	move BroadcastProvider from models.py to config.py It's not something that is tied to a database table, and was causing circular import issues	2020-11-19 15:50:37 +00:00
Leo Hemsted	bc3512467b	send messages to multiple providers at the moment only EE is enabled (this is set in app.config, but also, only EE have a function defined for them so even if another provider was enabled without changing the dict in cbc_proxy.py we won't trigger anything). this commit just adds wrapper tasks that check what providers are enabled, and invokes the send function for each provider. The send function doesn't currently distinguish between providers for now - as we only have EE set up. in the future we'll want to separate the cbc_proxy_client into separate clients for separate providers. Different providers have different lambda functions, and have different requirements. For example, we know that the two different CBC software solutions handle references to previous messages differently.	2020-11-19 15:50:37 +00:00
Leo Hemsted	b72640bf5e	refactor cbc proxy and fix tests moved the lambda invocation to a separate function to keep DRY. asserts on exception types need to be outside of with blocks, or they won't trip (as the exception will stop execution of the inner with block). the asserts were also the wrong way round so fixed that.	2020-11-17 13:35:04 +00:00
Leo Hemsted	732c203d3e	rename clients to notification_provider_clients i think it's causing havoc with my attempts to mock stuff in the `app.clients` directory because it's also accessible at that path. the name's super vague and doesn't explain what it is anyway	2020-11-17 13:34:58 +00:00
Toby Lorne	dd012d6831	client: cbc_proxy passes through sent/expires A BroadcastEvent knows when an event was sent and should expire We pass through these values directly to the CBC Proxy, because BroadcastEvent knows how they should be formatted Signed-off-by: Toby Lorne <toby.lornewelch-richards@digital.cabinet-office.gov.uk>	2020-10-28 11:37:06 +00:00
Toby Lorne	7542709455	clients: cbc_proxy sends message_type When we ask the CBC Proxy to send a message, we should specify that we want to send a real message, when we want a real message. We will do this by specifying the message_type, which can have 4 values, 3 of which represent a real message: alert (create an alert), update (update an existing alert), cancel (cancel an existing alert), test (send a link test). We will use message_type to represent these values. Signed-off-by: Toby Lorne <toby.lornewelch-richards@digital.cabinet-office.gov.uk> Co-authored-by: Richard <richard.baker@digital.cabinet-office.gov.uk> Co-authored-by: Pea <pea.tyczynska@digital.cabinet-office.gov.uk>	2020-10-27 15:24:02 +00:00
Toby Lorne	052de84c9e	clients: cbc_proxy client has canary method The CBC Proxy is essentially a lambda function which we invoke with various arguments. A way in which this can fail is that the notifications-api app invoking the function may not be able, any longer, to invoke the function. This could be caused by, for example: * an egress restriction preventing access to eu-west-2.lambda.amazonaws.com * a network partition preventing access to eu-west-2.lambda.amazonaws.com * the app's credentials have been rotated or revoked If we invoke a simple "canary" lambda function for which the app should have access to invoke, and check it for failures, we will know quickly if something is likely to be broken. This is especially important for cell broadcasts compared to email/SMS because we always have a baseline of traffic for email/SMS, and so any failure is observed almost immediately. This is not true for CB where we may expect to only see one CB message every week/month/quarter/year, as opposed to every minute or second for email/SMS. Signed-off-by: Toby Lorne <toby.lornewelch-richards@digital.cabinet-office.gov.uk> Co-authored-by: Pea <pea.tyczynska@digital.cabinet-office.gov.uk>	2020-10-26 17:14:08 +00:00
Toby Lorne	aa002afd31	clients: cbc_proxy actions accepts areas param related: https://github.com/alphagov/notifications-broadcasts-infra/pull/23 Signed-off-by: Toby Lorne <toby.lornewelch-richards@digital.cabinet-office.gov.uk>	2020-10-23 17:09:00 +01:00
Toby Lorne	ff1ffc7fba	clients: cbc_proxy lambda client is unabbreviated for code clarity Signed-off-by: Toby Lorne <toby.lornewelch-richards@digital.cabinet-office.gov.uk>	2020-10-22 12:22:11 +01:00
Toby Lorne	adc2ce8283	clients: cbc_proxy has clarifying comments Signed-off-by: Toby Lorne <toby.lornewelch-richards@digital.cabinet-office.gov.uk>	2020-10-22 12:19:25 +01:00
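
The retry rules described in commit ac34fb9c05 (exponential backoff of 1, 2, 4, 8, ... seconds, capped at 310 seconds, giving up only when the broadcast has expired or a newer event exists) can be sketched roughly as below. This is an illustration under stated assumptions, not the repository's code: the names `broadcast_retry_delay`, `should_give_up`, and `MAX_DELAY_SECONDS` are hypothetical, and the real tasks may compute the delay and the checks differently.

```python
from datetime import datetime, timedelta, timezone

# Cap taken from the commit message (SQS visibility timeout limit).
MAX_DELAY_SECONDS = 310


def broadcast_retry_delay(retries, cap=MAX_DELAY_SECONDS):
    """Return the delay in seconds before the next attempt.

    `retries` is the number of retries already made (0 for the first retry),
    so the sequence is 1, 2, 4, 8, ... seconds until it reaches the cap.
    """
    if retries < 0:
        raise ValueError("retries must be non-negative")
    return min(2 ** retries, cap)


def should_give_up(now, expires_at, is_latest_event):
    """Stop retrying if the broadcast has expired or has been superseded.

    `is_latest_event` is a hypothetical flag: True when this event is the
    most recent event for its broadcast message.
    """
    return now > expires_at or not is_latest_event


if __name__ == "__main__":
    # Show the first few delays, including where the cap takes effect.
    print([broadcast_retry_delay(n) for n in range(11)])
```

With these assumptions, the delay doubles until the tenth retry, where 512 seconds would exceed the cap, so every later retry waits 310 seconds.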

315 Commits