Commit Graph

107 Commits

Author SHA1 Message Date
Ben Thorner
44d90b0a4f Remove redundant ternary on SMS client FROM_NUMBER
Logs over the past 14 days confirm we never call this code with
None as the sender, so it's safe to remove the ternary.
2022-04-12 14:59:21 +01:00
Ben Thorner
7d92a0869a Remove per-client SMS exception classes
In response to: [^1].

The stacktrace conveys the same and more information. We don't do
anything different for each exception class, so there's no value
in having three of them over one exception.

I did think about DRYing-up the duplicate exception behaviour into
the base class one. This isn't ideal because the base class would
be making assumptions about how inheriting classes make requests,
which might change with future providers. Although it might be nice
to have more info in the top-level message, we'll still get it in
the stacktrace e.g.

    ValueError: Expected 'code' to be '0'
    During handling of the above exception, another exception occurred:
    app.clients.sms.SmsClientResponseException: SMS client error (Invalid response JSON)

    requests.exceptions.ReadTimeout
    During handling of the above exception, another exception occurred:
    app.clients.sms.SmsClientResponseException: SMS client error (Request failed)

[^1]: https://github.com/alphagov/notifications-api/pull/3493#discussion_r837363717
2022-03-30 13:38:50 +01:00
Ben Thorner
8432be4fc1 Add missing test for invalid Firetext JSON
This is already tested for MMG (and Reach).
2022-03-30 13:38:49 +01:00
Ben Thorner
a2e1d03009 Require "sender" argument to send_sms method
In response to [^1].

[^1]: https://github.com/alphagov/notifications-api/pull/3493#discussion_r836616675
2022-03-30 13:38:48 +01:00
Ben Thorner
015152bab2 Add boilerplate for sending SMS via Reach
This works in conjunction with the new SMS provider stub [^1].

Local testing:

- Run the migrations to add Reach as an inactive provider.
- Activate the Reach provider locally and deactivate the others.

      update provider_details set priority = 100, active = false where notification_type = 'sms';
      update provider_details set active = true where identifier = 'reach';

- Tweak your local environment to point at the SMS stub.

      export REACH_URL="http://host.docker.internal:6300/reach"

- Start / restart Celery to pick up the config change.
- Send a SMS via the Admin app and see the stub log it.
- Reset your environment so you can send normal SMS.

      update provider_details set active = true where notification_type = 'sms';
      update provider_details set active = false where identifier = 'reach';

[^1]: https://github.com/alphagov/notifications-sms-provider-stub/pull/10
2022-03-30 13:38:46 +01:00
Ben Thorner
27ddc4501e DRY-up overriding shortcode with sender
This avoids duplicating the logic when we add a new provider.
2022-03-30 13:37:01 +01:00
Ben Thorner
3b082477f0 DRY-up logging and metrics for sending SMS
This avoids duplicating it as we add a new provider and means we
can test it all in one place (although it wasn't tested before).

I'm not sure why the previous code did "super(..)__init__" in a
non-init function - it's a bit late! - so I've just replaced it
with a call to the new "init_app" function in the parent class.
2022-03-30 13:37:00 +01:00
Ben Thorner
b439fd0718 Add boilerplate for Reach SMS callbacks
This is enough to update a notification in DB:

1. First create a notification in the UI and sent it.

2. Then reset its attributes to pretend it's for Reach.

    update notifications set
      sent_at = null,
      sent_by = null,
      notification_status='sending'
    where id='some-uuid';

3. Change "notification_id" to "<some-uuid>" in the code.

4. Call the boilerplate endpoint for Reach callbacks.

    curl -X POST localhost:6011/notifications/sms/reach

Interestingly there's no foreign key constraint on "sent_by" in the
DB, so this just works: the notification is updated.
2022-03-24 16:56:33 +00:00
Ben Thorner
cdc150de1b Change link test task to trigger both lambda
This modifies the previous "(_)send_link_test" method to trigger a
link test for a specific lambda. We then call the method with both
the primary and failover lambda in new orchestrator method.

Since the _invoke_lambda function doesn't raise exceptions if it
fails, there's no need to rescue anything in order to ensure the
second link test / invocation runs as well. It doesn't testing for
this, since it boils to an absence of code to raise any exception.

Note that, like the other parent tests, we only check the new method
works with a specific proxy client instance.
2021-07-19 16:00:56 +01:00
Ben Thorner
08f48379b4 Move ID generation into link test method
Unlike the other IDs which are stored in the DB, this isn't relevant
for the Celery task as it invokes a link test. Moving it into the
proxy client will also enable us to generate a second ID in the next
commits, where we start doing a link test for the failover lambda.
2021-07-19 16:00:55 +01:00
Ben Thorner
b6774bf0f7 Generate Vodafone link test sequence nos in proxy
Previously the Celery task to trigger a link test had to know about
the special case of a sequence number for Vodafone. Since we're about
to change the client to perform multiple tests it makes sense to give
it the knowledge of how to generate number itself.

Note that we have to import the db inline to avoid a circular import,
since this module is itself imported by app/__init__.py.

Other invocations of the Vodafone client use stored sequence numbers
from the DB, which are called "message numbers" in that context. Since
the two use cases are very different (even the names are different!),
having them in two places shouldn't cause any confusion.
2021-07-19 15:43:36 +01:00
Rebecca Law
f3fdd3b09b Add internation api key for firetext.
We want to start using Firetext for sending international SMS. They
require us to use a different API key for international SMS because it
requires a new code path to switch the sender ID to something that the
country will accept.
This PR does not include switching the sender of international SMS to
Firetext but sets us up to do so.
2021-04-20 13:58:55 +01:00
David McDonald
6d410daae4 Remove the emergency alerts canary
See https://github.com/alphagov/notifications-broadcasts-infra/pull/197
for why we no longer need this and we get to delete some code!
2021-03-26 18:31:53 +00:00
Ben Thorner
a91fde2fda Run auto-correct on app/ and tests/ 2021-03-12 11:45:45 +00:00
Rebecca Law
19f7a6ce38 Refactor method for deciding the failure type 2021-03-10 14:39:55 +00:00
Leo Hemsted
90e82aff3e properly log the lambda response correctly
boto returns a `StreamingBody`[1] response rather than a json struct.
We're currently just logging things like "Error calling lambda
o2-1-proxy with function error <botocore.response.StreamingBody object
at 0x7f74cd6e02e8>" which is obviously less than ideal. Also make the
tests properly reflect this - annoyingly it appears like we can't use
moto to reliably test this interface as the moto `mock_lambda` decorator
needs you to be running inside a docker container??

[1] https://botocore.amazonaws.com/v1/documentation/api/latest/reference/response.html#botocore.response.StreamingBody
2021-02-18 11:51:38 +00:00
Katie Smith
6b8ebb3421 Fix linting errors 2021-02-16 09:03:38 +00:00
Leo Hemsted
62cf9f60a9 catch boto exceptions
these will happen if, for example, you have issues connecting to AWS or
permission issues.

Still failover if we get one of these exceptions, as I think it might be
possible to have a problem only related to one of the lambdas.
2021-02-12 19:48:32 +00:00
Leo Hemsted
4f89be6944 Revert "Merge pull request #3125 from alphagov/revert-retry"
This reverts commit 6b9a50beff, reversing
changes made to 33f93dfea2.
2021-02-09 17:01:04 +00:00
Leo Hemsted
bee0059e53 Revert "Merge pull request #3101 from alphagov/retry-broadcasts"
This reverts commit 1bd99c779d, reversing
changes made to d390eb2cac.
2021-02-08 11:02:34 +00:00
Leo Hemsted
ac34fb9c05 retry sending broadcasts
Retry tasks if they fail to send a broadcast event. Note that each task
tries the regular proxy and the failover proxy for that provider. This
runs a bit differently than our other retries:

Retry with exponential backoff. Our other tasks retry with a fixed delay
of 5 minutes between tries. If we can't send a broadcast, we want to try
immediately. So instead, implement an exponential backoff (1, 2, 4, 8,
... seconds delay). We can't delay for longer than 310 seconds due to
visibility timeout settings in SQS, so cap the delay at that amount.

Normally we give up retrying after a set amount of retries (often 4
hours). As broadcast content is much more important than normal
notifications, we don't ever want to give up on sending them to phones...

...UNLESS WE DO!

Sometimes we do want to give up sending a broadcast though! Broadcasts
have an expiry time, when they stop showing up on peoples devices, so if
that has passed then we don't need to send the broadcast out.

Broadcast events can also be superceded by updates or cancels. Check
that the event is the most recent event for that broadcast message, if
not, give up, as we don't want to accidentally send out two conflicting
events for the same message.
2021-02-03 16:43:01 +00:00
David McDonald
2aad3163e6 Allow CBC proxy client to take channel
This moves the hardcoding to test channels one step up to where we call
`create_and_send_broadcast`

We can then after this, start to differ whether we give it the 'test' or
'severe' channel based on the services channel setting.
2021-02-01 14:10:38 +00:00
David McDonald
a3d966056a Merge pull request #3110 from alphagov/test-channel
Set the default broadcast channel to test
2021-02-01 10:18:35 +00:00
David McDonald
f9b1d3d573 Set the default broadcast channel to test
This used to be hardcoded in the CBC proxy but now we will hardcode it
in the cbc_proxy_client.

In a future PR we can start choosing which channel a broadcast will go
to based on the channel configured for that broadcast service.
2021-01-27 15:27:11 +00:00
Katie Smith
2681752f15 Rename bt-ee-proxy to ee-proxy
We want to rename the `bt-ee-1-proxy` lambda function to `ee-1-proxy`.
This change will need to be deployed at the same time that we change
the name of the lambda function in the Terraform.
2021-01-26 14:36:20 +00:00
Richard Baker
6256cdf792 Add proxy client for o2 cell croadcasting
o2 use One-2-many CBC so we can use the O2M/CAP client.

Once differences between CBCs have been worked out we can consolidate O2M clients to reduce duplication.

Signed-off-by: Richard Baker <richard.baker@digital.cabinet-office.gov.uk>
2021-01-26 11:11:44 +00:00
David McDonald
ff193387d1 Add proxy client for Three
Three uses the One 2 Many technology so should work in the same way as
our proxy for EE
2021-01-14 11:44:46 +00:00
Pea Tyczynska
b5a33ded98 Retry with failover lambda for FunctionError and status > 299
For all FunctionErrors, and for invoke errors (status > 299) we
want to retry with failover lambda.

We are doing this, because if there is a connection or other error
with one lambda, the failover lambda may still work and it's
worth trying.

With time, we will probably have more complex retry flow, depending
on the error and even maybe differing for each MNO (broadcast provider).
2021-01-14 10:45:29 +00:00
Pea Tyczynska
d7661abe81 Move variable used in tests to top of file for more DRY code 2021-01-12 15:34:47 +00:00
David McDonald
24f52721f3 Retry with second lambda if connection error
Note, we assume whenever there is a `FunctionError` that there will be a
payload that contains an `errorMessage` key. It's implied implicitely in
the docs but it's not very explicit.

https://docs.aws.amazon.com/lambda/latest/dg/API_Invoke.html#API_Invoke_ResponseSyntax
2021-01-12 15:34:47 +00:00
David McDonald
57f5bd76de Merge pull request #3081 from alphagov/ses-error-logs
SES error logs
2020-12-31 13:13:20 +00:00
Leo Hemsted
d470c928cd Merge pull request #3072 from alphagov/doc-dl-exc
handle doc dl connection errors correctly
2020-12-31 11:24:00 +00:00
David McDonald
2480f91667 Raise better exception on InvalidParameterValue error
There are several reasons why we might get an `InvalidParameterValue`
from the SES API. One, as correctly identified before in
https://github.com/alphagov/notifications-api/pull/713/files
is if we allow an email address on our side that SES rejects.

However, there are other types of errors that could cause an
`InvalidParameterValue`. One example is a `Header too long: 'Subject'`
error that we have seen happen in production. This shouldn't raise an
`InvalidEmailError` as that is not appropriate.

Therefore, we introduce a new exception
`EmailClientNonRetryableException`, that represents any exception back
from an email client that we can use whenever we get a
`InvalidParameterValue` error.

Note, I chose `EmailClientNonRetryableException` rather than
`SESClientNonRetryableException` as our code needs to catch this
exception and it shouldn't be aware of what email client is being used,
it just needs to know that it came from one of the email clients (if in
time we have more than one).

In time, we may wish to extend the approach of having generic
`EmailClient` exceptions and `SMSClient` exceptions as this should be
the most extendable pattern and a good abstraction.
2020-12-30 17:18:16 +00:00
David McDonald
2079202160 Stop logging email addresses for SES errors
We shouldn't be logging PII so we should not log email addresses. We
remove the email address and just log the normal exception message.

Note, this meant before that you could see the email address and more
easily track down the notification ID in the database. Now instead, you
will need to search in the DB for notifications that have gone into
technical failure at the time of the log message (as we still don't
log the notification ID alongside the failure).
2020-12-30 17:18:15 +00:00
Chris Hill-Scott
c3a1d5c506 Pass language through to lambda
If we’re sending non-GSM characters, we need to mark the language in the
XML as Welsh (`cy-GB` in CAP, `Welsh` in IBAG).

Currently, the CBC proxy checks the content we’re sending, and then uses
an approximation based on ASCII to determine whether we’re sending any
non-GSM characters, and if so, sets the language appropriately.

Instead, we should can functionality from the notifications-utils repo
to determine the language. If any non-GSM characters are used, then the
we can set the language to Welsh.

We’ll need to update the proxy to look at this new language flag.
2020-12-24 15:15:32 +00:00
Leo Hemsted
325f271e25 handle doc dl connection errors correctly
previously we'd see an error message in the logs:
`AttributeError: 'NoneType' object has no attribute 'status_code'`
because we were assuming the requests exception would always have a
response - it won't have a response if it wasn't able to create a
connection at all.
2020-12-23 12:21:24 +00:00
Pea Tyczynska
95deb5a52f Move DATETIME_FORMAT from app to app.utils
To avoid cyclical import issues
2020-12-18 17:39:35 +00:00
Pea Tyczynska
ee833bd65b Fix cancel broadcast by converting reference date to string
Datetime oobject is not json serializable, we have to convert
it to string for the created_at field of previous broadcast
provider messages.
2020-12-18 17:22:21 +00:00
Pea Tyczynska
4758d8c4cb Format message_number for references
In IBAG format for broadcasts, we need to give sequential number
of previous message, and it needs to be formatted as a hex padded
with zeroes to be 8 character long.

This commit adds the necessary formatting.
2020-12-14 18:21:28 +00:00
Pea Tyczynska
35a212d907 Add cancel routes to cbc proxy clients
Also clean the code up a bit.
2020-12-11 18:52:54 +00:00
Richard Baker
4dd37acecb Set cbc proxy message_format to cap
The CBC proxy lambda expects the message_format parameter to be one of `cap` or `ibag`.

Signed-off-by: Richard Baker <richard.baker@digital.cabinet-office.gov.uk>
2020-12-09 17:10:40 +00:00
Pea Tyczynska
8af4b27fd6 Separate functions for cbc clients
Also move message_format to the clients.
2020-12-09 11:13:50 +00:00
Pea Tyczynska
553565bc91 Send message format to CBC
Either cap or ibag
2020-12-08 11:15:26 +00:00
Pea Tyczynska
932a09fe5b Pass message_number to proxy clients 2020-12-07 13:13:12 +00:00
Leo Hemsted
e2fa0116a0 add CBC_PROXY_ENABLED config flag to control if tasks are triggered
previously we made some incorrect assumptions about set-up on staging
and prod - they currently don't have any cbc_proxy aws creds at all.

We shoudn't be attempting canaries or link tests when there's no AWS
infrastructure to connect to.

We also shouldn't bother writing a row into the database at all for the
broadcast_provider_message since we're not even attempting to send, and
we shouldn't get confused between messages that failed and messages we
never wanted to send at all.
2020-11-26 10:16:22 +00:00
Leo Hemsted
087cc5053d separate cbc proxy into separate clients
this is a pretty big and convoluted refactor unfortunately.

Previously:

There was one global `cbc_proxy_client` object in apps. This class has
the information about how to invoke the bt-ee lambda, and handles all
calls to lambda. This includes calls to the canary too (which is a
separate lambda).

The future:

There's one global `cbc_proxy_client`. This knows about the different
provider functions and lambdas, and you'll need to ask this client for a
proxy for your chosen provider. call cbc_proxy_client.get_proxy('ee')`
and it'll return you a proxy that knows what ee's lambda function is,
how to transform any content in a way that is exclusive to ee, and in
future how to parse any response from ee.

The present:

I also cleaned up some duplicate tests.
I'm really not sure about the names of some of these variables - in
particular `cbc_proxy_client` isn't a client - it's more of a java style
factory, where you call a function on it to get the client of your
choice.
2020-11-19 15:50:37 +00:00
Leo Hemsted
b72640bf5e refactor cbc proxy and fix tests
moved the lambda invocation to a separate function to keep DRY

asserts on exception types need to be outside of with blocks, or they
won't trip (as the exception will stop execution of the inner with
block). the asserts were also the wrong way round so fixed that.
2020-11-17 13:35:04 +00:00
Toby Lorne
dd012d6831 client: cbc_proxy passes through sent/expires
A BroadcastEvent knows when an event was sent and should expire

We pass through these values directly to the CBC Proxy, because
BroadcastEvent knows how they should be formatted

Signed-off-by: Toby Lorne <toby.lornewelch-richards@digital.cabinet-office.gov.uk>
2020-10-28 11:37:06 +00:00
Toby Lorne
7542709455 clients: cbc_proxy sends message_type
When we ask the CBC Proxy to send a message, we should specify that we
want to send a real message, when we want a real message

We will do this by specifying the message_type which can have 4 types, 3
of which represent a real message:

| Name   | Effect                   |
| ------ | ------------------------ |
| alert  | Create an alert          |
| update | Update an existing alert |
| cancel | Cancel an existing alert |
| test   | Send a link test         |

We will use message_type to represent the table above

Signed-off-by: Toby Lorne <toby.lornewelch-richards@digital.cabinet-office.gov.uk>
Co-authored-by: Richard <richard.baker@digital.cabinet-office.gov.uk>
Co-authored-by: Pea <pea.tyczynska@digital.cabinet-office.gov.uk>
2020-10-27 15:24:02 +00:00
Toby Lorne
052de84c9e clients: cbc_proxy client has canary method
The CBC Proxy is essentially a lambda function which we invoke with
various arguments.

A way in which this can fail is that the notifications-api app invoking
the function may not be able, any longer, to invoke the function.

This could be caused by, for example:
* an egress restriction preventing access to eu-west-2.lambda.amazonaws.com
* a network partition preventing access to eu-west-2.lambda.amazonaws.com
* the app's credentials have been rotated or revoked

If we invoke a simple "canary" lambda function for which the app should
have access to invoke, and check it for failures, we will know quickly
if something is likely to be broken.

This is especially important for cell broadcasts compared to email/SMS
because we always have a baseline of traffic for email/SMS, and so any
failure is observed almost immediately. This is not true for CB where we
may expect to only see one CB message every week/month/quarter/year, as
opposed to every minute or second for email/SMS.

Signed-off-by: Toby Lorne <toby.lornewelch-richards@digital.cabinet-office.gov.uk>
Co-authored-by: Pea <pea.tyczynska@digital.cabinet-office.gov.uk>
2020-10-26 17:14:08 +00:00