Commit Graph

90 Commits

Author SHA1 Message Date
Kenneth Kehl
1ecb747c6d reformat 2023-08-29 14:54:30 -07:00
Carlo Costino
d4848a67b5 Switch to using FIPS-enabled endpoints
This changeset switches AWS service touchpoints to use their FIPS-enabled counterparts.  Note that S3 has some specific configuration associated with it.

This changeset also updates our allow ACLs to cover the FIPS-enabled endpoints.  We should investigate removing the non-FIPS endpoints as a part of this.

Signed-off-by: Carlo Costino <carlo.costino@gsa.gov>
2023-08-11 16:24:45 -04:00
Kenneth Kehl
001954538e notify-243 remove statsd 2023-04-25 07:50:56 -07:00
Ryan Ahearn
40ec79e74c Only use service sender value if it is valid for SNS OriginationNumber 2023-03-03 15:40:21 -05:00
Ryan Ahearn
28f8649444 Use sns credentials from VCAP_SERVICES 2023-02-28 16:50:00 -05:00
stvnrlly
e9fdfd59f4 clean flake8 except provider code 2022-10-19 16:16:26 +00:00
Christa Hartsock
af6495cd4c Get tests passing locally
When we cloned the repository and started making modifications, we
didn't initially keep tests in step. This commit tries to get us to a
clean test run by skipping tests that are failing and removing some
that we no longer expect to use (MMG, Firetext), with the intention that
we will come back in future and update or remove them as appropriate.

To find all tests skipped, search for `@pytest.mark.skip(reason="Needs
updating for TTS:`. There will be a brief description of the work that
needs to be done to get them passing, if known. Delete that line to make
them run in a standard test run (`make test`).
2022-07-07 15:41:15 -07:00
Jim Moffet
2bcae20441 initial sms provider cleanup 2022-06-25 13:05:10 -07:00
Jim Moffet
aa4ec532a4 implement SNS 2022-06-17 11:16:23 -07:00
Ben Thorner
c27107fa74 Remove support for Reach provider
This provider was never active and support was never completed, so
there's little value in keeping all this potentially confusing code.
2022-04-29 12:28:08 +01:00
Ben Thorner
44d90b0a4f Remove redundant ternary on SMS client FROM_NUMBER
Logs over the past 14 days confirm we never call this code with
None as the sender, so it's safe to remove the ternary.
2022-04-12 14:59:21 +01:00
Ben Thorner
0d07220923 Fix log for sending SMS to be generic per client
This was a mistake in [^1].

[^1]: 3b082477f0 (diff-95316b574f974237b3b7ce453fec09628bd1062087154789ff4d0176ea23c460R52)
2022-04-05 10:46:14 +01:00
Ben Thorner
e6fffc00da Add temporary log to check if code is in use
In response to: [^1].

[^1]: https://github.com/alphagov/notifications-api/pull/3493#discussion_r838477599
2022-03-31 10:32:18 +01:00
Ben Thorner
7d92a0869a Remove per-client SMS exception classes
In response to: [^1].

The stacktrace conveys the same and more information. We don't do
anything different for each exception class, so there's no value
in having three of them over one exception.

I did think about DRYing-up the duplicate exception behaviour into
the base class one. This isn't ideal because the base class would
be making assumptions about how inheriting classes make requests,
which might change with future providers. Although it might be nice
to have more info in the top-level message, we'll still get it in
the stacktrace e.g.

    ValueError: Expected 'code' to be '0'
    During handling of the above exception, another exception occurred:
    app.clients.sms.SmsClientResponseException: SMS client error (Invalid response JSON)

    requests.exceptions.ReadTimeout
    During handling of the above exception, another exception occurred:
    app.clients.sms.SmsClientResponseException: SMS client error (Request failed)

[^1]: https://github.com/alphagov/notifications-api/pull/3493#discussion_r837363717
2022-03-30 13:38:50 +01:00
Ben Thorner
a2e1d03009 Require "sender" argument to send_sms method
In response to [^1].

[^1]: https://github.com/alphagov/notifications-api/pull/3493#discussion_r836616675
2022-03-30 13:38:48 +01:00
Ben Thorner
015152bab2 Add boilerplate for sending SMS via Reach
This works in conjunction with the new SMS provider stub [^1].

Local testing:

- Run the migrations to add Reach as an inactive provider.
- Activate the Reach provider locally and deactivate the others.

      update provider_details set priority = 100, active = false where notification_type = 'sms';
      update provider_details set active = true where identifier = 'reach';

- Tweak your local environment to point at the SMS stub.

      export REACH_URL="http://host.docker.internal:6300/reach"

- Start / restart Celery to pick up the config change.
- Send a SMS via the Admin app and see the stub log it.
- Reset your environment so you can send normal SMS.

      update provider_details set active = true where notification_type = 'sms';
      update provider_details set active = false where identifier = 'reach';

[^1]: https://github.com/alphagov/notifications-sms-provider-stub/pull/10
2022-03-30 13:38:46 +01:00
Ben Thorner
27ddc4501e DRY-up overriding shortcode with sender
This avoids duplicating the logic when we add a new provider.
2022-03-30 13:37:01 +01:00
Ben Thorner
3b082477f0 DRY-up logging and metrics for sending SMS
This avoids duplicating it as we add a new provider and means we
can test it all in one place (although it wasn't tested before).

I'm not sure why the previous code did "super(..)__init__" in a
non-init function - it's a bit late! - so I've just replaced it
with a call to the new "init_app" function in the parent class.
2022-03-30 13:37:00 +01:00
Ben Thorner
22e055f4d1 DRY-up recording the outcome of SMS sending
This reduces the code to copy when we add a new provider. I don't
think we need to log the URL or status code each time:

- The URL is always the same.
- A "200" status code is implicit in "success".
- Other status codes will be reported as exceptions.

Removing these specific elements means "record_outcome" is generic
and can be de-duplicated in the base class.
2022-03-30 13:36:58 +01:00
Ben Thorner
35f710bdf3 Remove redundant "multi" parameter for MMG client
This is never overridden and can't be used in practie because all
SMS clients have to use the same interface. Removing it will make
it possible to DRY-up some of the code in this method.
2022-03-30 13:36:57 +01:00
Ben Thorner
e6e16a81d0 Simplify getting name of email / sms providers
Previously we used a combination of "provider.name" and "get_name()"
which was confusing. Using a non-property function also gave me the
impression that the name was more dynamic than it actually is.
2022-03-30 13:36:55 +01:00
Ben Thorner
b439fd0718 Add boilerplate for Reach SMS callbacks
This is enough to update a notification in DB:

1. First create a notification in the UI and sent it.

2. Then reset its attributes to pretend it's for Reach.

    update notifications set
      sent_at = null,
      sent_by = null,
      notification_status='sending'
    where id='some-uuid';

3. Change "notification_id" to "<some-uuid>" in the code.

4. Call the boilerplate endpoint for Reach callbacks.

    curl -X POST localhost:6011/notifications/sms/reach

Interestingly there's no foreign key constraint on "sent_by" in the
DB, so this just works: the notification is updated.
2022-03-24 16:56:33 +00:00
Ben Thorner
3eeba0266b Revert "add raw request timings to provider send functions"
This reverts commit f2f2509c9b.
Raw request stats were added to investigate a hunch about a
performance issue we were seeing [1], but turned out not to
be relevant. We don't use them anymore so we can tidy up.

[1]: https://github.com/alphagov/notifications-api/pull/2858
2021-10-28 11:12:18 +01:00
Rebecca Law
f3fdd3b09b Add internation api key for firetext.
We want to start using Firetext for sending international SMS. They
require us to use a different API key for international SMS because it
requires a new code path to switch the sender ID to something that the
country will accept.
This PR does not include switching the sender of international SMS to
Firetext but sets us up to do so.
2021-04-20 13:58:55 +01:00
Ben Thorner
a91fde2fda Run auto-correct on app/ and tests/ 2021-03-12 11:45:45 +00:00
Rebecca Law
19f7a6ce38 Refactor method for deciding the failure type 2021-03-10 14:39:55 +00:00
David McDonald
ac6837cde5 Downgrade exception to warning for provider API call
When we send an HTTP request to our SMS providers, there is a
chance we get a 5xx status code back from them. Currently we log this as
two different exception level logs.

If a provider has a funny few minutes, we could end up with
hundreds of exceptions thrown and pagerduty waking someone up in the
middle of the night. These problems tend to pretty quickly fix
themselves as we balance traffic from one SMS to the other SMS provider
within 5 minutes.

By downgrading both exceptions to warning in the case of a
`SmsClientResponseException`, we will reduce the change of waking us up
in the middle of the night for no reason.

If the error is not a `SmsClientResponseException`, then we will still
log at the exception level as before as this is more unexpected and we
may want to be alerted sooner.

What we still want to happen though is that let's say both SMS providers
went down at the same time for 1 hour. We don't want our tasks to just
sit there, retrying every 5 minutes for the whole time without us being
aware (so we can at least raise a statuspage update). Luckily we will
still be alerted because our smoke tests will fail after 10 minutes and
raise a p1:
https://github.com/alphagov/notifications-functional-tests/blob/master/tests/functional/staging_and_prod/notify_api/test_notify_api_sms.py#L21
2021-01-18 17:00:21 +00:00
Pea M. Tyczynska
9f816ad5f5 Merge pull request #2856 from alphagov/mmg_detailed_response
Capture detailed delivery receipt status from MMG
2020-06-01 15:23:53 +01:00
Pea Tyczynska
c96142ba5e Change function and variable names for readability and consistency 2020-06-01 12:44:49 +01:00
Pea Tyczynska
a4b942cf6c Log detailed sms delivery status for mmg from process_sms_client_response task.
Also log detailed delivery status for firetext in the same place in addition
to it being logged from notifications_dao.

Logging detailed delivery statuses will help us see why messages
fail to deliver. In the future we could persist detailed delivery
status in the database.
2020-06-01 12:44:49 +01:00
Leo Hemsted
f2f2509c9b add raw request timings to provider send functions
we're using statsd to monitor how long provider requests are taking.
However, there's lots of busy work that happens inside our statsd
metrics timing window. Things like json dumping and loading, building
headers, exception handling, etc.

for firetext/mmg, the response object from requests has an elapsed
property [1], which captures from sending raw data to parsing the
response headers. for ses, it's a bit trickier, but boto3 exposes a few
event hooks [2]. it's hard to find them without stepping through the
code, but the interesting ones are before-call, after-call,
after-call-error, request-created, and response-received. The
before-call and after-call involve some marshalling, built-in retrying,
etc, while request-created and response-received are much lower level.
They might be called more than once per ses request, if boto3 itself
retries the request on 5xx, 429 and low level socket errors [3].

Add these as new `raw-request-time` metrics rather than overwriting to
avoid changing the meaning of an existing metric, and to let us compare
the metrics to see if there's a noticeable difference at all

[1] https://requests.readthedocs.io/en/master/api/#requests.Response.elapsed
[2] https://boto3.amazonaws.com/v1/documentation/api/latest/guide/events.html
[3] https://boto3.amazonaws.com/v1/documentation/api/latest/guide/retries.html#legacy-retry-mode
2020-05-29 14:04:46 +01:00
Pea Tyczynska
9782cbc5b5 Check failure code on failure only.
We should check error code on failure only, because if we get an
error code on pending code, we should still set the notification
to pending status.
2020-04-21 13:43:45 +01:00
Pea Tyczynska
b962dcac5d Check Firetext code first to determine status and log the reason
If there is no code, fall back to old way of determining status.
If the code is not recognised, log an Error so we are notified
and can look into it.
2020-04-20 18:33:19 +01:00
Pea Tyczynska
91fe68eed4 WIP read firetext response codes 2020-04-17 10:58:25 +01:00
Leo Hemsted
e094dd4bfd remove loadtesting from providers
we don't use it since we wrote our own provider stubs for performance
tests.

this removes it from the api - it's still in the DB and will be
retrieved by queries, but is set to disabled on prod
2019-10-23 11:45:07 +01:00
Alexey Bezhan
330afab5e2 Make Firetext URL configurable through the application environment
Similar to MMG, there's a new env variable FIRETEXT_URL that can be
used to override the Firetext api URL.

This will be used to stub out both providers during the load test or
can be used to run a local API against a fake provider endpoint.
2019-04-12 12:03:58 +01:00
Leo Hemsted
267c4fc07b bump requirements, fix pyflake8 things, unpin botocore/awscli 2018-11-07 13:39:08 +00:00
Chris Hill-Scott
cefa253578 Remove monotonic
> On Python 3.3 or newer, monotonic will be an alias of time.monotonic
> from the standard library. On older versions, it will fall back to an
> equivalent implementation.

– https://pypi.org/project/monotonic/
2018-05-04 10:56:51 +01:00
Rebecca Law
b2dfa59b1b We are not handling the case of an unknown status code sent by the SMS provider. This PR attempts to fix that.
- If the SMS client sends a status code that we do not recognize raise a ClientException and set the notification status to technical-failure
- Simplified the code in process_client_response, using a simple map.
2018-02-07 17:49:15 +00:00
Martyn Inglis
72ecca4dab Added a 60 second timeout to the MMG and fire text clients.
- This is a CONNECT and a READ timeout.
- Gets wrapped in the standard client exception, with a status code of 504, message Gateway timeout.
- Is quiet noisy in logs to allow us to see it
- Ensures we flick across the provider.

To test change the timeout to 0 and it will timeout.
2017-04-05 15:31:37 +01:00
Martyn Inglis
57bd6494e1 WIP - adding request timeout 2017-04-04 16:05:25 +01:00
Martyn Inglis
2d946736e0 Log notification ID on deliver tasks and in clients.
- help tie things together  in Kibana.
2016-12-20 13:24:08 +00:00
Martyn Inglis
6f7dd149e2 Tidied up pull request comments 2016-09-23 15:59:23 +01:00
Martyn Inglis
124487f204 Fix from number on Load testing client 2016-09-23 10:31:18 +01:00
Martyn Inglis
54c28a4bea Updated logging as logger was parsing some JSON and erroring.
- Don't write whole JSON response. Safer not to anyway. No need.
2016-09-23 10:24:12 +01:00
Martyn Inglis
376f8355cb Updated clients to have a more robust error handling
- fire text and omg much more similar. Ready to be combined.
- Error handling now for JSON valid responses
2016-09-22 17:18:05 +01:00
Rebecca Law
3d41fa9634 Update mmg status code 2 to be a permanent failure. 2016-09-01 10:23:37 +01:00
Leo Hemsted
26d7675baa pep8 fixes
no idea why the build/local pep8s weren't picking them up before.

also excluded import order pep8
2016-08-23 12:05:47 +01:00
Martyn Inglis
fe54fa9f73 Final pass through existing statsd endpoints to ensure they match new naming strategy.
Updates accordingly.
2016-08-08 11:23:58 +01:00
Rebecca Law
c8ad5362eb Moved mmg_url to configs.
Fix up the tests
2016-07-20 10:40:50 +01:00