Commit Graph

7880 Commits

Author SHA1 Message Date
David McDonald
6340fed02a Merge pull request #3047 from alphagov/sending-times-broken-down
Tweak sending time metrics to only include live notifications
2020-11-30 15:31:11 +00:00
David McDonald
988c3a335a Remove old metric for delivery sending times
We no longer need a metric that covers both test and live keys as this
is not useful
2020-11-30 15:12:13 +00:00
David McDonald
b1336c97a4 Tweak sending time metrics to only include live notifications
Changes the high volume and not high volume metrics to both only include
non test notifications. This is because when looking at the grafana
metrics, it was impossible to tell how much of the difference came from
the high volume/non high volume split and how much from the test/live
notification split.

This leaves us with no break down of high volume/not high volume sending
times for test notifications but I don't think we really need that.
2020-11-30 15:05:40 +00:00
David McDonald
b8b99b607b Merge pull request #3046 from alphagov/sending-times-broken-down
Break down how long it takes to send a notification
2020-11-30 13:35:17 +00:00
David McDonald
2110bc3eef Break down how long it takes to send a notification
We currently measure the sending time for all notifications. This commit then breaks
it down into
- test keys and non test keys
- high volume services and non high volume services

Breaking it down into test keys and non test keys is important because
we don't care as much about sending test notifications within 10
seconds as we do for non test keys, so we don't want our graphs to
reflect poor performance if it's just test keys affecting this

Breaking it down into high volume and non high volume will allow us to
easily debug issues with slow sending if they are high volume or non
high volume issues
2020-11-30 12:07:32 +00:00
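The breakdown described in 2110bc3eef can be sketched as below. This is a minimal illustration, not the code from the commit: the function name and the metric name format are assumptions made for the example.

```python
def sending_time_metric_name(is_test_key: bool, is_high_volume: bool) -> str:
    """Build a hypothetical metric name for one slice of sending time.

    The two flags mirror the two breakdowns in the commit message:
    test vs non test keys, and high volume vs non high volume services.
    """
    key_part = "test-key" if is_test_key else "live-key"
    volume_part = "high-volume" if is_high_volume else "not-high-volume"
    return f"sending-time.{key_part}.{volume_part}"
```

Keeping the two dimensions in separate parts of the name means a graph can filter on either one without the other muddying the result.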
Chris Hill-Scott
09a26ea4a9 Merge pull request #3045 from alphagov/bump-utils-43.5.2
Bump utils to 43.5.2
2020-11-30 11:31:47 +00:00
Chris Hill-Scott
a497788740 Bump utils to 43.5.2
Brings in:
- [x] https://github.com/alphagov/notifications-utils/pull/808/files

Changes:
- https://github.com/alphagov/notifications-utils/compare/43.5.1...43.5.2
2020-11-30 10:52:48 +00:00
Leo Hemsted
a6426e7785 Merge pull request #3044 from alphagov/cbc-proxy-enabled-flag
add CBC_PROXY_ENABLED config flag to control if tasks are triggered
2020-11-26 11:56:41 +00:00
Leo Hemsted
e2fa0116a0 add CBC_PROXY_ENABLED config flag to control if tasks are triggered
previously we made some incorrect assumptions about set-up on staging
and prod - they currently don't have any cbc_proxy aws creds at all.

We shouldn't be attempting canaries or link tests when there's no AWS
infrastructure to connect to.

We also shouldn't bother writing a row into the database at all for the
broadcast_provider_message since we're not even attempting to send, and
we shouldn't get confused between messages that failed and messages we
never wanted to send at all.
2020-11-26 10:16:22 +00:00
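A flag like the one added in e2fa0116a0 might gate a task roughly as follows. This is a sketch under assumptions: only the `CBC_PROXY_ENABLED` name comes from the commit, while the function name and the `send` callable are hypothetical.

```python
from typing import Callable, Mapping


def trigger_link_test(config: Mapping[str, object], send: Callable[[], None]) -> bool:
    """Run the link test only when the proxy is enabled.

    Returns True if the send was attempted, False if it was skipped.
    `send` stands in for whatever actually talks to the AWS infrastructure.
    """
    if not config.get("CBC_PROXY_ENABLED", False):
        # Nothing is attempted, so nothing should be recorded either.
        return False
    send()
    return True
```

Returning early before any work is done keeps "never wanted to send" distinct from "tried and failed".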
Leo Hemsted
54fecf2182 Merge pull request #3035 from alphagov/broadcast-event-response
Send broadcast events per provider
2020-11-25 10:16:30 +00:00
David McDonald
928b62e079 Merge pull request #3042 from alphagov/revert-3041-turn-on-sms-email-stubs-staging
Revert "Turn on SMS and email stubs on staging"
2020-11-24 14:30:40 +00:00
David McDonald
09759e1fe4 Revert "Turn on SMS and email stubs on staging"
2020-11-24 12:12:53 +00:00
Pea M. Tyczynska
fefcb009d5 Merge pull request #3041 from alphagov/turn-on-sms-email-stubs-staging
Turn on SMS and email stubs on staging
2020-11-23 12:02:15 +00:00
Pea Tyczynska
d57b99e307 Turn on SMS and email stubs on staging
This is done because we will be load testing on staging.
2020-11-23 11:49:16 +00:00
David McDonald
665f6dcaed Merge pull request #3040 from alphagov/ses-bounce-reason
Add notification ID to SES bounce reason
2020-11-23 09:45:44 +00:00
David McDonald
43f1f48093 Add notification ID to SES bounce reason
At the moment we log every time we get a bounce from SES, however we
don't link it to a particular notification so it's hard to know for what
sub reason a notification did not deliver by looking at the logs.

This commit changes this by now looking up the bounce reason after we have
found the notification ID and including them together. So if you now
search for a notification ID in Kibana, you will see full logs for why
it failed to deliver.
2020-11-20 14:10:13 +00:00
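The idea in 43f1f48093 is to put the notification ID and the bounce reason in one log line. A minimal sketch, assuming the bounce type and sub type have already been read from the SES payload; the function name and message wording are hypothetical.

```python
def bounce_log_line(notification_id: str, bounce_type: str, bounce_sub_type: str) -> str:
    """Format one log line that ties a bounce reason to a notification.

    Searching the logs for the notification ID then surfaces the reason
    in the same entry, instead of in an unrelated line.
    """
    return (
        f"SES bounce for notification ID {notification_id}: "
        f"{bounce_type}/{bounce_sub_type}"
    )
```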
Leo Hemsted
087cc5053d separate cbc proxy into separate clients
this is a pretty big and convoluted refactor unfortunately.

Previously:

There was one global `cbc_proxy_client` object in apps. This class has
the information about how to invoke the bt-ee lambda, and handles all
calls to lambda. This includes calls to the canary too (which is a
separate lambda).

The future:

There's one global `cbc_proxy_client`. This knows about the different
provider functions and lambdas, and you'll need to ask this client for a
proxy for your chosen provider. Call `cbc_proxy_client.get_proxy('ee')`
and it'll return you a proxy that knows what ee's lambda function is,
how to transform any content in a way that is exclusive to ee, and in
future how to parse any response from ee.

The present:

I also cleaned up some duplicate tests.
I'm really not sure about the names of some of these variables - in
particular `cbc_proxy_client` isn't a client - it's more of a java style
factory, where you call a function on it to get the client of your
choice.
2020-11-19 15:50:37 +00:00
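The factory shape described in 087cc5053d could look something like the sketch below. Only `get_proxy('ee')` is taken from the commit message; the class names, the `invoke` callable, the payload fields and the lambda name are all assumptions for illustration.

```python
class EEProxy:
    """Hypothetical proxy that knows EE's lambda function."""

    lambda_name = "ee-proxy"  # assumed name, for illustration only

    def __init__(self, invoke):
        # invoke is a callable(lambda_name, payload) that performs the call
        self._invoke = invoke

    def send(self, identifier, content):
        payload = {"identifier": identifier, "content": content}
        return self._invoke(self.lambda_name, payload)


class CBCProxyFactory:
    """Hands out the proxy for a chosen provider."""

    def __init__(self, invoke):
        self._proxies = {"ee": EEProxy(invoke)}

    def get_proxy(self, provider):
        try:
            return self._proxies[provider]
        except KeyError:
            raise ValueError(f"no proxy defined for provider {provider!r}") from None
```

Usage would be along the lines of `factory.get_proxy("ee").send(identifier, content)`, with provider-specific formatting kept inside each proxy class.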
Leo Hemsted
0257774cfa add get_earlier_provider_message fn to broadcast_event
replacing get_earlier_provider_messages. The old function returned the
previous references for earlier events for a broadcast_message. However,
these depend on the message sent to a specific provider, so the function
needs to change. It now takes in a provider, and only returns
broadcast_provider_messages sent to that provider. If there are earlier
broadcast_events without a provider_message for the chosen provider, it
raises an exception - you cannot cancel a message if all the previous
events have not been created properly (as we wouldn't know what
references to cancel).
2020-11-19 15:50:37 +00:00
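The behaviour described in 0257774cfa, returning only one provider's earlier messages and raising if any earlier event lacks one, can be sketched as below. The `Event` class and its fields are hypothetical stand-ins for the real models.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """Hypothetical stand-in for a broadcast event."""

    id: str
    # maps provider name -> reference of the message sent to that provider
    provider_messages: dict = field(default_factory=dict)


def earlier_provider_messages(earlier_events, provider):
    """Collect the references sent to `provider` for every earlier event."""
    references = []
    for event in earlier_events:
        if provider not in event.provider_messages:
            # Without a reference we would not know what to cancel.
            raise ValueError(f"event {event.id} has no message for {provider}")
        references.append(event.provider_messages[provider])
    return references
```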
Leo Hemsted
f12c949ae9 create broadcast_provider_message and use id from that instead
(instead of using the id from broadcast_event)

we need every XML blob we send to have a different ID. if we're sending
different XML blobs for each provider, then each one should have a
different identifier. So, instead of taking the identifier from the
broadcast_event, take it from the broadcast_provider_message instead.

Note: We're still going to the broadcast_event for most fields, to
ensure they stay consistent between different providers. The last thing
we want is for different phone networks to get different content
2020-11-19 15:50:37 +00:00
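The split described in f12c949ae9, one identifier per provider message but shared fields from the event, might be modelled like this. The class and field names are assumptions, not the real schema.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BroadcastEvent:
    """Hypothetical event holding the fields shared by all providers."""

    content: str


@dataclass
class BroadcastProviderMessage:
    """Hypothetical per-provider record with its own identifier."""

    event: BroadcastEvent
    provider: str
    id: uuid.UUID = field(default_factory=uuid.uuid4)


def build_payload(message: BroadcastProviderMessage) -> dict:
    # The identifier comes from the provider message, the content from the event.
    return {"identifier": str(message.id), "content": message.event.content}
```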
Leo Hemsted
7cc83e04eb move BroadcastProvider from models.py to config.py
It's not something that is tied to a database table, and was causing
circular import issues
2020-11-19 15:50:37 +00:00
Leo Hemsted
bc3512467b send messages to multiple providers
at the moment only EE is enabled (this is set in app.config, but also,
only EE have a function defined for them so even if another provider was
enabled without changing the dict in cbc_proxy.py we won't trigger
anything). this commit just adds wrapper tasks that check what providers
are enabled, and invokes the send function for each provider.

The send function doesn't currently distinguish between providers for
now - as we only have EE set up. in the future we'll want to separate
the cbc_proxy_client into separate clients for separate providers.
Different providers have different lambda functions, and have different
requirements. For example, we know that the two different CBC software
solutions handle references to previous messages differently.
2020-11-19 15:50:37 +00:00
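The wrapper behaviour in bc3512467b can be sketched as a loop over enabled providers that skips any provider without a send function. The function and parameter names here are hypothetical.

```python
def send_to_enabled_providers(enabled_providers, send_functions, event_id):
    """Invoke the send function for each enabled provider.

    Returns the list of providers that were actually triggered.
    """
    triggered = []
    for provider in enabled_providers:
        send = send_functions.get(provider)
        if send is None:
            # Enabled in config but no function defined: nothing is triggered.
            continue
        send(event_id)
        triggered.append(provider)
    return triggered
```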
Leo Hemsted
2e665de46d add broadcast provider message table to DB
we need to track the state of sending to different providers separately
(and trigger them off separately, refer to references separately, etc)
2020-11-19 15:50:37 +00:00
Leo Hemsted
3aa602bd6b Merge pull request #3036 from alphagov/cbc-proxy-refactor
Cbc proxy refactor
2020-11-19 15:50:02 +00:00
Tom Byers
a35199a17a Merge pull request #3038 from alphagov/bump-utils
Bump utils to 43.5.1
2020-11-19 13:20:15 +00:00
Pea M. Tyczynska
ffb19346e0 Merge pull request #3039 from alphagov/give-providers-equal-shares-of-traffic
Give sms providers equal shares of traffic
2020-11-19 10:48:56 +00:00
Pea Tyczynska
60bd9a6f82 Give providers equal shares of traffic
This is done on a temporary basis for billing-related reasons.
2020-11-19 10:28:42 +00:00
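One common way to share traffic between providers is a weighted random choice, where equal weights give equal shares. The sketch below assumes that approach; the commit does not say how the shares are applied, and the function and provider names are hypothetical.

```python
import random


def choose_provider(weights, rng=random):
    """Pick one provider at random, in proportion to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[name] for name in names], k=1)[0]
```

With `{"provider_a": 50, "provider_b": 50}` each provider would be picked about half the time over many sends.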
Tom Byers
2021555a07 Bump utils to 43.5.1
Brings in
https://github.com/alphagov/notifications-utils/pull/807
which removes a hack we had in our email template
to deal with this bug:

https://www.pivotaltracker.com/story/show/161183433

This behaviour is no longer happening so this
removes the hack.

See
https://www.pivotaltracker.com/story/show/161183433/comments/219297211
for evidence of the change in behaviour.
2020-11-18 16:43:22 +00:00
David McDonald
13450c8429 Merge pull request #3032 from alphagov/4xx-callbacks
Log when we don't retry a callback
2020-11-18 12:25:38 +00:00
Leo Hemsted
b72640bf5e refactor cbc proxy and fix tests
moved the lambda invocation to a separate function to keep DRY

asserts on exception types need to be outside of with blocks, or they
won't trip (as the exception will stop execution of the inner with
block). the asserts were also the wrong way round so fixed that.
2020-11-17 13:35:04 +00:00
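The point in b72640bf5e about asserts inside `with` blocks can be shown with a short example. This sketch uses the standard library's `assertRaises` so it runs without extra dependencies; the function names are hypothetical and the original tests may well use a different test framework.

```python
import unittest


def invoke_lambda(payload):
    """Hypothetical call that always fails, standing in for a lambda invocation."""
    raise ValueError("lambda rejected payload")


def check_invoke_fails():
    case = unittest.TestCase()
    with case.assertRaises(ValueError) as caught:
        invoke_lambda({})
        # Anything written here never runs: the exception has already
        # stopped execution of the block.
    # The assertion lives outside the block so it is actually evaluated.
    assert str(caught.exception) == "lambda rejected payload"
    return True
```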
Leo Hemsted
732c203d3e rename clients to notification_provider_clients
i think it's causing havoc with my attempts to mock stuff in the
`app.clients` directory because it's also accessible at that path. the
name's super vague and doesn't explain what it is anyway
2020-11-17 13:34:58 +00:00
David McDonald
224d9bf35a Log when we don't retry a callback
We don't retry a callback when it receives a 4xx status. We should
probably be aware of this happening and at the moment there is nothing
in our logs to easily identify whether the request failed and is being
retried or if it failed and is not being retried. This will enable us to
search our logs easily and figure out how much it's happening.

It's quite likely that we should in the future allow callbacks to retry
if they get a 429 http response (rate limiting) but we should do this in
a smart way (exponential backoff) and so this is a first step to being
aware of how big a problem it is in case we want to do something about
it.
2020-11-17 11:26:32 +00:00
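The rule in 224d9bf35a, together with the exponential backoff it mentions as a possible future step, can be sketched as below. The function names, log wording, base delay and cap are all assumptions; the commit itself only adds logging.

```python
import logging

logger = logging.getLogger(__name__)


def should_retry_failed_callback(status_code, notification_id):
    """Decide whether a failed callback is retried, logging when it is not."""
    if 400 <= status_code < 500:
        logger.warning(
            "Not retrying callback for %s: got HTTP %s", notification_id, status_code
        )
        return False
    return True


def backoff_delay(attempt, base_seconds=5, cap_seconds=300):
    """Exponential backoff: double the delay each attempt, up to a cap."""
    return min(cap_seconds, base_seconds * 2 ** attempt)
```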
Katie Smith
5924de9c14 Merge pull request #3034 from alphagov/bump-utils-43.5.0
Bump notifications-utils to 43.5.0
2020-11-17 10:16:21 +00:00
Katie Smith
47e427d0a9 Bump notifications-utils to 43.5.0
This version changes the `.fragment_count` method of the
`BaseSMSTemplate` class to take extended GSM characters into account.
2020-11-17 09:46:26 +00:00
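In the GSM 7-bit alphabet, characters from the extension table take two character slots each, which is why they affect the fragment count. The sketch below uses the standard 160 (single message) and 153 (per part of a multipart message) limits and assumes the content contains only GSM characters; it is not the `notifications-utils` implementation, and the function names are hypothetical.

```python
import math

# Printable characters from the GSM 03.38 extension table.
EXTENDED_GSM = set("^{}\\[]~|€")


def gsm_length(content):
    """Count character slots, with extended characters costing two each."""
    return sum(2 if char in EXTENDED_GSM else 1 for char in content)


def fragment_count(content):
    length = gsm_length(content)
    if length <= 160:
        return 1
    return math.ceil(length / 153)
```

For example, 159 ordinary characters followed by a single `€` need 161 slots, so the message no longer fits in one fragment.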
Rebecca Law
71f3cf5948 Merge pull request #3027 from alphagov/check-content-length-after-doc-download-complete
Change how we validate the length of templates.
2020-11-16 14:15:13 +00:00
Rebecca Law
2e114b7404 Bump utils requirement
2020-11-16 14:04:37 +00:00
Rebecca Law
171bc74c69 Rename check_character_count method to check_is_message_too_long.
Add different error message for email and text if content is too long.
Use utils version with is_message_too_long method implemented for email templates.
2020-11-09 16:06:57 +00:00
Rebecca Law
5bacfc1df9 Change how we validate the length of templates.
We want to add validation for an email that's too long so that the user knows why the message is failing. At the moment, if an email is too long it gets a technical failure after the retries fail. With this change the email post will get a validation error instead.

Once this: https://github.com/alphagov/notifications-utils/pull/804 is reverted, we can update the utils version.
2020-11-09 15:54:39 +00:00
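Validating length up front, with a different message for email and text as in 171bc74c69, might look like the sketch below. The exception class, function name, error wording and the way length is measured are all assumptions; the real limits live in `notifications-utils`.

```python
class ContentTooLong(ValueError):
    """Hypothetical validation error raised before any send is attempted."""


def validate_content_length(template_type, content, limit):
    """Raise a validation error if the content is longer than `limit`."""
    if len(content) <= limit:
        return
    if template_type == "email":
        raise ContentTooLong(f"Email content is longer than {limit} characters")
    raise ContentTooLong(f"Text message content is longer than {limit} characters")
```

Failing at request time gives the caller a clear reason, rather than a technical failure after retries.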
Toby Lorne
918bf6d97c Merge pull request #3030 from alphagov/revert-3029-sslmode
Revert "Specify sslmode in Cloud Foundry environment variables"
2020-11-09 12:20:12 +00:00
Toby Lorne
31f845bbff Revert "Specify sslmode in Cloud Foundry environment variables"
2020-11-09 12:04:40 +00:00
Toby Lorne
eb072ce5ba Merge pull request #3029 from alphagov/sslmode
Specify sslmode in Cloud Foundry environment variables
2020-11-09 11:41:54 +00:00
Toby Lorne
cfcc3128c2 db: specify sslmode in Cloud Foundry env
Refer to
https://www.postgresql.org/docs/11/libpq-connect.html#LIBPQ-CONNECT-SSLMODE

GOV.UK PaaS gives us the database URI, and we use the default mode of
postgres auth which prefers a TLS connection instead of a plain TCP
connection

We are now specifying the SSL mode in the URI when establishing our
connection to the database, so that:

* We will not connect to the database via a plaintext connection
* We will verify the database connection against a list of trusted CAs

The RDS CA from which the database's certificate is issued is added into
the Cloud Foundry app container via
925681f19b/manifests/cf-manifest/operations.d/350-diego-cell.yml (L17-L22)

Signed-off-by: Toby Lorne <toby.lornewelch-richards@digital.cabinet-office.gov.uk>
Co-authored-by: David <david.mcdonald@digital.cabinet-office.gov.uk>
2020-11-09 10:47:44 +00:00
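Specifying the SSL mode in the URI, as cfcc3128c2 describes, can be sketched with the standard library. The function name is hypothetical, and `verify-full` is an assumed default: the commit message does not say which verifying mode was chosen.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_sslmode(uri, mode="verify-full"):
    """Return the database URI with `sslmode` set in its query string."""
    parts = urlsplit(uri)
    query = dict(parse_qsl(parts.query))
    query["sslmode"] = mode  # overrides any existing value
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Any existing query parameters on the URI are preserved, and only `sslmode` is added or replaced.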
Toby Lorne
29b6ed427d Merge pull request #3028 from alphagov/link-test-less-often
celery: link test less often
2020-11-06 14:04:49 +00:00
Toby Lorne
00a1ba4b41 celery: link test less often
This is causing the disk of the CBCs to fill up quickly, and their
logrotate seems a bit flakey

Reducing the rate will ensure the disks fill up less often

Signed-off-by: Toby Lorne <toby.lornewelch-richards@digital.cabinet-office.gov.uk>
2020-11-06 13:37:39 +00:00
Katie Smith
20b329e50c Merge pull request #3026 from alphagov/pyup-scheduled-update-2020-11-04
Scheduled weekly dependency update for week 44
2020-11-06 13:02:19 +00:00
Katie Smith
4582433fdf Freeze requirements
2020-11-06 10:56:52 +00:00
pyup-bot
b2fe9d0a6f Update sqlalchemy from 1.3.19 to 1.3.20
2020-11-06 10:51:09 +00:00
pyup-bot
87a093845a Update psycopg2-binary from 2.8.5 to 2.8.6
2020-11-06 10:51:09 +00:00
pyup-bot
a3ba4b3e3d Update iso8601 from 0.1.12 to 0.1.13
2020-11-06 10:51:09 +00:00
pyup-bot
73ca396504 Update eventlet from 0.27.0 to 0.29.1
2020-11-06 10:51:09 +00:00
pyup-bot
8cb1797ec4 Update cffi from 1.14.2 to 1.14.3
2020-11-04 13:00:02 +00:00