Commit Graph

8124 Commits

Author SHA1 Message Date
Katie Smith
e737217c3f Merge pull request #3208 from alphagov/check-bcast-service-live
Check if service is in live mode before sending a broadcast
2021-04-16 09:11:53 +01:00
Ben Thorner
c026d16c05 Merge pull request #3210 from alphagov/fix-apply-async
Fix apply_async not working with positional kwargs
2021-04-15 17:31:16 +01:00
Ben Thorner
be02573147 Fix apply_async not working with positional kwargs
Celery's apply_async function accepts 'kwargs' as (get ready to be
confused) either a positional argument, or a keyword argument:

Positional: apply_async(['args'], {'kw': 'args'})

Keyword: apply_async(args=['args'], kwargs={'kw': 'args'})

We rely on the positional form in at least one place [1]. This fixes
the overload of apply_async to cope with both forms, and continue to
pass through any other (confusion time again) keyword args to super(),
such as queue="queue".

Note that we've also decided to stop accepting other positional args,
since this is unnecessarily confusing, and we don't currently rely on
it in our code. This stops it creeping in in future.

[1]: fde927e00e/app/job/rest.py (L186)
2021-04-15 17:21:21 +01:00
Katie Smith
59978fd99a Check if a service is live before sending a broadcast
We only want to send a broadcast if the broadcast message is not stubbed
and the service is live at the point at which the broadcast event should
be created. This is to prevent the situation where a broadcast service is
switched to live / trial mode in between the message being created and
approved (we log an error if this happens).

A stubbed broadcast message with a trial mode service at the point of
approval is not an issue - trial mode services can approve their own
broadcasts. In this situation, we don't create the broadcast event but
also don't need to log an error.
2021-04-15 15:01:32 +01:00
Katie Smith
df82b514d1 Don't create broadcast event unless necessary
If we're not going to send a broadcast, we don't need to create the
BroadcastEvent in the database. The BroadcastMessage contains all the
data we need - the BroadcastEvent is not used.

Not creating the event when we won't send the broadcast (e.g. when the
broadcast message was created when the service was in trial mode) adds
an extra layer of security.
2021-04-15 14:55:45 +01:00
Ben Thorner
fde927e00e Merge pull request #3205 from alphagov/celery-consistency-tweaks
Small refactor for new Celery / StatsD code
2021-04-15 11:40:42 +01:00
David McDonald
bbc2de10de Merge pull request #3206 from alphagov/sms-receipts
Add sms receipts app manifest config
2021-04-15 09:17:23 +01:00
Ben Thorner
f85dad5acf Merge pull request #3203 from alphagov/remove-statsd-decorators
Remove redundant @statsd timing decorators
2021-04-14 10:04:04 +01:00
David McDonald
39d7c347b0 Add sms receipts app manifest config
This app will replace the `notify-api-sms-callbacks` app as it is an app
that handles receipts, not callbacks.

After this and the corresponding concourse PR is merged and deployed (at which
point we will have two apps sharing the traffic) we can then put in PRs
to remove the `notify-api-sms-callbacks` app.

There is a chance it could all be done as a single PR (or at least one
for the API and one for the concourse pipelines) but I'm playing it safe
and doing it as a very clear two step process just in case.
2021-04-13 18:03:05 +01:00
Ben Thorner
5eb265138b Remove unnecessary statsd_client parameter
It turns out this is available from the app object [1], and we were
already assuming this in the tests.

[1]: 48c6c822e8/notifications_utils/clients/statsd/statsd_client.py (L52)
2021-04-13 15:12:55 +01:00
Ben Thorner
ec6d87cd0f Simplify argument passing in apply_async
This avoids the need to keep in-sync with any future changes to the
signature, and reduces the amount of irrelevant code to read.
2021-04-13 15:12:45 +01:00
David McDonald
2e6d761691 Merge pull request #3204 from alphagov/broadcast-envars
Broadcast envars
2021-04-12 17:25:15 +01:00
David McDonald
4437d60dd7 Only give broadcasts worker IAM creds for CBC proxy
There is no need to give it to any of the other workers and so the fewer
instances that have these creds the better.

You can verify this works by running
```
CF_APP=notify-api CF_SPACE=preview make generate-manifest
```

vs

```
CF_APP=notify-delivery-worker-broadcasts CF_SPACE=preview make generate-manifest
```
2021-04-12 17:05:42 +01:00
David McDonald
295162c81d Move CBC proxy enable check
This change will make our development environments closer to production
even if they aren't hooked up to the CBC proxy lambda functions.

Now in development, we will create the broadcast event and create tasks
for each broadcast provider event. We will still not create actual
broadcast provider message rows in the DB and talk to the CBC proxies.

This should be helpful in development to catch any issues we introduce
to do with sending broadcast messaging. In time we may wish to have some
fake CBC proxies in the AWS tools account that we can interact with to
make it even more realistic.
2021-04-12 17:05:41 +01:00
Ben Thorner
e3e067c795 Remove redundant @statsd timing decorators
These are superseded by timing task execution generically in the
NotifyTask superclass [1]. Note that we need to wait until we've
gathered enough data under the new metrics before removing these.

[1]: https://github.com/alphagov/notifications-api/pull/3201#pullrequestreview-633549376
2021-04-12 15:19:18 +01:00
Ben Thorner
3e507eea55 Merge pull request #3201 from alphagov/revamp-celery-stats
Migrate towards new metrics for Celery tasks
2021-04-12 15:04:37 +01:00
Ben Thorner
37f91e0214 Add tests for apply_async injecting request_id 2021-04-12 14:50:55 +01:00
Ben Thorner
df6e27d8fd Add test for extracting request_id in __call__
Tasks will fail if we leave the kwarg in, so I think it's quite
important that we test this works. We don't cover this in any other
test because we call the task functions directly, so the request_id
kwarg doesn't get injected beforehand.
2021-04-12 14:50:53 +01:00
Ben Thorner
8954cec5a1 Add tests for celery task superclass
This requires upgrading freezegun, as time.monotonic wasn't frozen
by v1.0. Note that we need to explicitly specify the base class for
the task in the test, the reason for which is quite subtle:

- Normally, by using the 'notify_api' fixture, the base class is set
to NotifyTask automatically by running app.create_app [1].

- However, when run alongside other tests, the imports of files with
other celery tasks cause the base class to be instantiated and cached
as the default Celery one. This means none of our tests actually use
our custom superclass when testing tasks.

Because we can't run 'apply_async' directly (since this would require
an actual Celery broker), we need to manually push/pop the request
Context that's normally done as part of sending a task.

Note also that we use a UUID as the name for a task, since these are
global. We want to avoid the task polluting other tests in future,
as well as make it clear the task is being reused.

[1]: dea5828d0e/app/__init__.py (L113)
2021-04-12 14:50:02 +01:00
David McDonald
514afeb6f3 Set CBC_PROXY_ENABLED per environment, not dynamically
Previously we looked at whether an environment was given AWS access keys
to decide if the `CBC_PROXY_ENABLED` setting was true. Given that all
environments (apart from development) are currently hooked up to our AWS
cell broadcast accounts, it doesn't feel too useful to have a dynamic
switch when we can just hardcode it.

On top of that, this lays the groundwork for having `CBC_PROXY_ENABLED`
to be True even if an individual application doesn't have the CBC PROXY
aws access keys as in future only the broadcasts worker will have the
AWS keys but all the other apps will know that cell broadcasting is
indeed turned on for that environment.
2021-04-09 11:56:00 +01:00
Katie Smith
c3d9aca43a Remove redundant comment
We no longer have a noop client
2021-04-09 11:54:32 +01:00
Katie Smith
32f499c802 Fix setting the CELERYD_PREFETCH_MULTIPLIER variable for broadcasts
This was not being set correctly in the manifest for the
notify-delivery-worker-broadcasts worker.
2021-04-09 11:54:31 +01:00
Ben Thorner
ab8dd6d52c Duplicate metrics to StatsD for Celery tasks
Previously we used a '@statsd' decorator to time and count Celery
tasks [1]. Using a decorator isn't ideal since we need to remember
to add it to every task we define. In addition, it's not possible
to use data like the task name and queue.

In order to avoid breaking existing stats, this duplicates them as
new StatsD metrics until we have sufficient data to update dashboards
using the old ones. Using the CeleryTask superclass to send metrics
avoids a future maintenance overhead, and means we can include more
useful data in the StatsD metric. Note that the new metrics will sit
in StatsD until we add a mapping for them [2].

StatsD automatically produces a 'count' stat for timing metrics, so
we don't need to increment a separate counter for successful tasks.

[1]: dea5828d0e/app/celery/tasks.py (L65)
[2]: https://github.com/alphagov/notifications-aws/blob/master/paas/statsd/statsd-mapping.yml
2021-04-08 18:02:53 +01:00
Ben Thorner
248f5a0708 Include queue name in Celery task logs
This is mainly so we can use it in the new metrics we send to StatsD
in the following commits, but it should also be useful in the logs.
I've taken the opportunity to make the log format consistent between
success / failure, and with our Template Preview app [1].

[1]: f456433a5a/app/celery/celery.py (L19)
2021-04-08 18:02:51 +01:00
Leo Hemsted
4a2e47b118 Merge pull request #3202 from alphagov/env-fix
fix environment var checking
2021-04-08 13:44:28 +01:00
Ben Thorner
19be4faf45 Switch to monotonic time for task logs
This matches the approach we take in utils [1]. Monotonic time is
better because it avoids weird negative results due to clock shift.

[1]: 5d18ebd796/notifications_utils/statsd_decorators.py (L14)
2021-04-08 13:00:24 +01:00
Ben Thorner
054205835b Remove unused metric for SQS apply duration
This was added as part of a wider performance investigation [1]. I
checked with Leo, who made the change, and while the other metrics
are still be useful, there's no reason to keep this one.

[1]: 6e32ca5996 (diff-76936416943346b5f691dac57a64acebc6a1227293820d1d9af4791087c9fb9eR23)
2021-04-08 13:00:21 +01:00
Leo Hemsted
4a5b1c23bd only send zendesk P1 for alerts
we don't need to be re-notified when someone clicks cancel
2021-04-08 12:22:18 +01:00
Leo Hemsted
9bd8c0239c look for 'live', not 'production'
config['NOTIFY_ENVIRONMENT'] is hardcoded to `'live'` in the Live config
class. The values as seen on the environment which we send real messages
from:

```
>>> json.loads(os.environ['VCAP_APPLICATION'])['space_name']  # what cloudfoundry sets
'production'
>>> os.environ['NOTIFY_ENVIRONMENT']  # we set this from cloudfoundry
'production'
>>> current_app.config['NOTIFY_ENVIRONMENT']  # hardcoded in the Live config
'live'
>>> current_app.config['NOTIFICATION_QUEUE_PREFIX']  # pulled from env var of same name
'live'
>>> current_app.config['ENV']  # this is an unrelated flask variable
'production'
```
2021-04-08 12:17:22 +01:00
David McDonald
dea5828d0e Merge pull request #3198 from alphagov/training-mode-preview
Don't send real broadcasts for preview training mode
2021-04-07 11:40:11 +01:00
David McDonald
42b3f13538 Don't send real broadcasts for preview training mode
We previously allowed MNOs to approve a broadcast themselves in training
mode and have it go out to their integration environment as per
https://github.com/alphagov/notifications-api/pull/3114

However, we want to remove this use case as it means we have to support
configuration for training mode services to do things like pick a
channel and send out alerts which we definteily don't want to do in
production.

By making this change, we reduce the chance of a single bug meaning an
alert will go out in prod that shouldn't.

Note, will also make it harder for development environment testing but I
think it is still worth it as https://www.pivotaltracker.com/story/show/177584959
will make it much harder in our code to allow some environments to send
alerts whilst in training mode.
2021-04-06 14:21:57 +01:00
Leo Hemsted
df97b28c57 Merge pull request #3191 from alphagov/p1-broadcast
send a p1 when a broadcast goes out
2021-04-06 13:40:04 +01:00
Leo Hemsted
df393e36c5 send a p1 when a broadcast goes out on production
it's important to keep tabs on when these things leave our system.
Sending a zendesk ticket that triggers a P1 is probably our simplest way
of notifying the team when this happens (it's what we do with out of
hours emergencies on the admin app too). We don't have any direct
pagerduty integrations from the api app, but we already have the zendesk
client hooked up.

After broadcasts go live, we may want to change this to a P2 (but even
then, there's arguments for keeping it P1 to start with I think).

Don't cause a P1 if it goes out on staging as that might be MNOs testing.
2021-04-06 11:32:19 +01:00
Rebecca Law
dc8fa25f19 Merge pull request #3196 from alphagov/update-rates
Add new SMS rate for April 1 2021.
2021-04-01 11:23:25 +01:00
Rebecca Law
a1ae220c9f Add new SMS rate for April 1 2021.
Downgrade script isn't necessary.
2021-04-01 08:20:29 +01:00
Ben Thorner
e5fabaa08a Merge pull request #3190 from alphagov/log-invalid-inbound-service
Log service ID for invalid inbound SMS
2021-03-31 10:33:54 +01:00
Ben Thorner
86b6217cf3 Log service ID for invalid inbound SMS
This could help identify issues with inbound SMS for a service.
2021-03-31 10:22:34 +01:00
Rebecca Law
220047628b Merge pull request #3193 from alphagov/populate-annual-billing-2021
Populate annual billing for 2021
2021-03-31 07:58:16 +01:00
Rebecca Law
0165ed9cda use true as the default. 2021-03-30 12:46:28 +01:00
Pea Tyczynska
1a6ad92490 Merge pull request #3194 from alphagov/allow-double-hyphens-in-email-domains
Bump utils to allow double hyphens in email address domain
2021-03-30 11:06:08 +01:00
Rebecca Law
da8a7a8db1 Avoid key errors by setting the year_start with 2020 or 2021
Remove db.create_service_with_organisation method
Update comment in command
2021-03-30 09:08:04 +01:00
Katie Smith
2bf58c8a15 Merge pull request #3187 from alphagov/remove-unecessary-code
Simplify send_delivery_status_to_service code
2021-03-30 08:59:57 +01:00
Katie Smith
3bfd084a77 Simplify send_delivery_status_to_service code
Now that https://github.com/alphagov/notifications-api/pull/3184 has
been deployed for a while, the `send_delivery_status_to_service` task will
always have `template_id` and `template_version` being passed in. This
means we don't need to check if those fields are there.
2021-03-30 08:50:42 +01:00
Pea Tyczynska
a3aad24fe1 Bump utils to allow double hyphens in email address domain
It was requested by our user and it is an allowed domain format
with Amazon SES, so we started allowing it in our validation.
2021-03-29 17:53:29 +01:00
David McDonald
d04587623f Merge pull request #3192 from alphagov/delete-canary
Remove the emergency alerts canary
2021-03-29 14:57:41 +01:00
Rebecca Law
7da5abc17b The free sms allowances are changing for the financial year starting
April 1 2021.

In this PR there is a command to set annual_billing for all active
services with the the new defaults.

The new method `set_default_free_allowance_for_service` will also be
called in a PR to follow that will set a services free allowance to the
default if the organisation for the service is changed.
2021-03-29 13:32:00 +01:00
David McDonald
6d410daae4 Remove the emergency alerts canary
See https://github.com/alphagov/notifications-broadcasts-infra/pull/197
for why we no longer need this and we get to delete some code!
2021-03-26 18:31:53 +00:00
Rebecca Law
47c2ced614 Merge pull request #3185 from alphagov/set-crown-if-none-for-letter-rates
Quick fix for dealing with missing crown on a service sending letters
2021-03-26 14:04:45 +00:00
Rebecca Law
057c4e4568 Quick fix to ensure that billing doesn't fail if the crown is not set
for the service.

The letters rates for cronw and non crown are the same. It would be nice
to remove the need for crown but for now this is a quick fix.
2021-03-25 08:42:46 +00:00
Pea Tyczynska
1903e8f268 Merge pull request #3168 from alphagov/client-reference
If client reference not given, try to get it from personalisation
2021-03-24 15:34:34 +00:00