Commit Graph

4636 Commits

Author SHA1 Message Date
David McDonald
2d345afab8 Merge pull request #3223 from alphagov/earlier-alerts
Make check-if-letters-still-in-created run at 7am
2021-04-27 17:29:12 +01:00
Rebecca Law
be797ea513 Merge pull request #3215 from alphagov/add-unique-key-annual-billing
Add unique key on annual_billing for service_id + financial_year_start.
2021-04-27 14:14:10 +01:00
David McDonald
f194231d87 Make check-if-letters-still-in-created run at 7am
If this alert goes off in the morning, it usually means we need to do
something, ideally quite quickly as it indicates a potential problem
with the sending of letters over to DVLA the night before.

Given this goes off at 9am at the moment, but actually some people start
work earlier, if we alert at 7am it means it will likely be looked at
earlier in the day and we can potentially fix any problems with letters
sooner than later.
2021-04-27 11:26:18 +01:00
Ben Thorner
99bc29418e Move request_id injection into send_task override
This applies the same change we made in other apps [1][2]. Adding
the override here is special, though, because it means the others
will now get triggered, since this app is the start of the chain
of tasks for a request. We will also retain existing request_id
tracing for tasks within this app, since "apply_async" calls the
"send_task" method internally, which is the one we're overriding.

[1]: 6f3c118a1e
[2]: 2e08b7aa95
2021-04-27 10:35:21 +01:00
Rebecca Law
5f26d16915 Merge pull request #3218 from alphagov/precomplied-letter-transation
Introduce transaction for precompiled letters
2021-04-26 13:10:31 +01:00
Chris Hill-Scott
f8bca5765d Validate length of broadcast templates on creation
This is a belt-and-braces check because the admin app already checks
this. But since we do it for SMS already it makes sense to replicate it
for broadcast templates.
2021-04-22 17:11:31 +01:00
Chris Hill-Scott
7c6ae40034 Normalise broadcast content before validating length
This changes the content length validation of the internal API to match
the validation of the public broadcast API[1].

This removes the length check from JSONSchema, which isn’t sophisticated
enough to deal with things like normalising newlines or handling
different encodings.

The admin app should catch these errors before they’re raised here, but
it’s best to be belt and braces.

1.7ab0403ae7/app/v2/broadcast/post_broadcast.py (L53-L63)
2021-04-22 14:42:54 +01:00
Rebecca Law
cb8ec9a4f7 Wrap the saving the notification and uploading the precompiled letter to
s3 in a transation.

If the upload to s3 fails the notification will not be saved to the
database.
2021-04-22 13:39:41 +01:00
Katie Smith
7ab0403ae7 Merge pull request #3216 from alphagov/bcast-logging
Add detail to broadcast logging messages
2021-04-21 09:03:38 +01:00
Rebecca Law
fbd231cefd Merge pull request #3179 from alphagov/add-international-firetext-api-key
Add international API key for firetext
2021-04-21 08:49:11 +01:00
Katie Smith
e6357c91c9 Add more details to messages in send_broadcast_provider_message task
This ensures that the log messages both contain broadcast_event id and
broadcast_provider_message id. It also removes the broadcast_event
reference since this isn't particularly useful in helping to find an
event.
2021-04-20 15:34:49 +01:00
Katie Smith
c9c4bd8b44 Clarify log line when sending a link test
It wasn't clear what the ID in the message was. It's not possible to add
more details to the message - we don't create a broadcast message or
event for a link test.
2021-04-20 14:54:53 +01:00
Pea Tyczynska
38af26cc78 Merge pull request #3199 from alphagov/admin-cancel-broadcast
Allow platform admins to cancel broadcasts.
2021-04-20 14:47:45 +01:00
Rebecca Law
f3fdd3b09b Add internation api key for firetext.
We want to start using Firetext for sending international SMS. They
require us to use a different API key for international SMS because it
requires a new code path to switch the sender ID to something that the
country will accept.
This PR does not include switching the sender of international SMS to
Firetext but sets us up to do so.
2021-04-20 13:58:55 +01:00
Rebecca Law
a637c8eb92 Add unique key on annual_billing for service_id + financial_year_start.
This is an extra precaution for the table to ensure data integrity. Since we only update/insert the data using the annual_billing_dao methods the integrity is in tact. I've check the data on preview, staging and prod there are no violations of this unique key.
2021-04-20 13:42:20 +01:00
Pea Tyczynska
fce6a2d8dc Allow platform admins to cancel broadcasts.
Also update error message for when someone does not have permissions.

The message referenced approving broadcasts specifically, whereas
people would also see it if they try to cancel or reject
broadcast without permission.

Why we allow platform admins to cancel broadcasts:
we do this so they can react quickly if a broadcast was
approved by accident.
2021-04-20 12:27:38 +01:00
Rebecca Law
ae57521b39 Simplify the get_free_sms_fragment limit for the case when the row is
missing, by setting the free allowance to the default.
2021-04-19 13:29:04 +01:00
Rebecca Law
bcd1939179 Merge pull request #3212 from alphagov/update-zendesk-message
Update the Zendesk ticket content for `check_if_letters_still_in_created`
2021-04-19 11:36:28 +01:00
Rebecca Law
91542ad33e Merge pull request #3197 from alphagov/update-annual-billing-if-org-changes
Set default free allow when organisation for a service changes
2021-04-19 11:34:24 +01:00
Rebecca Law
d4009ffc52 Rename database management functions.
Rename @transactional to @autocommit.
Rename nested_transaction to tranaction.
2021-04-19 10:56:00 +01:00
Rebecca Law
34a378a60e Update the Zendesk ticket content for
`check_if_letters_still_in_created`

The message to Zendesk includes a list of notification ids, this isn't
really necessary and is included in the run book. Creation of the
Zendesk ticket can fail if the message is too long, removing the list of
ids can prevent that from happening.
2021-04-19 10:47:25 +01:00
Katie Smith
e737217c3f Merge pull request #3208 from alphagov/check-bcast-service-live
Check if service is in live mode before sending a broadcast
2021-04-16 09:11:53 +01:00
Ben Thorner
be02573147 Fix apply_async not working with positional kwargs
Celery's apply_async function accepts 'kwargs' as (get ready to be
confused) either a positional argument, or a keyword argument:

Positional: apply_async(['args'], {'kw': 'args'})

Keyword: apply_async(args=['args'], kwargs={'kw': 'args'})

We rely on the positional form in at least one place [1]. This fixes
the overload of apply_async to cope with both forms, and continue to
pass through any other (confusion time again) keyword args to super(),
such as queue="queue".

Note that we've also decided to stop accepting other positional args,
since this is unnecessarily confusing, and we don't currently rely on
it in our code. This stops it creeping in in future.

[1]: fde927e00e/app/job/rest.py (L186)
2021-04-15 17:21:21 +01:00
Katie Smith
59978fd99a Check if a service is live before sending a broadcast
We only want to send a broadcast if the broadcast message is not stubbed
and the service is live at the point at which the broadcast event should
be created. This is to prevent the situation where a broadcast service is
switched to live / trial mode in between the message being created and
approved (we log an error if this happens).

A stubbed broadcast message with a trial mode service at the point of
approval is not an issue - trial mode services can approve their own
broadcasts. In this situation, we don't create the broadcast event but
also don't need to log an error.
2021-04-15 15:01:32 +01:00
Katie Smith
df82b514d1 Don't create broadcast event unless necessary
If we're not going to send a broadcast, we don't need to create the
BroadcastEvent in the database. The BroadcastMessage contains all the
data we need - the BroadcastEvent is not used.

Not creating the event when we won't send the broadcast (e.g. when the
broadcast message was created when the service was in trial mode) adds
an extra layer of security.
2021-04-15 14:55:45 +01:00
Ben Thorner
fde927e00e Merge pull request #3205 from alphagov/celery-consistency-tweaks
Small refactor for new Celery / StatsD code
2021-04-15 11:40:42 +01:00
Ben Thorner
f85dad5acf Merge pull request #3203 from alphagov/remove-statsd-decorators
Remove redundant @statsd timing decorators
2021-04-14 10:04:04 +01:00
Rebecca Law
93908bacda New strategy for transaction management.
Introduce a contextmanger function to handle exceptions and nested
transactions. Using the nested_transaction will start a
nested transaction with `db.session.begin_nested`, once the nested
transaction is complete the commit will happen.
`@transactional` has been updated to commit unless in a nested
transaction.
2021-04-14 07:04:17 +01:00
Rebecca Law
cf35135605 Adding @nested_transactional for transactions that require more than one
db update/insert.

Using a savepoint for the multiple transactions allows us to rollback if
there is an error when executing the second db transaction.
However, this does add a bit of complexity. Developers need to manage
the db session when calling multiple nested tranactions.

Unit tests have been added to test this functionality and some end to
end tests have been done to make sure all transactions are rollback if
there is an exception while executing the transaction.
2021-04-14 07:03:57 +01:00
Rebecca Law
9a03e579d6 When a service is created add the default annual billing for the service.
This will need to be merged before https://github.com/alphagov/notifications-admin/pull/3855, it will be that until the admin PR is merged the annual billing will be set twice, but that's not an issue.
2021-04-14 07:03:57 +01:00
Rebecca Law
69e5ddae4f When a service is associated with a organisation set the free allowance to
the default free allowance for the organisation type.

The update/insert for the default free allowance is done in a separate
transaction. Updates to services need to happen in a transaction to
trigger the insert into the ServicesHistory table. For that reason the
call to set_default_free_allowance_for_service is done after the service
is updated.
I've added a try/except around the set_default_free_allowance_for_service call to ensure we still get the update to the service but get an exception log if the update to annual_billing fails. I believe it's important to preserve the update to the service in the unlikely event that the annual_billing upsert fails.
2021-04-14 07:03:57 +01:00
Ben Thorner
5eb265138b Remove unnecessary statsd_client parameter
It turns out this is available from the app object [1], and we were
already assuming this in the tests.

[1]: 48c6c822e8/notifications_utils/clients/statsd/statsd_client.py (L52)
2021-04-13 15:12:55 +01:00
Ben Thorner
ec6d87cd0f Simplify argument passing in apply_async
This avoids the need to keep in-sync with any future changes to the
signature, and reduces the amount of irrelevant code to read.
2021-04-13 15:12:45 +01:00
David McDonald
2e6d761691 Merge pull request #3204 from alphagov/broadcast-envars
Broadcast envars
2021-04-12 17:25:15 +01:00
David McDonald
295162c81d Move CBC proxy enable check
This change will make our development environments closer to production
even if they aren't hooked up to the CBC proxy lambda functions.

Now in development, we will create the broadcast event and create tasks
for each broadcast provider event. We will still not create actual
broadcast provider message rows in the DB and talk to the CBC proxies.

This should be helpful in development to catch any issues we introduce
to do with sending broadcast messaging. In time we may wish to have some
fake CBC proxies in the AWS tools account that we can interact with to
make it even more realistic.
2021-04-12 17:05:41 +01:00
Ben Thorner
e3e067c795 Remove redundant @statsd timing decorators
These are superseded by timing task execution generically in the
NotifyTask superclass [1]. Note that we need to wait until we've
gathered enough data under the new metrics before removing these.

[1]: https://github.com/alphagov/notifications-api/pull/3201#pullrequestreview-633549376
2021-04-12 15:19:18 +01:00
Ben Thorner
3e507eea55 Merge pull request #3201 from alphagov/revamp-celery-stats
Migrate towards new metrics for Celery tasks
2021-04-12 15:04:37 +01:00
David McDonald
514afeb6f3 Set CBC_PROXY_ENABLED per environment, not dynamically
Previously we looked at whether an environment was given AWS access keys
to decide if the `CBC_PROXY_ENABLED` setting was true. Given that all
environments (apart from development) are currently hooked up to our AWS
cell broadcast accounts, it doesn't feel too useful to have a dynamic
switch when we can just hardcode it.

On top of that, this lays the groundwork for having `CBC_PROXY_ENABLED`
to be True even if an individual application doesn't have the CBC PROXY
aws access keys as in future only the broadcasts worker will have the
AWS keys but all the other apps will know that cell broadcasting is
indeed turned on for that environment.
2021-04-09 11:56:00 +01:00
Katie Smith
c3d9aca43a Remove redundant comment
We no longer have a noop client
2021-04-09 11:54:32 +01:00
Ben Thorner
ab8dd6d52c Duplicate metrics to StatsD for Celery tasks
Previously we used a '@statsd' decorator to time and count Celery
tasks [1]. Using a decorator isn't ideal since we need to remember
to add it to every task we define. In addition, it's not possible
to use data like the task name and queue.

In order to avoid breaking existing stats, this duplicates them as
new StatsD metrics until we have sufficient data to update dashboards
using the old ones. Using the CeleryTask superclass to send metrics
avoids a future maintenance overhead, and means we can include more
useful data in the StatsD metric. Note that the new metrics will sit
in StatsD until we add a mapping for them [2].

StatsD automatically produces a 'count' stat for timing metrics, so
we don't need to increment a separate counter for successful tasks.

[1]: dea5828d0e/app/celery/tasks.py (L65)
[2]: https://github.com/alphagov/notifications-aws/blob/master/paas/statsd/statsd-mapping.yml
2021-04-08 18:02:53 +01:00
Ben Thorner
248f5a0708 Include queue name in Celery task logs
This is mainly so we can use it in the new metrics we send to StatsD
in the following commits, but it should also be useful in the logs.
I've taken the opportunity to make the log format consistent between
success / failure, and with our Template Preview app [1].

[1]: f456433a5a/app/celery/celery.py (L19)
2021-04-08 18:02:51 +01:00
Ben Thorner
19be4faf45 Switch to monotonic time for task logs
This matches the approach we take in utils [1]. Monotonic time is
better because it avoids weird negative results due to clock shift.

[1]: 5d18ebd796/notifications_utils/statsd_decorators.py (L14)
2021-04-08 13:00:24 +01:00
Ben Thorner
054205835b Remove unused metric for SQS apply duration
This was added as part of a wider performance investigation [1]. I
checked with Leo, who made the change, and while the other metrics
are still be useful, there's no reason to keep this one.

[1]: 6e32ca5996 (diff-76936416943346b5f691dac57a64acebc6a1227293820d1d9af4791087c9fb9eR23)
2021-04-08 13:00:21 +01:00
Leo Hemsted
4a5b1c23bd only send zendesk P1 for alerts
we don't need to be re-notified when someone clicks cancel
2021-04-08 12:22:18 +01:00
Leo Hemsted
9bd8c0239c look for 'live', not 'production'
config['NOTIFY_ENVIRONMENT'] is hardcoded to `'live'` in the Live config
class. The values as seen on the environment which we send real messages
from:

```
>>> json.loads(os.environ['VCAP_APPLICATION'])['space_name']  # what cloudfoundry sets
'production'
>>> os.environ['NOTIFY_ENVIRONMENT']  # we set this from cloudfoundry
'production'
>>> current_app.config['NOTIFY_ENVIRONMENT']  # hardcoded in the Live config
'live'
>>> current_app.config['NOTIFICATION_QUEUE_PREFIX']  # pulled from env var of same name
'live'
>>> current_app.config['ENV']  # this is an unrelated flask variable
'production'
```
2021-04-08 12:17:22 +01:00
David McDonald
42b3f13538 Don't send real broadcasts for preview training mode
We previously allowed MNOs to approve a broadcast themselves in training
mode and have it go out to their integration environment as per
https://github.com/alphagov/notifications-api/pull/3114

However, we want to remove this use case as it means we have to support
configuration for training mode services to do things like pick a
channel and send out alerts which we definteily don't want to do in
production.

By making this change, we reduce the chance of a single bug meaning an
alert will go out in prod that shouldn't.

Note, will also make it harder for development environment testing but I
think it is still worth it as https://www.pivotaltracker.com/story/show/177584959
will make it much harder in our code to allow some environments to send
alerts whilst in training mode.
2021-04-06 14:21:57 +01:00
Leo Hemsted
df393e36c5 send a p1 when a broadcast goes out on production
it's important to keep tabs on when these things leave our system.
Sending a zendesk ticket that triggers a P1 is probably our simplest way
of notifying the team when this happens (it's what we do with out of
hours emergencies on the admin app too). We don't have any direct
pagerduty integrations from the api app, but we already have the zendesk
client hooked up.

After broadcasts go live, we may want to change this to a P2 (but even
then, there's arguments for keeping it P1 to start with I think).

Don't cause a P1 if it goes out on staging as that might be MNOs testing.
2021-04-06 11:32:19 +01:00
Ben Thorner
86b6217cf3 Log service ID for invalid inbound SMS
This could help identify issues with inbound SMS for a service.
2021-03-31 10:22:34 +01:00
Rebecca Law
220047628b Merge pull request #3193 from alphagov/populate-annual-billing-2021
Populate annual billing for 2021
2021-03-31 07:58:16 +01:00
Rebecca Law
0165ed9cda use true as the default. 2021-03-30 12:46:28 +01:00