We found that non-templated broadcast messages weren’t having their
content normalised before saving into an event.
This means that stuff like `\r\n` and curly quotes were being passed
through to the CBC proxy.
This commit firstly changes templated events to use
`str(BroadcastMessageTemplate)` to normalise the content, because it’s
non-obvious that calling
`BroadcastMessageTemplate.content_with_placeholders_filled_in` also
normalises content.
Then it changes the non-templated route to also call
`str(BroadcastMessageTemplate)`, where previously it was passing the
content straight through.
We want to replace the value `None` for
service.allowed_broadcast_provider with the value of "all". As a first
step, we need to allow both values. Once notifications-admin has been
changed to pass through "all" and all the data in the database has been
updated, we can update the code to stop supporting both values.
This adds a type table for broadcast providers, which is the pattern we
follow with our models (e.g. we have a `broadcast_channel_types` table).
As well as the four providers, the migration populates it with `all`
which is the value that will replace `null` in a later change.
It should be safe to add the foreign key constraint to the
`service_broadcast_settings` in the same migration since the column is
still nullable and we don't have data in that column that is not in the
types table.
DVLA would like to be able to identify letters sent by the Insolvency
Service, so we are changing the zipfile name. They need all zipfile
names to have the same structure, so we can't just add a marker to files
sent by that service - we have to change all filenames.
The new format is like this:
`{NOTIFY}.{DATE}.{SEQUENCE_ID}.{UNIQUE_ID}.{SERVICE_ID}.{ORG_NAME}.{EXTENSION}`
Second attempt [1]. This increases the threshold so:
- It's a more substantial amount of money lost (£16).
- It's 10% of the minimum free allowance for a service.
- It's greater than the threshold we have for TV numbers (500).
Having a higher threshold for this alert will help prevent wasted
effort investigating more negligible failures, and reduces the
ambiguity of whether we should take action: we should.
[1]: https://github.com/alphagov/notifications-api/pull/3221
When deploying to paas the database postgres environment variables are set using VCAP_SERVICES provided by PaaS. When we start up the app and set the properties we need to replace the postgres string with postgresql for the app to start up properly.
This wasn't caught locally or with the unit tests because we were setting this property with postgresql.
`service_ids_to_purge` is a list of `row` object rather than a list of `UUID`.
NOTE: db.session.query(Service).filter(Service.id.notin_(services_with_data_retention)).all() would have also worked. It seems that only selecting attributes from the db.Model has caused the change.
- sqlalchemy.sql.expression.case must include an else statement.
- clearly define list of columns for inbound_sms_history insert, getting the list from InboundSmsHistory.__table__.c was causing data type errors.
- remove relationships when not needed, the foreign key relationship is established in the creation of the column. This will get rid of the warnings referenced here: http://sqlalche.me/e/14/qzyx.
- update queries now that he user relationship in ServiceUser db model has been removed.
- move the check that a template is archived to the view instead of the dao method. The check was clearing the session before the version history could be done.
Deleting notifications in the night tasks still needs to be
investigated. The raw sql is causing an error.
If this alert goes off in the morning, it usually means we need to do
something, ideally quite quickly as it indicates a potential problem
with the sending of letters over to DVLA the night before.
Given this goes off at 9am at the moment, but actually some people start
work earlier, if we alert at 7am it means it will likely be looked at
earlier in the day and we can potentially fix any problems with letters
sooner than later.
This applies the same change we made in other apps [1][2]. Adding
the override here is special, though, because it means the others
will now get triggered, since this app is the start of the chain
of tasks for a request. We will also retain existing request_id
tracing for tasks within this app, since "apply_async" calls the
"send_task" method internally, which is the one we're overriding.
[1]: 6f3c118a1e
[2]: 2e08b7aa95
`service_ids_to_purge` is a list of `row` object rather than a list of `UUID`.
NOTE: db.session.query(Service).filter(Service.id.notin_(services_with_data_retention)).all() would have also worked. It seems that only selecting attributes from the db.Model has caused the change.
- sqlalchemy.sql.expression.case must include an else statement.
- clearly define list of columns for inbound_sms_history insert, getting the list from InboundSmsHistory.__table__.c was causing data type errors.
- remove relationships when not needed, the foreign key relationship is established in the creation of the column. This will get rid of the warnings referenced here: http://sqlalche.me/e/14/qzyx.
- update queries now that he user relationship in ServiceUser db model has been removed.
- move the check that a template is archived to the view instead of the dao method. The check was clearing the session before the version history could be done.
Deleting notifications in the night tasks still needs to be
investigated. The raw sql is causing an error.
This is a belt-and-braces check because the admin app already checks
this. But since we do it for SMS already it makes sense to replicate it
for broadcast templates.
This changes the content length validation of the internal API to match
the validation of the public broadcast API[1].
This removes the length check from JSONSchema, which isn’t sophisticated
enough to deal with things like normalising newlines or handling
different encodings.
The admin app should catch these errors before they’re raised here, but
it’s best to be belt and braces.
1.7ab0403ae7/app/v2/broadcast/post_broadcast.py (L53-L63)
This ensures that the log messages both contain broadcast_event id and
broadcast_provider_message id. It also removes the broadcast_event
reference since this isn't particularly useful in helping to find an
event.
It wasn't clear what the ID in the message was. It's not possible to add
more details to the message - we don't create a broadcast message or
event for a link test.
We want to start using Firetext for sending international SMS. They
require us to use a different API key for international SMS because it
requires a new code path to switch the sender ID to something that the
country will accept.
This PR does not include switching the sender of international SMS to
Firetext but sets us up to do so.
This is an extra precaution for the table to ensure data integrity. Since we only update/insert the data using the annual_billing_dao methods the integrity is in tact. I've check the data on preview, staging and prod there are no violations of this unique key.
Also update error message for when someone does not have permissions.
The message referenced approving broadcasts specifically, whereas
people would also see it if they try to cancel or reject
broadcast without permission.
Why we allow platform admins to cancel broadcasts:
we do this so they can react quickly if a broadcast was
approved by accident.
While both of these are integrity errors (since we should never
reach this point in the code + data), this just means the original
method comment is still relevant to what immediately follows it.
This mirrors the check we do for jobs, which are also a high-impact
task [1]. While this shouldn't be possible, just like other checks
we're adding it here to be doubly certain.
[1]: 3d71815956/app/celery/tasks.py (L74)
We only actually use this when the data we're working with is in an
unexpected state, which is unrelated to the CBC Proxy. Using this
name also means we can re-use this exception in the next commits.
Note that we may still care if a broadcast message has expired, since
it's not expected that someone would send one in this condition.
`check_if_letters_still_in_created`
The message to Zendesk includes a list of notification ids, this isn't
really necessary and is included in the run book. Creation of the
Zendesk ticket can fail if the message is too long, removing the list of
ids can prevent that from happening.
Celery's apply_async function accepts 'kwargs' as (get ready to be
confused) either a positional argument, or a keyword argument:
Positional: apply_async(['args'], {'kw': 'args'})
Keyword: apply_async(args=['args'], kwargs={'kw': 'args'})
We rely on the positional form in at least one place [1]. This fixes
the overload of apply_async to cope with both forms, and continue to
pass through any other (confusion time again) keyword args to super(),
such as queue="queue".
Note that we've also decided to stop accepting other positional args,
since this is unnecessarily confusing, and we don't currently rely on
it in our code. This stops it creeping in in future.
[1]: fde927e00e/app/job/rest.py (L186)
We only want to send a broadcast if the broadcast message is not stubbed
and the service is live at the point at which the broadcast event should
be created. This is to prevent the situation where a broadcast service is
switched to live / trial mode in between the message being created and
approved (we log an error if this happens).
A stubbed broadcast message with a trial mode service at the point of
approval is not an issue - trial mode services can approve their own
broadcasts. In this situation, we don't create the broadcast event but
also don't need to log an error.
If we're not going to send a broadcast, we don't need to create the
BroadcastEvent in the database. The BroadcastMessage contains all the
data we need - the BroadcastEvent is not used.
Not creating the event when we won't send the broadcast (e.g. when the
broadcast message was created when the service was in trial mode) adds
an extra layer of security.