This means we will have a much easier way of knowing what the settings
are for a broadcast service.
Note, we can just move data directly into the newer table as there is
nothing on the API or admin app that is putting data in the
`service_broadcast_provider_restriction` table, this was being done
manually for the few services that needed it.
This will allow us to store details of which channel a service should be
sending to.
See the comment about how all broadcast services can have a row in the
table but may not at the moment. This has been done for speed as it's
the quickest way to let us set up different services to send to
different channels for some needed testing with the mobile handset
providers in the coming week.
So:
'billing_contact_email_address' becomes 'billing_contact_email_addresses'
AND
'billing_contact_name' becomes 'billing_contact_names'
This is to signify that each of those fields can contain numerous
items
Flake8 Bugbear checks for some extra things that aren’t code style
errors, but are likely to introduce bugs or unexpected behaviour. A
good example is having mutable default function arguments, which get
shared between every call to the function and therefore mutating a value
in one place can unexpectedly cause it to change in another.
This commit enables all the extra warnings provided by Flake8 Bugbear,
except for:
- the line length one (because we already lint for that separately)
- B903 Data class should either be immutable or use `__slots__` because
this seems to false-positive on some of our custom exceptions
- B902 Invalid first argument 'cls' used for instance method because
some SQLAlchemy decorators (eg `declared_attr`) make things that
aren’t formally class methods take a class not an instance as their
first argument
It disables:
- _B306: BaseException.message is removed in Python 3_ because I think
our exceptions have a custom structure that means the `.message`
attribute is still present
Matches the work done in other repos:
- https://github.com/alphagov/notifications-admin/pull/3172/files
Add different error message for email and text if content is too long.
Use utils version with is_message_too_long method implemented for email templates.
SES rejects email messages bigger than 10485760 bytes (just over 10 MB per message (after base64 encoding)):
https://docs.aws.amazon.com/ses/latest/DeveloperGuide/quotas.html#limits-message
Base64 is apparently wasteful because we use just 64 different values per byte, whereas a byte can represent
256 different characters. That is, we use bytes (which are 8-bit words) as 6-bit words. There is
a waste of 2 bits for each 8 bits of transmission data. To send three bytes of information
(3 times 8 is 24 bits), you need to use four bytes (4 times 6 is again 24 bits). Thus the base64 version
of a file is 4/3 larger than it might be. So we use 33% more storage than we could.
https://lemire.me/blog/2019/01/30/what-is-the-space-overhead-of-base64-encoding/
That brings down our max safe size to 7.5 MB == 7500000 bytes before base64 encoding
But this is not the end! The message we send to SES is structured as follows:
"Message": {
'Subject': {
'Data': subject,
},
'Body': {'Text': {'Data': body}, 'Html': {'Data': html_body}}
},
Which means that we are sending the contents of email message twice in one request: once in plain text
and once with html tags. That means our plain text content needs to be much shorter to make sure we
fit within the limit, especially since HTML body can be much byte-heavier than plain text body.
Hence, we decided to put the limit at 1MB, which is equivalent of between 250 and 500 pages of text.
That's still an extremely long email, and should be sufficient for all normal use, while at the same
time giving us safe margin while sending the emails through Amazon SES.
depending on the notification type.
Up until now, only sms messages could get message-too-long error,
but now we also need to validate the size of email messages, so
the message content needs to be tailored to the notification type.
Reflects the new name of the feature.
Note that the name of the underlying table hasn’t changed because it’s
explicitly set to `service_whitelist`. Changing this will be a more
involved process.
At the moment we return a count of recent jobs for contact lists, where
recent is defined as being within the service’s data retention period.
This lets us write nice bits of UI copy like ‘used 3 times in the last
7 days’. But it’s hard to write the copy for when the count is 0,
because this could be for one of two reasons:
- the contact list has never been used
- the contact list has been used, but not within the data retention
period for that channel
At the moment we can’t know which of those reasons is the case, so we
can’t write nice clear content like ‘never been used’.
This commit adds a property to contact lists which says whether they’ve
ever been used.
It also renames the existing, as-yet-unused property to make clear that
it’s only counting within the data retention (so can still be 0 even if
`has_jobs` is `True`).
This was broken because sometimes `service.permissions` is a list of
strings (for when we’re caching the service object) and sometimes it’s a
list of permission objects (when we’re dealing with ORM objects).
Because the validator code is shared, the least-messy way to fix it is
to make sure it can handle both types.
It can’t just take a list of permissions as argument, because it uses
other fields on the service.
It would be messy to rewrite the endpoint to use a serialised service
because the tests all expect to be dealing with database objects, so it
would be a faff to change what they’re mocking.
By default Marshallow includes unknown properties. This means every time
a new property is added to the service model it gets included in the
JSON-serialised response sent to the admin app.
This is particuarly bad because it means that for returned letters the
ID of every returned letter. So the JSON stored in Redis for the
Check Your State Pension service is 86kb.
Similarly the JSON stored in Redis for a big user of inbound text
messaging is 458kb(!!!) because it has the ID of every received text
message. That’s ~8,500 UUIDs.
Luckily the admin app tells us exactly which keys it’s using here:
5952d9c26d/app/models/service.py (L31-L52)
```python
- `active`
- `contact_link`
- `email_branding`
- `email_from`
- `id`
- `inbound_api`
- `letter_branding`
- `letter_contact_block`
- `message_limit`
- `name`
- `prefix_sms`
- `research_mode`
- `service_callback_api`
- `volume_email`
- `volume_sms`
- `volume_letter`
- `consent_to_research`
- `count_as_live`
- `go_live_user`
- `go_live_at`
}
```
Plus these which it does not get automatically:
- `email_branding`
- `letter_branding`
- `organisation`
- `organisation_type`
- `permissions`
- `restricted`
The API is returning all of these:
- `active`
- `all_template_folders`
- `annual_billing`
- `consent_to_research`
- `contact_link`
- `contact_list`
- `count_as_live`
- `created_by`
- `crown`
- `email_branding`
- `email_from`
- `go_live_at`
- `go_live_user`
- `id`
- `inbound_api`
- `inbound_number`
- `inbound_sms`
- `letter_branding`
- `letter_contact_block`
- `letter_logo_filename`
- `message_limit`
- `name`
- `organisation`
- `organisation_type`
- `permissions`
- `prefix_sms`
- `rate_limit`
- `research_mode`
- `restricted`
- `returned_letters`
- `service_callback_api`
- `users`
- `version`
- `volume_email`
- `volume_letter`
- `volume_sms`
- `whitelist`
So the ones that the admin is getting but not expecting are:
- `all_template_folders`
- `annual_billing`
- `contact_list`
- `created_by`
- `crown`
- `inbound_number`
- `inbound_sms`
- `letter_logo_filename`
- `rate_limit`
- `returned_letters`
- `users`
- `version`
- `whitelist`
Which is what this PR adds to the exclude list, except for `created_by`
which is keeps because it’s needed to validate the JSON provided when
creating a service.
This copies what we do to a user's email address when archiving the user
by prefixing it with `_archived_{date}`. We already prefixed the
service name and email_reply_to with `_archived`, but this didn't allow
a service with the same name to be archived more than once.
Because we’ll be grouping jobs under their parent contact lists it will
be useful to have a way of showing how many times a contact list has
been used. This will give the right information scent to indicate that
clicking into a contact list is where you go to see its jobs. This means
that the API needs to return a count of jobs for each contact list.
Putting this code feels very non-idiomatic for our API. So suggestions
about how to better architect it welcome…
The standard way that we indicate that there are more results than can
be returned is by paginating. So even though we don’t intend to paginate
the search results in the admin app, it can still use the presence or
absence of a ‘next’ link to determine whether or not to show a message
about only showing the first 50 results.
Like we have search by email address or phone number, finding an
individual letter is a common task. At the moment users are having to
click through pages and pages of letters to find the one they’re looking
for.
We have to search in the `to` and `normalised_to` fields for now because
we’re not populating the `normalised_to` column for letters at the
moment.
So we keep a record of who first uploaded a list it’s better to archive
a list than completely delete it.
The list in the database doesn’t contain any recipient info so this
isn’t a change to what data we’re retaining.
This means updating the endpoints that get contact lists to exclude ones
that are archived.
Once a contact list is gone from the database there’s no way to
reference it again. Any jobs have made their own copy.
So we can clean it up, meaning we’re not storing personal data longer
than we need to.
This was one of things we de-scoped when we first shipped this feature.
In order to safely delete a list, we first need to make sure any jobs
aren’t referencing it.