As of 041d8b48a2
it’s not valid to call `random.choices` without giving at least one of
the options a positive weighting.
This makes sense, because giving a zero weighting is effectively saying
‘there’s only one choice, but don’t choose it’.
In our codebase this is applicable where there’s only one international
provider, which we want to use even when it’s been de-prioritised for
domestic SMS.
This doesn’t cause a problem now, but will if we upgrade to Python
versions greater than 3.9.0.
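A minimal sketch of the kind of guard this implies, assuming providers
are passed around as (name, weight) pairs. The `choose_provider` helper
and the provider names are hypothetical and only illustrate the idea;
they are not the function used in our codebase.

```python
import random


def choose_provider(providers):
    """Pick a provider name from a list of (name, weight) pairs.

    Hypothetical helper: if no provider has a positive weight we fall
    back to an unweighted choice, because newer Python versions reject
    an all-zero weighting in random.choices.
    """
    names = [name for name, _ in providers]
    weights = [weight for _, weight in providers]

    if sum(weights) <= 0:
        # e.g. a single international provider that has been
        # de-prioritised to 0 for domestic SMS
        return random.choice(names)

    return random.choices(names, weights=weights)[0]
```

With a single zero-weighted provider this still returns that provider,
rather than raising an error.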
Broadcasts created via the API [1] and the Admin app [2] should
both now have this field set. It's also more informative to show
this, and broadcasts created via the API don't have IDs anyway.
There's a small risk that an old broadcast that gets approved won't
have this data, but it's for information only and we intend to
backfill all old broadcasts in the near future.
[1]: 023a06d5fb
[2]: 7dbe3afa19
In one case ("areas=['manchester']") the format was even invalid,
but in general the original value of the column is pretty much
irrelevant for tests that involve updating it (it's highly unlikely
the column would default to the same value as the test data).
For the public API we actually receive a "name" instead of an ID,
which we also want to start sending from the Admin app.
Unlike IDs, which aren't really used anywhere, we want the names
to display the alerts on gov.uk/alerts.
This is necessary until:
- The Admin app is using the new "areas(_2)" format to store and
retrieve data.
- We've migrated all existing broadcast messages to use the new
format.
Note that "areas" / "ids" isn't actually used for anything except
printing out the PagerDuty message - it's not sent to the proxy [1].
[1]: 6edc6c70aa/app/celery/broadcast_message_tasks.py (L190-L193)
Currently we have:
- An "areas" column in the DB that stores a JSON blob.
- An "areas" field inside the "areas" JSON that stores area IDs.
- Each field has to be manually copied into the JSON column.
We want to move to:
- An "areas" column in the DB (unchanged).
- An "ids" field inside the "areas" JSON (to replace "areas").
- The Admin app sending other data inside an "areas" JSON field.
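The rename inside the JSON blob could be sketched roughly as below.
This assumes the blob is a plain dict; `migrate_areas` and the
"simple_polygons" key in the usage example are illustrative
assumptions, not names taken from the migration itself.

```python
def migrate_areas(areas):
    """Return a copy of the "areas" JSON with "areas" renamed to "ids".

    Hypothetical helper: any other fields are carried over untouched,
    and a blob that already has "ids" is returned unchanged.
    """
    migrated = dict(areas)

    if "areas" in migrated and "ids" not in migrated:
        migrated["ids"] = migrated.pop("areas")

    return migrated
```

For example, `migrate_areas({"areas": ["manchester"], "simple_polygons": []})`
gives `{"ids": ["manchester"], "simple_polygons": []}`, and running it
again on the result changes nothing.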
The API design for areas is confusing and difficult to extend.
Here we duplicate the current API functionality using an "areas_2"
field. Once the Admin app is using this field, we'll be able to
rename it to just "areas", which is where we want to get to.
In the next commits we'll build on this to support the migration
from "areas"."areas" to "areas"."ids".
This is a temporary feature to make it easy to migrate the format
of the "areas" column and backfill extra data for it.
It's not possible to use this feature to update the status of an
old broadcast message, so the risk from this override is minimal.
If a polygon is smaller than the largest polygon in our dataset of
simplified polygons then we’re only throwing away useful detail by
simplifying it.
We should still simplify larger polygons as a fallback, to avoid sending
anything to the CBC that we’re not sure it will like.
The thresholds here are low: we can raise them as we test and experiment
more.
Here’s some data about the Flood Warning Service polygons:
Percentile  | 80% | 90%   | 95%    | 98%     | 99%     | 99.9%
------------|-----|-------|--------|---------|---------|---------
Point count | 226 | 401.9 | 640.45 | 1015.38 | 1389.07 | 3008.609

Percentile    | 80% | 90% | 95% | 98% | 99% | 99.9%
--------------|-----|-----|-----|-----|-----|-------
Polygon count | 2   | 3   | 5   | 8   | 10  | 40.469
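The threshold check described above might look something like this
sketch. The function names and the default limits (250 points, 5
polygons) are placeholders chosen for illustration, not the values in
the code, and counting points across all polygons is an assumption.
The `simplify` argument stands in for whatever simplification routine
is actually used.

```python
def needs_simplifying(polygons, max_points=250, max_polygons=5):
    """Return True if the polygons exceed either (hypothetical) limit.

    Each polygon is assumed to be a list of [x, y] points.
    """
    point_count = sum(len(polygon) for polygon in polygons)
    return len(polygons) > max_polygons or point_count > max_points


def prepare_polygons(polygons, simplify):
    """Only simplify when a threshold is exceeded; otherwise keep detail."""
    if needs_simplifying(polygons):
        return simplify(polygons)
    return polygons
```

Small inputs pass through untouched, so no detail is thrown away, and
anything over either limit still goes through the simplification
fallback.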
This new version of utils implements the transformation of our polygons
to a Cartesian plane. In other words, it converts them from being
defined in spherical degrees to metres.
For the API this means our simplification will be slightly more
accurate.
Regardless of channel.
Do not include:
- broadcasts older than 25.05.2021
- stubbed broadcasts
- broadcasts that were not transmitted, so only broadcasting,
cancelled and completed broadcasts make the list
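The exclusions above amount to a filter along these lines. This is a
sketch only: the `Broadcast` dataclass, its field names and the choice
of `created_at` as the timestamp compared against the cutoff are all
assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime

# Broadcasts older than 25.05.2021 are excluded
CUTOFF = datetime(2021, 5, 25)

# Only statuses that mean the broadcast was actually transmitted
TRANSMITTED_STATUSES = {"broadcasting", "cancelled", "completed"}


@dataclass
class Broadcast:
    # Hypothetical stand-in for the real model
    status: str
    created_at: datetime
    stubbed: bool = False


def visible_broadcasts(broadcasts):
    """Return the broadcasts that pass all three exclusion rules."""
    return [
        broadcast
        for broadcast in broadcasts
        if broadcast.created_at >= CUTOFF
        and not broadcast.stubbed
        and broadcast.status in TRANSMITTED_STATUSES
    ]
```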
This is the original behaviour [1]. Since all internal requests will
have corresponding logs from public-facing apps that are making them,
there's little value in logging them.
Logging internal requests doesn't lead to a significant increase in
our overall log ingestion: a rough estimate is it's an extra 5000 logs
per minute, out of about 900K per minute.
[1]: e08d726f05/app/authentication/auth.py (L153)
The rest of the tests need to construct the header directly so they
can pass custom tokens. But for the three tests that actually make
a request to prove the auth functions work as wrappers, we can use
the same factory functions we use everywhere else in the tests.
While "key" is the term used by the JWT library, the rest of our
code (the ApiKey model, the Python client) uses the term "secret"
instead. Although "secret" is less precise than "key", it does help
avoid confusion with (api) key (as a model object).
We can define the API properly in future work. I've used a separate
blueprint from "broadcasts" since this API is purely internal, and
it's helpful to make it clear it's specific to govuk-alerts.
This adds previously missing tests and changes the existing ones
to test the "requires_internal_auth" function directly. In order
to make the tests generic, we have a fake auth function and an
associated fixture.
Having generic tests for internal auth will make it easier to add
other "requires..." functions in future.
This makes the tests easier to read by avoiding request boilerplate
and making a clearer link between the name of each test and the
specific function it is actually testing.
This is a lot of code to check we haven't written a single line,
which we can just visually see isn't there. We should avoid having
tests that check code _isn't_ there, as such testing is infinite.
We can simplify the rest of the tests to avoid the boilerplate of
making an actual request. But it's worth keeping these two to prove
the wrappers work correctly for an arbitrary route.
Previously we had a lot of duplicate tests inconsistently checking
each of the "requires_" functions. Since both of them now use the
same "_decode_jwt_token" helper, we can consolidate all the tests
onto that. In future commits we'll look at testing the top-level
functions in terms of what they do specifically.
This switches to testing the two functions directly as trying to
test them through the top-level "requires_..." functions or calls
to endpoints doesn't scale as we add more of them.
While this has a slight risk that a "requires_..." function might
not be using these helpers, it seems unlikely and we can always
add a mock to check this if we're concerned in future.
Previously "requires_auth" and "requires_admin_auth" had similar
but different ways of checking their keys. This switches them to
use the same checks, with the admin / internal auth passing in a
fake / stub set of "api keys" to check.
Pulling out the logic this way will make it easier to unpick the
tests, so we can focus on testing what's unique to each kind of
API auth and avoid future duplication when we start calling the
"requires_internal_auth" method with other client_ids.
Note that a couple of error messages / response codes have changed
for admin / internal auth. None of these occur in practice, so we
can make them consistent with the behaviour for the public API.
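The shape of this change, one shared check that both kinds of auth
call, with internal auth wrapping its secrets in stub "api keys", can
be sketched as below. To keep the sketch dependency-free, a plain HMAC
signature stands in for JWT decoding. Every name here (`StubKey`,
`check_token`, `check_public_token`, `check_internal_token`) is
hypothetical and not the code in auth.py.

```python
import hashlib
import hmac
from collections import namedtuple

# Stand-in for an ApiKey model object: only the secret matters here
StubKey = namedtuple("StubKey", ["secret"])


class AuthError(Exception):
    pass


def sign(payload, secret):
    """HMAC stand-in for issuing a JWT (assumption for this sketch)."""
    return hmac.new(secret.encode(), payload.encode(), hashlib.sha256).hexdigest()


def check_token(payload, signature, api_keys):
    """Shared check: return the first key whose secret verifies the token."""
    for key in api_keys:
        if hmac.compare_digest(sign(payload, key.secret), signature):
            return key
    raise AuthError("Invalid token: signature did not match any key")


def check_public_token(payload, signature, service_api_keys):
    # Public API auth passes the service's real key objects
    return check_token(payload, signature, service_api_keys)


def check_internal_token(payload, signature, internal_secrets):
    # Admin / internal auth passes a fake set of "api keys"
    stub_keys = [StubKey(secret) for secret in internal_secrets]
    return check_token(payload, signature, stub_keys)
```

Because both entry points go through `check_token`, tests for the
checking logic only need to be written once against that helper.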
Previously this was heavily duplicated but with the odd test using
a __create_token method. This adds some fixtures to remove all the
boilerplate and standardise how tokens are created in each test.