notifications-admin

mirror of https://github.com/GSA/notifications-admin.git synced 2026-06-29 03:43:09 -04:00

Author	SHA1	Message	Date
Ben Thorner	d2784d0d8a	Rename "parents" methods to "ancestors" Resolves: https://github.com/alphagov/notifications-admin/pull/3980#discussion_r694002952 A grandparent is not a parent, so the return value of these methods was misleading. This makes it clearer.	2021-08-23 16:50:18 +01:00
Ben Thorner	1923c5edb1	Remove redundant 'filter' and return value 'None' is the implicit return value. Since the filter was operating on a yield that never yields 'None', it was redundant.	2021-08-23 16:35:38 +01:00
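A minimal sketch of the idea behind the two commits above, assuming a simple tree where each node knows its parent. The `Node` class and its names are hypothetical and are not taken from the repository.

```python
class Node:
    """Hypothetical tree node; not the repository's model class."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    @property
    def ancestors(self):
        # Walk up the tree. The generator just ends at the root, so it never
        # yields None and callers have nothing to filter out.
        node = self.parent
        while node is not None:
            yield node
            node = node.parent


county = Node("county")
district = Node("district", parent=county)
ward = Node("ward", parent=district)

ancestor_names = [node.name for node in ward.ancestors]
```

The name "ancestors" fits because the result includes the grandparent (`county`) as well as the direct parent.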
Chris Hill-Scott	b273037462	Use str.join to build query This avoids the nasty slice operator to trim the trailing comma.	2021-08-06 13:28:41 +01:00
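As an illustration of the change described above (a sketch, not the repository's code), here are two ways to build a comma-separated list of SQL placeholders. The function names, table name and area IDs are made up for the example.

```python
def placeholders_with_slice(values):
    # Older style: add a comma after every item, then slice off the last one.
    query = ""
    for _ in values:
        query += "?,"
    return query[:-1]


def placeholders_with_join(values):
    # str.join only puts the separator between items, so nothing needs trimming.
    return ",".join("?" for _ in values)


area_ids = ["area-1", "area-2", "area-3"]
sql = f"SELECT id FROM areas WHERE id IN ({placeholders_with_join(area_ids)})"
```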
Chris Hill-Scott	de364bba3c	Make `overlapping_areas` a cached property It’s quite expensive to calculate and there’s no guarantee we’ll only use it once.	2021-08-06 13:28:41 +01:00
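A small sketch of what a cached property buys you, using `functools.cached_property` from the standard library (Python 3.8+). Whether the repository uses this decorator or another implementation is not stated in the commit message, and the `CustomArea` class here is a hypothetical stand-in.

```python
from functools import cached_property


class CustomArea:
    """Hypothetical stand-in for a class with an expensive derived value."""

    def __init__(self, polygons):
        self.polygons = polygons
        self.calculations = 0  # counts how often the expensive work runs

    @cached_property
    def overlapping_areas(self):
        # Pretend this is the expensive lookup; it runs once per instance
        # and the result is stored on the instance afterwards.
        self.calculations += 1
        return sorted(set(self.polygons))


area = CustomArea(["b", "a", "b"])
first = area.overlapping_areas
second = area.overlapping_areas
```

The second access returns the stored list without repeating the calculation.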
Chris Hill-Scott	5e1b96a3a7	Remove argument unpacking from `get_areas` Making it only callable in one way is just less stuff to understand.	2021-08-06 13:28:40 +01:00
Chris Hill-Scott	775954da9d	Avoid doing a single SQL query per overlapping area To count phones in a custom polygon we need to work out the percentage of overlap with each known area. This means we need to get each known area from the database to compare it. At the moment we do this by running: - one SQLite query to get the details of all matching areas - a loop, which performs one SQLite query per area to get the polygons This commit reduces the number of SQLite queries to one, which uses a `JOIN` to get both the details of the areas and their polygons. This gives a speed increase of about 25% for a big area like Lincolnshire.	2021-08-06 13:28:40 +01:00
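The single-query approach described above could look roughly like this. It is a sketch against an in-memory SQLite database; the table names, column names and data are assumptions for illustration, not the schema of the real broadcast areas database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE areas (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE polygons (area_id TEXT, polygon TEXT);
    INSERT INTO areas VALUES ('a1', 'Area one'), ('a2', 'Area two');
    INSERT INTO polygons VALUES
        ('a1', '[[0,0],[0,1],[1,1]]'),
        ('a2', '[[2,2],[2,3],[3,3]]');
""")

area_ids = ["a1", "a2"]
placeholders = ",".join("?" for _ in area_ids)

# One query returns each area's details together with its polygon,
# instead of one query for the areas plus one more query per area.
rows = conn.execute(
    f"""
    SELECT areas.id, areas.name, polygons.polygon
    FROM areas
    JOIN polygons ON polygons.area_id = areas.id
    WHERE areas.id IN ({placeholders})
    ORDER BY areas.id
    """,
    area_ids,
).fetchall()
```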
Chris Hill-Scott	e7ec77c5bb	Make calculating overlapping areas faster By using the simplified polygons instead of the full resolution ones we: - query less data from SQLite - pass less data around - give Shapely a less complicated shape to do its calculations on This makes it faster to calculate how much of each electoral ward a custom area overlaps. For the two areas in our tests:
Place represented by custom area | Before | After
---------------------------------|--------|--------
Bristol                          | 0.07s  | 0.02s
Skye                             | 0.02s  | 0.01s	2021-08-06 13:28:40 +01:00
Ben Thorner	297ab3e5ae	Rename demo area to match govuk-alerts Relates to: https://github.com/alphagov/notifications-govuk-alerts/pull/152 I ran the "create-broadcast-areas-db.py" script to regenerate the Sqlite DB. Existing alerts with the old naming still appear correctly, and since we don't (yet) store this text in the DB, there's nothing more to update.	2021-08-02 15:34:55 +01:00
Chris Hill-Scott	a766324559	Make the max polygon point count a constant And document it in context.	2021-07-06 17:00:51 +01:00
Chris Hill-Scott	e4ca78634d	Bump utils to bring in new polygon simplification We’ve changed our simplification a bit so: - polygons have slightly more points (see https://github.com/alphagov/notifications-utils/pull/873) - the individual points have less precision (see https://github.com/alphagov/notifications-utils/pull/872) Overall this reduces the size of the data we’re storing from 74MB to 63MB, and should make any pages where we are rendering lots of coordinates load a bit quicker.	2021-07-06 17:00:50 +01:00
Chris Hill-Scott	5a378fe51f	Use CustomBroadcastArea to estimate phones in bleed area Our current assumption is that the bleed area has the same population density as the broadcast area. This is particularly naïve when: - the bleed area overlaps the sea – no-one lives in the sea - the broadcast area is a village and the bleed area is the surrounding countryside - the broadcast area is adjacent to a densely populated area like a city We can be smarter about this now that we have a way of determining the number of phones in an arbitrary area, based on the known areas that we have population data about. Calculating the population in an overlap is a slightly more intensive calculation. So we only do it for areas which are small enough that it doesn’t slow things down too much. For larger areas we still use the more naïve algorithm.	2021-07-02 10:36:25 +01:00
Chris Hill-Scott	b47d04fbf6	Check that the simplification process hasn’t introduced bad data This is a good bit of future proofing against unintended mistakes in the simplification code.	2021-06-24 18:28:33 +01:00
Chris Hill-Scott	72cdad14d9	Run app/broadcast_areas/create-broadcast-areas-db.py	2021-06-24 18:28:33 +01:00
Chris Hill-Scott	779ac74fc7	Manually remove a coordinate from Bathavon South This is the only way I can think to stop this shape self-intersecting without drastically changing its area (i.e. filling the hole in the donut). This is the only area in our library which is a genuine donut and presents this problem	2021-06-24 18:28:21 +01:00
Chris Hill-Scott	62a2c524ab	Fix invalid polygons while importing geographic data Some of the polygons in our source data are invalid. An invalid polygon is one that self intersects, in other words has a point which causes the boundary of the shape to cross itself. This doesn’t cause an exception until we try to perform certain operations on one of these polygons, like intersecting them with another polygon. This is why we haven’t spotted that they are invalid until now. This commit adds checks so that as we import the polygons we make sure they are valid. If they are not valid, we can automatically fix them by just looking at the exterior boundary of the shape, and ignore any holes created by self intersection.	2021-06-24 18:10:50 +01:00
Ben Thorner	fba8d09875	Move broadcast model code into an explicit module Previously this was hidden away in an anonymous __init__.py file. I did think about splitting the models into individual files, like we do with the top-level models for the app. Since the models are only imported in one place - i.e. are all used together - it didn't seem worth the hassle, so I've kept them in one file.	2021-06-10 15:05:38 +01:00
Chris Hill-Scott	c9611e1cf7	Add another area to the library of test polygons	2021-05-10 16:09:02 +01:00
sakisv	bfa8dfe95e	Fix import order	2021-04-13 16:31:06 +03:00
Chris Hill-Scott	e7aad61220	Use pure Python Rtree library The Python rtree library we are using to build RTrees has a dependency on the C package libspatialindex. This package is not installed on PaaS, so it’s hard for us to use it. This commit changes the code to use a library called rtreelib instead. rtreelib doesn’t have a built in way to serialise the index it builds, so I’ve had to implement that using pickle.	2021-04-13 12:43:28 +01:00
Chris Hill-Scott	83c521915c	Estimate number of phones in an arbitrary polygon We want to know how many phones are in a user-supplied polygon, so we can show the impact of a broadcast, in the same way that we do when users pick areas from our library. We already know how many phones are in each electoral ward. But there are challenges with an arbitrary polygon: - where it does overlap a ward, the overlap could be partial - it could overlap more than one ward - finding out which wards it overlaps by brute force (looping through all the wards and seeing which ones intersect with our polygon) would be way too slow to do in real time Instead we can use a data structure called an R-tree[1] to build an index which provides a much, much faster way of looking up which polygons overlap another. We can build this tree in advance and save it somewhere, which means there’s a lot of computation we don’t need to do in real time. The R-tree returns a set of objects (ward IDs) which we can go and look up in our library of electoral wards. These wards will be the ones that might have some overlap with our custom polygon. Once we have this small set of wards which might overlap our polygon, we can look at the size of the area of overlap (relative to the size of the whole ward) and multiply that by the known count of phones in that ward to get an approximation of the count of phones in the overlap area. Summing these approximations gives an estimate for the whole area of the custom polygon. 1. https://en.wikipedia.org/wiki/R-tree	2021-04-12 15:45:48 +01:00
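The overlap-weighted sum described above can be sketched as follows. To keep the example self-contained, axis-aligned rectangles stand in for real polygons and a plain loop stands in for the R-tree lookup; the `Box` class, the ward shapes and the phone counts are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle, standing in for a real polygon."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    @property
    def area(self):
        return (self.max_x - self.min_x) * (self.max_y - self.min_y)

    def overlap_area(self, other):
        width = min(self.max_x, other.max_x) - max(self.min_x, other.min_x)
        height = min(self.max_y, other.max_y) - max(self.min_y, other.min_y)
        # A negative width or height means the boxes do not overlap.
        return max(width, 0) * max(height, 0)


def estimate_phones(custom_area, wards):
    """Sum each ward's phone count, scaled by the fraction of the ward covered."""
    total = 0.0
    for ward_box, count_of_phones in wards:
        fraction = custom_area.overlap_area(ward_box) / ward_box.area
        total += fraction * count_of_phones
    return total


wards = [
    (Box(0, 0, 10, 10), 1000),   # half of this ward is covered
    (Box(10, 0, 20, 10), 400),   # a quarter of this ward is covered
    (Box(50, 50, 60, 60), 900),  # no overlap at all
]
custom_area = Box(5, 0, 12.5, 10)
estimate = estimate_phones(custom_area, wards)  # 500 + 100 + 0
```

In a real setting the spatial index would narrow `wards` down to the candidates first, so the loop only touches a handful of shapes.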
Richard Baker	02600d76bd	Create additional non-UK broadcast test polygons This allows MNOs to test delivery to multiple non-adjacent cells without risk of sending a broadcast on the public network. This will also support testing of multiple polygon geometries in a single message. Test polygons are all non-UK (northern Finland). Signed-off-by: Richard Baker <richard.baker@digital.cabinet-office.gov.uk>	2021-03-31 10:00:39 +01:00
Chris Hill-Scott	fc75d60f65	Refactor BroadcastAreas to reuse common methods This commit makes an abstract base class for broadcast areas, so that methods and properties which are common between `BroadcastArea`s (those which come from our library) and `CustomBroadcastArea`s (those supplied via the API) can be shared.	2021-03-22 11:07:43 +00:00
Chris Hill-Scott	57aa994ce9	Add docstring	2021-03-19 15:47:18 +00:00
Chris Hill-Scott	a74db6eaa7	Handle areas which don’t have population data If an area has a `count_of_phones` value of `0` it means we don’t have data about the population. This means we can’t do the maths to work out the estimated bleed. So we should return the default amount of bleed of 1,500m instead, which is something in between what we’d expect for a built up area and a rural area.	2021-03-19 15:47:18 +00:00
Chris Hill-Scott	4367908269	Add limits to max/min bleed This prevents us from giving unrealistically large or small bleed estimates in case we have areas which are more dense or less dense than the most/least dense areas we currently have. Also means we don’t have to treat City of London as a special case.	2021-03-19 15:47:18 +00:00
Chris Hill-Scott	738ac1d818	Vary bleed amount based on population density There are basically two kinds of 4G masts:
Frequency | Range        | Bandwidth
----------|--------------|----------------------------------
800MHz    | Long (5km)   | Low (can handle a bit of traffic)
1800MHz   | Short (500m) | High (can handle lots of traffic)
The 1800MHz masts are better in terms of how much traffic they can handle and how fast a connection they provide. But because they have quite short range, it’s only economical to install them in very built up areas†. In more rural areas the 800MHz masts are better because they cover a wider area, and have enough bandwidth for the lower population density. The net effect of this is that cell broadcasts in rural areas are likely to bleed further, because the masts they are being broadcast from are less precise. We can use population density as a proxy for how likely it is to be covered by 1800MHz masts, and therefore how much bleed we should expect. So this commit varies the amount of bleed shown based on the population density. I came up with the formula based on 3 fixed points: - The most remote areas (for example the Scottish Highlands) should have the highest average bleed, estimated at 5km - A town, like Crewe, should have about the same bleed as we were estimating before (1.5km) – Pete D thinks this is about right based on his knowledge of the area around his office in Crewe - The most built up areas, like London boroughs, could have as little as 500m of bleed Based on these three figures I came up with the following formula, which roughly gives the right bleed distance (`b`) for each of their population densities (`d`):
```
b = 5900 - (log10(d) × 1_250)
```
Plotted on a curve it looks like this: This is based on averages – remember that the UI shows where is _likely_ to receive the alert, based on bleed, not where it’s _possible_ to receive the alert. 
Here’s what it looks like on the map: --- †There are some additional subtleties which make this not strictly true: - The 800MHz masts are also used in built up areas to fill in the gaps between the areas covered by the 1800MHz masts - Switching between masts is inefficient, so if you’re moving fast through a built up area (for example on a train) your phone will only use the 800MHz masts so that you have to hand off from one mast to another less often	2021-03-18 09:37:23 +00:00
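The formula from the commit above, combined with the limits and the default described in 4367908269 and a74db6eaa7, might be sketched like this. The formula and the 1,500m default come from the commit messages; the exact minimum and maximum (500m and 5,000m here) are assumptions taken from the three fixed points, and the units of the density `d` are not stated in the source.

```python
import math

DEFAULT_BLEED_M = 1_500  # used when there is no population data
MIN_BLEED_M = 500        # assumed lower limit
MAX_BLEED_M = 5_000      # assumed upper limit


def estimated_bleed_in_m(density):
    """Estimate bleed in metres from population density.

    `density` is in whatever units the original formula was fitted to.
    """
    if not density:
        # A density of zero means a count_of_phones of 0, i.e. no data.
        return DEFAULT_BLEED_M
    bleed = 5_900 - (math.log10(density) * 1_250)
    # Clamp so unusually dense or sparse areas give realistic estimates.
    return max(MIN_BLEED_M, min(MAX_BLEED_M, bleed))


sparse = estimated_bleed_in_m(1)          # formula gives 5,900, clamped down
dense = estimated_bleed_in_m(1_000_000)   # formula goes negative, clamped up
no_data = estimated_bleed_in_m(0)
```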
David McDonald	3e80ba4734	Fix flake8 and isort errors Note, isort now has default behaviour of searching recursively so we no longer need the `-rc` flag	2021-03-08 18:48:56 +00:00
Chris Hill-Scott	f55a8bf4b8	Add library of test areas This is a temporary addition so we can test out some functionality.	2021-02-19 11:35:51 +00:00
Chris Hill-Scott	769b85ff25	Replace polygons module with the one from utils We moved it in https://github.com/alphagov/notifications-utils/pull/818/files	2021-02-12 14:52:53 +00:00
Chris Hill-Scott	60aa2d2b42	Display areas that aren’t in the library	2021-01-26 10:49:47 +00:00
Chris Hill-Scott	76f83f7d2a	Merge pull request #3652 from alphagov/updated-bristol-boundaries Update local authority district GeoJSON to bring in fixes for Bristol	2020-09-29 13:32:32 +01:00
Chris Hill-Scott	04e53c72b3	Update shapes to bring in fixes for Bristol I emailed the Geography team at the ONS: > Hi geography team, > > I work on GOV.UK Notify, which is a service run by Government Digital Service (part of the Cabinet Office). I was given your email address by [redacted] who’s been helping answer some of my questions on the cross-government Slack. > > We’re using some of the boundary datasets from the Open Geography Portal, and mostly they’ve been excellent. > > In the abstract, the problem we’re trying to solve is, given a point outside an area, what is the minimum distance to a point within that area. So, for example, if a crow was somewhere in Cardiff, what’s the shortest distance it would have to fly to reach somewhere in the Bristol local authority district? > > We’ve noticed some problems with the data that means our calculations would be wrong. We’ve noticed this around Torquay, Norwich and Bristol. Here are some screenshots of Bristol, from the generalised and full resolution boundaries: > > The artefacts I’ve highlighted are closer to Cardiff than any actual part of the land area of Bristol. They are either: > - in the sea > - land that’s part of North Somerset > > I suspect that this is being caused by the process of clipping the actual region of Bristol (which, unusually, extends into the water) to the mean high water line. > > I’ve worked around this by filtering out any polygons that are smaller than ~7,500m². It’s a bit hacky because parts of the Scilly Isles start disappearing. That’s not a problem for what I’m working on, but it would be nice to not need the hack. > > So my questions would be: > > - Is there a better way to remove these artefacts than filtering by area? > - Is there a plan to remove these artefacts from the data in future releases? > > Thanks in advance, > Chris They emailed back to say: > Hi Chris > > Thank you for your enquiry. 
> > We have completed the amendments to the LAD MAY 2020 BFC and BGC boundaries as mentioned so you should be able to download them from the portal now. > > Hope this helps. > > Kind regards > [redacted] This commit brings in the files they’ve updated. We still have to do some filtering (but now at a higher resolution) because they haven’t fixed Norwich yet. I’ll email them separately about that.	2020-09-25 12:24:23 +01:00
Chris Hill-Scott	e7169ad902	Add instructions for converting Shapefiles	2020-09-24 13:19:27 +01:00
Chris Hill-Scott	f50ef84c0d	Suggest previously-used areas when adding new area If you’re adding another area to your broadcast it’s likely to be close to one of the areas you’ve already added. But we make you start by choosing a library, then you have to find the local authority again from the long list. This is clunky, and it interrupts the task the user is trying to complete. We thought about redirecting you somewhere deep into the hierarchy, perhaps by sending you to either: - the parent of the last area you’d chosen - the common ancestor of all the areas you’d chosen This approach would however mean you’d need a way to navigate back up the hierarchy if we’d dropped you in the wrong place. And we don’t have a pattern for that at the moment. So instead this commit adds some ‘shortcuts’ to the choose library page, giving you a choice of all the parents of the areas you’ve currently selected. In most cases this will be one (unitary authority) or two (county and district) choices, but it will scale to adding areas from multiple different authorities. It does mean an extra click compared to the redirect approach, but this is still fewer, easier clicks compared to now. This meant a couple of under-the-hood changes: - making `BroadcastArea`s hashable so it’s possible to do `set([BroadcastArea(…), BroadcastArea(…), BroadcastArea(…)])` - making `BroadcastArea`s aware of which library they live in, so we can link to the correct _Choose area_ page	2020-09-22 17:33:04 +01:00
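One way to make objects usable in a `set`, as the commit above describes, is to define `__eq__` and `__hash__` on a stable identifier. This is a sketch with a hypothetical `Area` class; how `BroadcastArea` actually implements hashing is not shown in the commit message.

```python
class Area:
    """Hypothetical, simplified stand-in for a broadcast area."""

    def __init__(self, id, name, library_id):
        self.id = id
        self.name = name
        self.library_id = library_id  # which library the area lives in

    def __eq__(self, other):
        return isinstance(other, Area) and self.id == other.id

    def __hash__(self):
        # Must agree with __eq__: equal areas hash to the same value.
        return hash(self.id)


# Two selected wards can share a parent; a set keeps each parent once.
parents = {
    Area("lad-1", "District one", "library-a"),
    Area("lad-1", "District one", "library-a"),
    Area("lad-2", "District two", "library-a"),
}
parent_ids = sorted(area.id for area in parents)
```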
Chris Hill-Scott	dd8ce7d5bd	Merge pull request #3631 from alphagov/delete-plot-areas Delete plot-areas.py	2020-09-17 11:41:16 +01:00
Chris Hill-Scott	8a413bec91	Merge pull request #3617 from alphagov/population-estimates Give estimates of the number of phones in a broadcast area	2020-09-17 11:41:00 +01:00
Chris Hill-Scott	76244d8c07	Handle areas with missing data At the moment there are some areas which have: - a `count_of_phones` value of `None` - no sub-areas This is wrong, but until we fix the data the phone counting code needs to handle this. This commit: - adds the `or 0` in the right place (where it will catch these areas with missing data) - adds a test which checks these areas, and compares them to other kinds of areas	2020-09-17 11:02:22 +01:00
Chris Hill-Scott	49195cb0d3	Rename constants to populations This is a better name for the module because it’s: - not just constants, there’s a method in here now - only stuff to do with populations, not other kinds of constants	2020-09-16 14:45:45 +01:00
Chris Hill-Scott	3047af2c13	Refactor to make testing easier	2020-09-16 11:33:57 +01:00
Chris Hill-Scott	b9f75218d1	Add tests to ensure all areas have a count	2020-09-16 11:20:22 +01:00
Chris Hill-Scott	6b3fe3c5c5	Delete plot-areas.py We don’t need this now that the admin app can show areas while running locally.	2020-09-16 09:11:01 +01:00
Chris Hill-Scott	ce35200453	Rename variable to be clearer Better name than `population`, and `smartphone_ownership_for_area_by_age_range` matches with `SMARTPHONE_OWNERSHIP_BY_AGE_RANGE`	2020-09-16 08:46:59 +01:00
Leo Hemsted	c2e737b323	Merge pull request #3618 from alphagov/fix-broadcast-area-count generate library summary in python	2020-09-14 16:47:36 +01:00
Chris Hill-Scott	8ea3f0141c	Give estimates of the number of phones in a broadcast area We need to give people a better feel for the consequences of broadcasting an alert. We’ve seen in research that some users will assume it is subscription based, or opt-in, rather than going to every phone in the area. I reckon that the most effective way to communicate this is to put some numbers next to the areas, to give people an idea of how many people will get alerted. We can estimate how many phones are in an area by: - taking the population of all electoral wards in that area - multiplying it by the percentage of people who own an internet connected phone[1] The Office for National Statistics publish both these datasets. The number of people who own an internet connected phone varies a lot by age. Since the population data for each ward is broken down by age we can factor this in. Simplified, the calculation looks like this: - take the _Abbey_ ward of _Barking and Dagenham_ - in this ward there are 26 people aged 80 - 40% of people over 65 have an internet-connected phone - therefore 10 of these 80-year-olds would be likely to receive a broadcast - (repeat for all other ages) These numbers won’t be exact, but should be enough to give people a feel for the severity of what they’re about to do. We can see if they achieve this aim in user research. 1. This is a proxy for the number of people who are likely to have a 4G capable phone, because only 4G capable phones will be receiving broadcasts to begin with	2020-09-14 16:26:09 +01:00
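The simplified calculation above can be worked through in a few lines. Only the 26 people aged 80 and the 40% figure for over-65s come from the commit message; the other ownership figures, the age bands and the function names are made up for illustration, and treating age 65 as part of the "over 65" band is an assumption.

```python
SMARTPHONE_OWNERSHIP_BY_AGE_RANGE = {
    (0, 15): 0.0,    # made-up value
    (16, 64): 0.9,   # made-up value
    (65, 120): 0.4,  # "40% of people over 65" from the commit message
}


def ownership_for_age(age):
    for (youngest, oldest), proportion in SMARTPHONE_OWNERSHIP_BY_AGE_RANGE.items():
        if youngest <= age <= oldest:
            return proportion
    raise ValueError(f"No ownership figure for age {age}")


def estimate_phones(population_by_age):
    # Multiply the number of people of each age by the ownership rate
    # for their age band, then add the results together.
    return sum(
        people * ownership_for_age(age)
        for age, people in population_by_age.items()
    )


eighty_year_olds_in_abbey = {80: 26}
phones = round(estimate_phones(eighty_year_olds_in_abbey))  # 26 × 0.4, rounded
```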
Leo Hemsted	ef0564f046	generate library summary in python much simpler than sqlite. also remove oxford commas Co-authored-by: Chris Hill-Scott <me@quis.cc>	2020-09-14 15:25:04 +01:00
Chris Hill-Scott	858d1ee197	Increase threshold for minimum polygon size We filter out very small polygons from the original data to remove glitches. These glitches are caused by trying to subtract the water from a polygon that includes some land and some water, but using two different definitions or resolutions of mean high water line. If we don’t do this then we end up with a bunch of very small polygons which lie far outside the understood area of a place, causing large overspill. We need to increase the threshold for this process because we’re still seeing this problem around Bristol and Norwich. This does mean we lose a few very small polygons in places like Shetland and the Scilly Isles, but not in such a way that we would avoid broadcasting to them (because they’d still be caught by the simplification and overspill).	2020-09-14 11:32:02 +01:00
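Filtering out very small polygons by area, as described above, could be sketched like this. The shoelace formula is a standard way to get the area of a simple polygon from its coordinates; the threshold value and function names are hypothetical, and the sketch assumes coordinates in a flat (projected) system where areas are meaningful.

```python
def polygon_area(points):
    """Shoelace formula for a simple polygon given as a list of (x, y) pairs."""
    total = 0.0
    # Pair each point with the next one, wrapping round to the first.
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


MIN_AREA = 10_000  # hypothetical threshold, in squared coordinate units


def drop_small_polygons(polygons, min_area=MIN_AREA):
    return [polygon for polygon in polygons if polygon_area(polygon) >= min_area]


mainland = [(0, 0), (200, 0), (200, 200), (0, 200)]  # area 40,000
glitch = [(500, 500), (510, 500), (510, 510), (500, 510)]  # area 100
kept = drop_small_polygons([mainland, glitch])
```

Raising `min_area` removes more glitches at the cost of also dropping genuinely small islands, which is the trade-off the commit message describes.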
Chris Hill-Scott	5e579ed45c	Merge pull request #3595 from alphagov/map-key Add a key to the map	2020-09-09 16:03:27 +01:00
Leo Hemsted	d654323eb8	remove unused fn	2020-09-09 14:39:13 +01:00
Leo Hemsted	9e132263d2	make tests pass (acknowledge that code is wrong) i really don't want to fix this right now but that total isn't quite right	2020-09-09 14:39:13 +01:00
Leo Hemsted	bc7d3710ab	make sure countries library still returns values to recap the previous commit, in the ward->local authority->county library we want to return all local authorities and counties. We do this by excluding anything that doesn't have children. However, in the countries library, all four countries don't have children. I can't think of a generic way to separate these so just filter on the library id	2020-09-09 14:39:13 +01:00

82 Commits