* Removed all govuk css
* Updated reference files
* Removing govuk js
* Fixed casing for modules, removed unused page
* Got more reference images
* Updated template page
* Removed govuk padding util
* Updated hint to uswds hint
* More govuk cleanup
* Committing backstopjs ref files
* Fixed all unit tests that broke due to brittleness around govuk styling
* Added new ref images
* Final removal of govuk
* Officially removed all govuk references
* Updated reference file
* Updated link to button
* UI modernization
* Cleanup
* removed govuk escaping tests since they are no longer needed
* Fix CodeQL security issue in escapeElementName function
- Escape backslashes first before other special characters
- Prevents potential double-escaping vulnerability
- Addresses CodeQL alert about improper string escaping (see the sketch after this list)
* Found more govuk removal. Fixed unit tests
* Add missing pipeline check to pre-commit
* updated test
* Updated e2e test
* More update to e2e test
* Fixed another e2e test
* Simple PR comments addressed
* More updates
* Updated backstop ref files
* Refactored folder selection for non-admins
* Updated redundant line
* Updated tests to include correct mocks
* Added more ref files
* Addressing Carlo's comments
* Addressing Carlo's comments, cleanup of window initialization
* More cleanup and addressing Carlo's comments
* Fixing a11y scan
* Fixed a few issues with javascript
* Fixed for pr
* Fixing e2e tests
* Tweaking e2e test
* Added more ref files and cleaned up urls.js
* Fixed bug with creating new template
* Removed brittle test - addressed CodeQL comment
* e2e race condition fix
* More e2e test fixes
* Updated e2e tests to not wait for text sent
* Updated test to not wait for button click response
* Made teardown more resilient if staging is down
* reverted e2e test to what was working before main merge
* Updated backstopRef images
* Updated gulp to include job-polling differently
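A minimal sketch of the escaping-order point from the escapeElementName fix in the list above (illustrative only; the real function may differ in detail):

```python
def escape_element_name(name: str) -> str:
    # Escape the escape character itself first; if this ran last, the
    # backslashes added by the replacements below would themselves be
    # escaped, producing doubled escapes.
    name = name.replace("\\", "\\\\")
    # Then escape the remaining special characters (set chosen for illustration).
    for char in ('"', "'", "<", ">"):
        name = name.replace(char, "\\" + char)
    return name
```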
This changeset removes webauthn from the Notify.gov admin app. We are not using webauthn at all in our implementation and will be looking at an entirely different authentication system in the near future.
Signed-off-by: Carlo Costino <carlo.costino@gsa.gov>
Daily volumes report: total volumes across the platform aggregated by whole business day (bst_date)
Volumes by service report: total volumes per service aggregated by the date range given.
NB: start and end dates are inclusive
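To make "inclusive" concrete, a minimal sketch (function name and shape are illustrative, not the report code):

```python
from datetime import date, timedelta

def report_days(start_date: date, end_date: date):
    """Yield every day in the report window; both endpoints are included."""
    day = start_date
    while day <= end_date:  # `<=` keeps end_date in the range
        yield day
        day += timedelta(days=1)
```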
This stops adding the current user to the data sent to the API when
removing a user from an organisation. The API only needs to know the
organisation_id and the id of the user we want to remove from
the organisation, so we don't need to pass through the id of the
current user too.
The other change made is to clear the user cache of the user who has
been removed from the org. We don't need to clear any of the organisation
caches, since these values don't contain lists of users for the orgs.
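A minimal sketch of the resulting shape, assuming an API client base with a `delete` helper and a `redis_client` along these lines (names are illustrative, not the exact admin code):

```python
class OrganisationsAPIClient:
    def remove_user_from_organisation(self, org_id, user_id):
        # The API only needs the organisation id and the id of the user being
        # removed; the current user's id is no longer included in the request.
        self.delete(f"/organisations/{org_id}/users/{user_id}")
        # Clear the cached copy of the removed user. The organisation caches
        # are left alone because they don't contain lists of users.
        redis_client.delete(f"user-{user_id}")
```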
It’s weird to have a method just for updating one attribute. I think
the reason for doing this was to only invalidate the
`organisation-{}-name` cache when absolutely necessary, but:
- we don’t need a separate method to check whether it’s the name being
updated
- it was easy to get around this by calling
`OrganisationsClient.update_organisation` directly, leaving a stale
value in the cache
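A minimal sketch of the consolidated approach, assuming a single update method and a `redis_client` helper along these lines (illustrative, not the exact client code):

```python
class OrganisationsClient:
    def update_organisation(self, org_id, **kwargs):
        # One method handles every attribute update.
        self.post(f"/organisations/{org_id}", data=kwargs)
        redis_client.delete(f"organisation-{org_id}")
        # The name cache only needs invalidating when the name is actually
        # part of the update, so callers no longer have a reason to bypass
        # this method and leave a stale value behind.
        if "name" in kwargs:
            redis_client.delete(f"organisation-{org_id}-name")
```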
We have a `client_request` fixture which does a bunch of useful stuff
like:
- checking the status code of the response
- returning a `BeautifulSoup` object
Lots of our tests still use an older fixture called `logged_in_client`.
This is not as good because:
- it returns a raw `Response` object
- doesn’t do the additional checks
- means our tests contain a lot of repetitive boilerplate like `page = BeautifulSoup(response.data.decode('utf-8'), 'html.parser')`
This commit converts all the tests using `logged_in_client` to
use `client_request` instead.
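Roughly what a converted test looks like before and after (the endpoint, assertions and `SERVICE_ONE_ID` constant are illustrative):

```python
from bs4 import BeautifulSoup
from flask import url_for

# Before: logged_in_client returns a raw Response, so every test repeats
# the status check and the BeautifulSoup boilerplate.
def test_page_has_heading(logged_in_client):
    response = logged_in_client.get(url_for('main.service_settings', service_id=SERVICE_ONE_ID))
    assert response.status_code == 200
    page = BeautifulSoup(response.data.decode('utf-8'), 'html.parser')
    assert page.h1.text.strip() == 'Settings'

# After: client_request checks the status code for us and hands back a
# BeautifulSoup page directly.
def test_page_has_heading(client_request):
    page = client_request.get('main.service_settings', service_id=SERVICE_ONE_ID)
    assert page.h1.text.strip() == 'Settings'
```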
This updates the `get_notifications_for_service` method of the
`NotificationApiClient` to make a POST request to the API when the `to`
field is provided. This is done so that we avoid logging personal
details such as email addresses.
If the `to` field is not present, the method will make a GET request and
there will be no change to how it works.
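A minimal sketch of that behaviour, assuming client helpers shaped roughly like these (illustrative):

```python
class NotificationApiClient:
    def get_notifications_for_service(self, service_id, to=None, **params):
        if to is not None:
            # Put the search term in the request body rather than the query
            # string, so email addresses and phone numbers stay out of logs.
            return self.post(f"/service/{service_id}/notifications", data={"to": to, **params})
        # Without a `to` filter there is nothing personal to protect, so the
        # existing GET behaviour is unchanged.
        return self.get(f"/service/{service_id}/notifications", params=params)
```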
This stat is shown on a few of our pages, such as our homepage,
the performance page and also a platform admin page,
and is currently cached for the
default TTL of 1 week. There is no reason we can't cache
this for only an hour, giving slightly more up to date stats
which will refresh more regularly.
This mimics the approach, and the TTL choice of 1 hour, that has
been added for the performance page (although there is no
particular bug here to fix, it is just nice to have slightly more
up to date data).
Note, the API call only takes about 0.3 seconds at the moment
so it is not particularly intensive on the DB to run this more
regularly.
I came across a bug where the performance page might be visited
on say the 22nd and we expect it to show data for yesterday and
the past 7 days (so the 21st and back 7 days) but instead it
only shows it for the 20th and then back 6 days.
The cause is that we have nightly tasks that run to:
- calculate the number of notifications sent per service (the
ft_notification_status table)
- calculate the number of notifications sent within 10 seconds
(the ft_processing_time table)
Both of these tables get filled between the hours of midnight and
2:30am. The first seems to be filled by about 12:30 every night,
whereas the processing stats seem to be filled by about 2am every
night.
The admin app currently caches the results of the call to the API
(https://github.com/alphagov/notifications-api/blob/master/app/performance_dashboard/rest.py#L24)
to get this data with the key
performance-stats-{start_date}-to-{end_date}.
Now the issue here is that if the performance page is visited at
00:01 on the 23rd, it will call the API for data on the 22nd and
backwards 7 days. The data it gets at 00:01 will not yet have the
data for the 22nd, it will only go as far as the 21st, because
the overnight jobs to put data in the tables have not run yet. The
admin app then caches this data, meaning that for the rest of the
day the page will be missing data for the 22nd, even though it
becomes available after approx 2am. We have seen it a few times
in production where the performance page has indeed been visited
before say 2 am, leaving the page with missing data.
In terms of solutions, there are several potential options.
1. Remove the caching that we do in redis. The downside of this is
that we will hit the API more (currently once a day for the first
visitor to the performance page). We have had 255 requests
to the performance page in the last week and when there is no value
in redis, the call to the API takes approximately 1-2 seconds.
Removing the caching would go against the reason it was originally
added, which was to act as a basic protection against malicious
DOS attempts.
2. Make the admin app only set the cache value if it is after
2:30am. This one feels a bit gross and snowflakey.
3. The API flushes the redis cache after it has finished its nightly
tasks. We don't have that much precedent of the API interacting
with redis (it is mostly the admin app that does), but this is still
a valid option.
4. Cache in redis the data for 2 days ago to 7 days ago and then
get the data for yesterday freshly every time. This would mean
some things are still cached but the smallest bit that we can't rely
on can be gathered fresh.
5. Remove the caching we do in redis but then try and optimise the
API endpoint to return results even faster and with less
interaction with the DB. My hypothesis is that the slowest part
is where the API has to calculate how many notifications Notify
has ever sent
(https://github.com/alphagov/notifications-api/blob/master/app/performance_dashboard/rest.py#L36)
and so we could potentially store these totals in the DB and update
them nightly, rather than having to query all the data in the table
to figure them out every time.
6. Reduce the expiry time of the cache key so although the bug will
still exist, it will only exist for a much shorter time.
In the end, we've gone for option 6. The user impact of
this bug sticking around for an hour is minimal, in particular
because that would be in the early hours of the morning. The solution
is also quick to implement and keeps the protection against a DOS attack.
The value of 1 hour for expiry was a fairly arbitrary choice; it could
have been as little as 1 minute or as much as 6 hours.
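A minimal sketch of option 6, assuming a redis client with a set-with-expiry call (the cache key format comes from the existing code; the client names are illustrative):

```python
import json

ONE_HOUR_IN_SECONDS = 60 * 60

def get_performance_stats(start_date, end_date):
    cache_key = f"performance-stats-{start_date}-to-{end_date}"
    cached = redis_client.get(cache_key)
    if cached is not None:
        return json.loads(cached)
    stats = performance_api_client.get_performance_dashboard(start_date, end_date)
    # Expire after an hour instead of the default week, so a page view before
    # the overnight jobs finish only leaves stale data around for up to an hour.
    redis_client.set(cache_key, json.dumps(stats), ex=ONE_HOUR_IN_SECONDS)
    return stats
```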
When the broadcast service settings form is submitted it now removes all
permissions for users in notifications-api. This means it should be
clearing the user cache.
The JS `fetch` function will follow redirects blindly and return you the
final 200 response. When there's an error, we don't want to go anywhere,
and we want to use the Flask `flash` functionality to pop up an error
message (the likely reason for seeing this is using a yubikey that isn't
associated with your user). Using `flash` and then
`window.location.reload()` handles this fine.
However, when the user does log in successfully we need to properly log
them in - this includes:
* checking their account isn't over the max login count
* resetting failed login count to 0 if not
* setting a new session id in the database (so other browser windows are
logged out)
* checking if they need to revalidate their email access (every 90 days)
* clearing old user out of the cache
This code all happens in the ajax function rather than being in a
separate redirect, so that you can't just navigate to the login flow. I
wasn't able to unit test that function due to how it uses the session and
other Flask globals, so I moved the auth into its own function so it's
easy to stub out all that CBOR nonsense.
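A rough sketch of those post-authentication steps pulled into their own function, so the CBOR/webauthn verification can be stubbed in tests (the helpers, constants and user attributes here are illustrative, not the actual admin code):

```python
import uuid
from datetime import datetime, timedelta

from flask import abort, redirect, url_for
from flask_login import login_user


def complete_webauthn_login(user):
    # Refuse the login if the account is already over the maximum number of
    # failed attempts.
    if user.failed_login_count >= MAX_FAILED_LOGIN_COUNT:
        abort(403)
    # Otherwise reset the failed-attempt counter.
    user.reset_failed_login_count()
    # Issue a fresh session id so sessions in other browser windows are
    # logged out.
    user.update(current_session_id=str(uuid.uuid4()))
    # Require email revalidation if access hasn't been confirmed in 90 days.
    if user.email_access_validated_at < datetime.utcnow() - timedelta(days=90):
        return redirect(url_for('.revalidate_email'))
    # Clear the stale cached copy of the user before logging them in.
    redis_client.delete(f"user-{user.id}")
    login_user(user)
```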
TODO: We still need to pass any `next` URLs through the chain from login
page all the way through the javascript AJAX calls and redirects to the
log_in_user function