This works in conjunction with the new SMS provider stub [^1].
Local testing:
- Run the migrations to add Reach as an inactive provider.
- Activate the Reach provider locally and deactivate the others.
update provider_details set priority = 100, active = false where notification_type = 'sms';
update provider_details set active = true where identifier = 'reach';
- Tweak your local environment to point at the SMS stub.
export REACH_URL="http://host.docker.internal:6300/reach"
- Start / restart Celery to pick up the config change.
- Send a SMS via the Admin app and see the stub log it.
- Reset your environment so you can send normal SMS.
update provider_details set active = true where notification_type = 'sms';
update provider_details set active = false where identifier = 'reach';
[^1]: https://github.com/alphagov/notifications-sms-provider-stub/pull/10
This avoids duplicating it as we add a new provider and means we
can test it all in one place (although it wasn't tested before).
I'm not sure why the previous code did "super(..)__init__" in a
non-init function - it's a bit late! - so I've just replaced it
with a call to the new "init_app" function in the parent class.
The provider tests are coupled to actual data in the DB, but we
shouldn't have to overhaul the tests when this changes.
Assuming we don't delete old providers, just testing a subset of
the fixture data should give us enough confidence in the code.
This reduces the code to copy when we add a new provider. I don't
think we need to log the URL or status code each time:
- The URL is always the same.
- A "200" status code is implicit in "success".
- Other status codes will be reported as exceptions.
Removing these specific elements means "record_outcome" is generic
and can be de-duplicated in the base class.
This is never overridden and can't be used in practie because all
SMS clients have to use the same interface. Removing it will make
it possible to DRY-up some of the code in this method.
Previously we used a combination of "provider.name" and "get_name()"
which was confusing. Using a non-property function also gave me the
impression that the name was more dynamic than it actually is.
This is enough to update a notification in DB:
1. First create a notification in the UI and sent it.
2. Then reset its attributes to pretend it's for Reach.
update notifications set
sent_at = null,
sent_by = null,
notification_status='sending'
where id='some-uuid';
3. Change "notification_id" to "<some-uuid>" in the code.
4. Call the boilerplate endpoint for Reach callbacks.
curl -X POST localhost:6011/notifications/sms/reach
Interestingly there's no foreign key constraint on "sent_by" in the
DB, so this just works: the notification is updated.
These were hard to find by searching as the filename doesn't refer
to SMS or letters. Ideally our tests should be organised to reflect
the structure of files in app/, so this does that.
Fixes:
> reduced_provider = providers[identifier]
E KeyError: 'firetext'
Note that the mock return value in the other test was wrong [^1].
[^1]: bff97f0bbe/app/dao/provider_details_dao.py (L73)
Since we moved to Dockerised Celery the pycurl guidance should no
longer be necessary and doesn't work on newer Mac M1 machines.
This also adds new guidance for psycopg2, which will hopefully be
temporary (see issue comment).
there's not anything we know we need to do now that we resolve stuck
letters automatically. Letters couuld still get into this state, so it's
worth alerting us. However, we don't have anything concrete that we know
how to fix these letters, so we should just remove the runbook entirely.
Currently "test_send_letter_notification_via_api" fails at the final
stage in create-fake-letter-response-file [^1]:
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=6011): Max retries exceeded with url: /notifications/letter/dvla (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0xffff95ffc460>: Failed to establish a new connection: [Errno 111] Connection refused'))
This only applies when running in Docker so the default should still
be "localhost" for the Flask app itself.
[^1]: 5093064533/app/celery/research_mode_tasks.py (L57)
It is currently 60 seconds but we have had two incidents in the
past week where there is a connection error talking to a service
and the request takes up to 60 seconds before failing. When this
happens, if there are a few of these callbacks then all of them
will completely hog the service callback worker and build up a big
queue of all the other service callbacks.
5 seconds has been chosen as that is still a pretty decent length
time for a simple web request that should just be giving them a
little bit of information for them to store. 5 seconds should be a
sufficient enough reduction that we dramatically reduce this problem
for the moment.
Open to this number
being changed in the future based on how we see it perform.
Since sept 2019 we've had to log on to production around once every
twenty days to restart the virus scan task for a letter. Most of the
time this is just a case of making sure the file is in the scan bucket,
and then triggering the task. If the file isn't in the scan bucket we'd
need to do some more manual investigation to find out exactly where the
file got stuck, but I can only remember times when it's been in the scan
bucket.
So if the file is in the scan bucket, we can just check that with code
and kick the task off automatically.
This follows the pattern for invite emails where the admin app tells the
API which domain to use when generating the link.
This will starting working once the admin change is merged:
- [ ] TBC
It won’t break anything if it’s merged before the admin change.
Daily volumes report: total volumes across the platform aggregated by whole business day (bst_date)
Volumes by service report: total volumes per service aggregated by the date range given.
NB: start and end dates are inclusive