When we cloned the repository and started making modifications, we
didn't initially keep tests in step. This commit tries to get us to a
clean test run by skipping tests that are failing and removing some
that we no longer expect to use (MMG, Firetext), with the intention that
we will come back in future and update or remove them as appropriate.
To find all tests skipped, search for `@pytest.mark.skip(reason="Needs
updating for TTS:`. There will be a brief description of the work that
needs to be done to get them passing, if known. Delete that line to make
them run in a standard test run (`make test`).
* notify_db fixture creates the database connection and ensures the test
db exists and has migrations applied etc. It will run once per session
(test run).
* notify_db_session fixture runs after your test finishes and deletes
all non static (eg type table) data.
In unit tests that hit the database (ie: most of them), 99% of the time
we will need to use notify_db_session to ensure everything is reset. The
only time we don't need to use it is when we're querying things such as
"ensure get X works when database is empty". This is such a low
percentage of tests that it's easier for us to just use
notify_db_session every time, and ensure that all our tests run much
more consistently, at the cost of a small bit of performance when
running tests.
We used to use notify_db to access the session object for manually
adding, committing, etc. To dissuade usage of that fixture I've moved
that to the `notify_db_session`. I've then removed all uses of notify_db
that I could find in the codebase.
As a note, if you're writing a test that uses a `sample_x` fixture, all
of those fixtures rely on notify_db_session so you'll get the teardown
functionality for free. If you're just calling eg `create_x` db.py
functions, then you'll need to make you add notify_db_session fixture to
your test, even if you aren't manually accessing the session.
Previously we used a combination of "provider.name" and "get_name()"
which was confusing. Using a non-property function also gave me the
impression that the name was more dynamic than it actually is.
At the moment, when we are processing and sending an SMS we open
a DB connection at the start of the celery task and then close it
at the end of the celery task. Nice and simple.
However, during that celery task we make an HTTP call out to our
SMS providers. If our SMS providers have problems or response times
start to slow then it means we have an open DB connection sat waiting
for our SMS providers to respond which could take seconds. If our
SMS providers grind to a halt, this would cause all of the
celery tasks to hold on to their connections and we would run out
of DB connections and Notify would fall over.
We think we can solve this by closing the DB session which releases
the DB connection back to the pool.
Note, we've seen this happen in staging during load testing if our
SMS provider stub has fallen over. We've never seen it in production
and it may be less unlikely to happen as we are balancing traffic
across two providers and they generally have very good uptime.
One downside to be aware of is there could be a slight increase in
time spent to send an SMS as we will now spend a bit of extra time
closing the DB session and then reopening it again after the HTTP
request is done.
Note, there is no reason this approach couldn't be copied for our
email provider too if it appears successful.
As of 041d8b48a2
it’s not valid to call `random.choices` without giving at least one of
the options a positive weighting.
This makes sense, because giving a zero weighting is effectively saying
‘theres’s only one choice, but don’t choose it’.
In our codebase this is applicable where there’s only one international
provider, which we want to use even when it’s been de-prioritised for
domestic SMS.
This doesn’t cause a problem now, but will if we upgrade to Python
versions greater than 3.9.0.
We want to start using Firetext for sending international SMS. They
require us to use a different API key for international SMS because it
requires a new code path to switch the sender ID to something that the
country will accept.
This PR does not include switching the sender of international SMS to
Firetext but sets us up to do so.
Add caching by using the SeriralisedTemplate and SerialisedService objects
Removed extra call to the database to fetch the notification after the commit by saving the created_at and key_type to a local variable. After the update to the notification to mark it as sending the db.session is committed. Any reference to the the Notification data model after that will require a query to fetch the object again because it is considered "dirty" or out of date.
Added name, sms_prefix and email branding to SerialisedService.
Refactor the get_html_options to work with the SerialisedService object.
Removed the need to validate and format the to field by using `normalised_to`, since when persisting the notification the `normalised_to` field has already had this done.
Removed the validate and format for reply_to_text for email reply_to, this has been done when the email address has been added via the frontend, no need to validate this address every time a services sends an email.
By adding SerialisedTemplate we can avoid a database call for the template. This is useful when sending many many emails/sms for the same template/version.
Remove 2 extra select queries after the update and commit. Once a transaction is committed SQLAlchemy will query for the db model if referenced after a commit.
This means that anyone adding a new test to this file doesn’t have to
remember to clear the cache in their test, or forget to and have a
hard-to-debug test failure.
Using `setup_function` means we don’t have to convert this module into
using class-based tests.
For every email or text message we send we have to work out which
provider to send it with. Every time we do this we go and load the list
of providers from the database.
For emails, the result will always be the same.
For text messages the result is randomly chosen to balance the load
between the providers.
For international text messages the result is always the same (we only
have one international text message provider).
This commit adds an in-memory cache with a 2 second TTL so that we’re
not fetching the providers from the database every time, which should
speed things up a bit.
This does mean that, for text messages, the random choice will ‘stick’
for two seconds on each instance, before being re-chosen. I think this
is OK because it will even out to the same distribution over time.
I really don’t like having to clear the cache in the tests, so would
welcome suggestions on a better way of doing this…
i think it's causing havoc with my attempts to mock stuff in the
`app.clients` directory because it's also accessible at that path. the
name's super vague and doesn't explain what it is anyway
This prevents a race condition when we get delivery receipt before
updating notification to sending, and so the sending status would
supersede the delivered status, and the notification would time out
as temporary-failure after three days.
it's not acceptable for a constantly failing provider to take 50 minutes
to drain (5x reducing priority by 10). But similarly, we need _some_
delay, or a handful of concurrent failures will completely turn off a
provider, rendering the whole excercise kinda pointless. Setting the
delay before it tries to reduce priority again to one minute is nice
because it means that if one request times out and returns 502, then any
other requests that are in flight at that time will time out before the
one minute is up and not switch, but any requests made after the switch
that take sixty seconds to time out will affect it.
it now only needs to be used when you're:
* updating providers in ways that will create history (eg through
regular api calls)
* altering more than just priority in test setup (eg setting inactive,
deleting, or adding a provider)
Code that is within a `with Python.raises(...)` context manager but
comes after the line that raises the exception doesn't get evaluated.
We had some assertions that we never being tested because of this, so
this ensures that they will always get run and fixes them where
necessary.
- Change the NotificationTechnicalFailureException so that it only inherits from Exception.
- The notify_celery task should create the logging message on failure.
- Fix unit tests
- Remove named parameter when raising exception.
Utils 33.0.0 adds alt text to email branding - the HTMLEmailTemplate now
initializes slightly differently as a result (with both `branding_name`
and `branding_text`).
- Remove some redundant code for research mode.
- The international parameter in update_notification_to_sending is not needed.
- Update unit tests and removed duplicates
If we try to send an SMS to the provider and the provider throws an exception
(because they return a 503 status code) the notification should retry. But if
we get the callback from the provider before the notification has been retried, the
notification will have no billable units or provider set.
To avoid this, we now set billable_units and provider even if there has been
an exception from our provider.
‘GOV.UK’ doesn’t make sense as a type of brand. It only made sense as
a type of branding that a service had.
Since we’ve:
- deprecated the service branding column
- made sure it’s not used as a value in the email branding table
we can remove this value from the table of possible brand types.
https://www.pivotaltracker.com/story/show/159986276
We are now setting the type of branding on the branding itself, not on
the service.
This commit switches over from looking in the old place (on the service)
to looking in the new place (on the branding).