Previously we checked the notifications table, and if the results were
zero, checked the notification history table to see if there was data
in there. Even when we know the data isn't in notifications, we're
still checking. These queries take half a second per service, and we do
at least ten of them for each of the five thousand services we have in
notify. Most of these services have no data in either table for any
given day, so we can reduce the number of queries we do by only
checking one table.
Check the data retention for the service, and if the date is older
than the retention, get the data from the history table.
NOTE: This requires that the delete tasks haven't run yet for the day!
If your retention is three days, this will look in the Notification
table for data from three days ago - expecting that shortly after the
task finishes, we'll delete that data.
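Roughly what the decision looks like (a sketch only - in the real code the
retention comes from the service's data retention settings rather than being
passed in):

```python
from datetime import date, timedelta

def use_history_table(day, retention_days, today):
    """Return True if `day` is outside the retention window, so the rows
    will only exist in NotificationHistory.

    Assumes the nightly delete task has NOT yet run today: data from exactly
    `retention_days` ago is therefore still in the Notification table.
    """
    return day < today - timedelta(days=retention_days)

# with a three day retention, data from three days ago is still expected in
# the Notification table; anything older has already been moved to history
assert not use_history_table(date(2019, 7, 1), 3, today=date(2019, 7, 4))
assert use_history_table(date(2019, 6, 30), 3, today=date(2019, 7, 4))
```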
it's not acceptable for a constantly failing provider to take 50 minutes
to drain (five reductions of priority by 10). But similarly, we need _some_
delay, or a handful of concurrent failures will completely turn off a
provider, rendering the whole exercise kinda pointless. Setting the
delay before it tries to reduce priority again to one minute is nice
because if one request times out and returns a 502, any other requests
that are in flight at that time will also time out before the minute is
up and won't trigger another reduction, but requests made after the
switch that take sixty seconds to time out will still count towards it.
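Roughly the shape of the cooldown check (a sketch, with an assumed
`last_reduced_at` timestamp rather than the real provider history lookup):

```python
from datetime import datetime, timedelta

REDUCTION_COOLDOWN = timedelta(minutes=1)

def can_reduce_priority_again(last_reduced_at, now=None):
    # requests already in flight when the provider first failed will time out
    # and report their 502s inside this window, so they don't stack up extra
    # reductions; only failures that start after the switch count again
    now = now or datetime.utcnow()
    return last_reduced_at is None or now - last_reduced_at > REDUCTION_COOLDOWN
```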
when ORM-level changes are made (eg `my_model.my_column = my_value`),
the ORM will read the column definition to see if it should apply any
defaults. The updated_at columns that we use all define
`onupdate=datetime.datetime.utcnow`. We can't patch this out, as the
column definition grabbed a pointer to the original function at import
time - so freezegun or `mocker.patch` won't work.
So we have to use the query syntax to set the `updated_at` timestamp in
the DB without going through the ORM layer.
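Roughly the shape of the workaround (a sketch - the model name here is
illustrative): a bulk query-level update skips the python-side `onupdate`
handler, unlike assigning to an attribute on an ORM instance.

```python
def backdate_updated_at(session, provider_id, timestamp):
    # Query.update() issues an UPDATE directly and does not evaluate the
    # column's onupdate default, so the timestamp we pass is what gets stored
    session.query(ProviderDetails).filter(
        ProviderDetails.id == provider_id
    ).update({'updated_at': timestamp})
    session.commit()
```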
make sure that we don't close the transaction early - we need to keep it
open because the with_for_update clause on the select is what locks the
table.
also make sure the tests clean up after themselves, as they're adding
history rows etc
it now only needs to be used when you're:
* updating providers in ways that will create history (eg through
regular api calls)
* altering more than just priority in test setup (eg setting inactive,
deleting, or adding a provider)
we randomly choose between sms providers now - this means that tests may
sometimes send firetext and sometimes mmg, so we'd need to patch out
different HTTP calls, expect different values in sent_by, etc etc.
To ensure tests are consistent, add a new fixture that is always used by
notify_db_session, which sets the priorities of the sms providers to
100% mmg, 0% firetext. If you need to test other values, set them
manually in the test file.
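Roughly the shape of that fixture (a sketch - the db session fixture and dao
helper names are illustrative; in the real conftest, notify_db_session
requests it so it is always applied):

```python
import pytest

@pytest.fixture
def sms_providers_mmg_only(db_session):
    # stand-ins for the real dao calls that write the priorities to the db
    set_provider_priority(db_session, 'mmg', priority=100)
    set_provider_priority(db_session, 'firetext', priority=0)
```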
the function no longer makes sense now that we send through both at
the same time. It's mostly just used in old tests that we'll end up
rewriting shortly anyway.
retrieve the sms providers from the DB, and decrease the chosen
provider's priority by 10, while increasing the other by 10.
add a check to ensure we never decrease below 0 or increase above 100
- this is per provider, we don't check that the two add up to 100 or
anything. If the values are outside of this range (eg set via the UI)
then they'll probably fix themselves at some point - we've added tests
to document these cases.
Use with_for_update to ensure that the method can only run once at a
time - other invocations of the function will be held on that line until
the currently running one ends and commits the transaction. This doesn't
affect anyone doing things from the UI.
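Roughly the shape of the adjustment (a sketch - model and helper names are
illustrative; the lock and the clamping are the important parts):

```python
def adjust_provider_priorities(session, failing_identifier, adjustment=10):
    # with_for_update locks the selected rows until the transaction commits,
    # so any concurrent invocation blocks on this select until we're done
    providers = (
        session.query(ProviderDetails)
        .filter(ProviderDetails.notification_type == 'sms')
        .with_for_update()
        .all()
    )
    failing = next(p for p in providers if p.identifier == failing_identifier)
    other = next(p for p in providers if p.identifier != failing_identifier)

    # clamp each provider independently; we don't require the two to sum to 100
    failing.priority = max(failing.priority - adjustment, 0)
    other.priority = min(other.priority + adjustment, 100)

    session.commit()
```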
The group by for the query was wrong, which could produce two rows with different totals but the same unique key, so the second row would overwrite the first. This meant we had incorrect numbers for the billing data.
Because some of the data had null in the sent_by column, the select would turn the null into 'dvla', but that same function was not used in the group by. So any time we had missing sent_by data we would end up with two rows where one would overwrite the other.
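Roughly the shape of the fix (a sketch - model names are illustrative):
whatever expression normalises sent_by in the select list has to appear in
the group by too, otherwise two rows can share the same key.

```python
from sqlalchemy import func

def billing_rows(session):
    # the same coalesced expression is used in both the select and the group by
    sent_by = func.coalesce(NotificationHistory.sent_by, 'dvla')
    return session.query(
        NotificationHistory.service_id,
        sent_by.label('sent_by'),
        func.sum(NotificationHistory.billable_units).label('billable_units'),
    ).group_by(
        NotificationHistory.service_id,
        sent_by,  # grouping by the raw column was what produced duplicate keys
    )
```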
This is only necessary because there is currently a job that is old, but had 1 row created a couple of days later. So now there is 1 notification for the job where the rest have been purged.
Sometimes a job finishes but has missed a row in the middle. It's a mystery why this happens - it could be that the task to save the notifications was dropped.
So until we solve the underlying problem, let's find the missing rows and process them.
A new scheduled task has been added to find any "finished" jobs that do not have enough notifications created. If notifications are missing, the task processes those rows for the job.
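Roughly the check the task performs for a finished job (a sketch - column
names are illustrative):

```python
def find_missing_row_numbers(session, job):
    # compare the rows we actually saved against the row count from the csv
    existing = {
        row_number
        for (row_number,) in session.query(Notification.job_row_number)
        .filter(Notification.job_id == job.id)
    }
    return [i for i in range(job.notification_count) if i not in existing]
```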
Adding the new task to the beat schedule will be done in the next commit.
A unique key constraint has been added to Notifications to ensure that the row is not added twice. Any index or constraint can affect performance, but this unique constraint should not affect it enough for us to notice.
Since Pytest 5, `ExceptionInfo` objects (returned by `pytest.raises`) now
have the same `str` representation as `repr`. This means that `str(e)`
now needs to be changed to `str(e.value)`.
https://github.com/pytest-dev/pytest/issues/5412
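eg a test that previously asserted on `str(e)`:

```python
import pytest

def test_raises_with_message():
    with pytest.raises(ValueError) as e:
        raise ValueError('boom')

    # str(e) is now the same as repr(e), so assert on the wrapped exception
    assert str(e.value) == 'boom'
```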
The nightly job to delete email notifications was failing because it was
timing out (`psycopg2.errors.QueryCanceled: canceling statement due to statement timeout`).
This adds a limit to the query that inserts or updates notification
history so that it only updates a maximum of 10000 rows at a time.
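Roughly the batching idea (a sketch - names are illustrative and the real
query that moves rows into notification history is more involved):

```python
BATCH_SIZE = 10000

def move_notifications_to_history(session, service_id, older_than):
    while True:
        ids = [
            notification_id
            for (notification_id,) in session.query(Notification.id)
            .filter(
                Notification.service_id == service_id,
                Notification.created_at < older_than,
            )
            .limit(BATCH_SIZE)
        ]
        if not ids:
            break
        # insert/update the NotificationHistory rows for `ids`, then delete
        # the originals, keeping each pass under the statement timeout
        copy_to_history_and_delete(session, ids)  # stand-in for the real step
        session.commit()
```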
All our endpoints should check that the params are valid - this is an easy way to do that and is standard for our endpoints.
I reverted the query to just filter by job id.
When a service asks for branding I often go and:
- set the branding on the organisation as well
- set the branding on all the organisation’s services
The latter can be quite time consuming, but it does save effort since
existing services from the same organisation won’t have to request
branding. It also improves the consistency of the communications that
users are receiving.
This commit automates that process by applying the branding update to
any services belonging to the organisation, if those services don’t
already have their own custom branding set up.
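Roughly the shape of that cascade (a sketch - attribute names are
illustrative):

```python
def apply_branding_to_organisation(organisation, branding):
    organisation.email_branding = branding
    for service in organisation.services:
        # leave services that have deliberately chosen their own branding alone
        if service.email_branding is None:
            service.email_branding = branding
```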
Now we consistently use the created_at date, so we can always get the right file location and name.
The previous updates to this code were trying to solve the problem of a pdf being created at 17:29 but not ready to upload until 17:31, after the antivirus and validation checks.
But in those cases we would have trouble finding the file.
When we cancel a job, we need to check if all notifications are
already in the database. So far, we were querying for all
notification objects in the database and counting them in the
admin app, which runs into pagination problems for large jobs
and could time out for very large ones.
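Roughly the replacement (a sketch - model names are illustrative): count in
the database instead of paging objects back to the admin app.

```python
from sqlalchemy import func

def count_notifications_for_job(session, job_id):
    # a single aggregate query, rather than fetching every notification object
    return (
        session.query(func.count(Notification.id))
        .filter(Notification.job_id == job_id)
        .scalar()
    )
```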
Code that is within a `with pytest.raises(...)` context manager but
comes after the line that raises the exception doesn't get evaluated.
We had some assertions that were never being tested because of this, so
this ensures that they will always get run and fixes them where
necessary.
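For example (the raising call here is a stand-in):

```python
import pytest

def test_rejects_bad_input():
    with pytest.raises(ValueError) as exc_info:
        do_something_that_raises()
        # nothing after the raising call runs: execution leaves the block as
        # soon as the exception is raised, so an assert here is never tested

    # assertions belong after the context manager, where they always run
    assert 'expected message' in str(exc_info.value)
```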
bst_date is a date field. Comparing dates with datetimes in postgres
gets confusing and dangerous. See this example, where a date evaluates
as older than midnight that same day.
```
notification_api=# select '2019-04-01' >= '2019-04-01 00:00';
?column?
----------
f
(1 row)
```
By only using dates everywhere, we reduce the chance of these bugs
happening.
Previously we were doing it based on their email address. This will also
apply it if they self-select as a GP surgery, even if they don’t have an
NHS email address.
This is the second commit in the series to add organisation_id to Service.
- Data migration to update services.organisation_id from data in organisation_to_service
(The rollback will lose any updates to organisation unless the script is updated to set organisation_to_service from service.organisation_id)
- Update Service.organisation relationship to a ForeignKey relationship to Organisation.
- Update Organisation.services to a backref relationship to Service.
- Add organisation_id to the Service data model (sketched below).
- Update the methods that create a service and associate a service with an organisation, so they set organisation_id on the Service.
- Create the missing test: if the service user's email matches a domain for an organisation, associate the service with the organisation and inherit crown and organisation_type from the organisation.
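Roughly the resulting model relationship (a sketch - the column and
relationship names follow the list above, other details are illustrative):

```python
from sqlalchemy.dialects.postgresql import UUID

from app import db  # the flask-sqlalchemy instance (assumed import path)

class Service(db.Model):
    __tablename__ = 'services'

    organisation_id = db.Column(
        UUID(as_uuid=True),
        db.ForeignKey('organisation.id'),
        index=True,
        nullable=True,
    )
    # backref gives Organisation.services without the old mapping table
    organisation = db.relationship('Organisation', backref='services')
```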
For example if a file is uploaded on Aug 1, but scheduled for Aug 3, the csv file will be deleted 2 days before the notifications.
This will cause an error on the jobs pages if the download report link is clicked.