Commit Graph

7181 Commits

Author SHA1 Message Date
Chris Hill-Scott
625aad97c9 Limit search by recipient to 50 results
Things could get ugly if you use a short search string on a service with
lots of notifications…
2020-05-06 11:44:35 +01:00
Chris Hill-Scott
9f6bfb1b4e Merge pull request #2824 from alphagov/stop-searching-to-field
Stop searching the to field
2020-05-05 16:20:17 +01:00
Leo Hemsted
7e2381e719 Merge pull request #2809 from alphagov/bump-utils-38.0.0
Bump utils to 39.0.0
2020-05-05 13:08:28 +01:00
Leo Hemsted
6f48c05bd3 Merge pull request #2834 from alphagov/fix-email-auth-timeout
restore email sign in code expiry functionality
2020-05-05 13:02:10 +01:00
Leo Hemsted
2b7e05d4e4 restore email sign in code expiry functionality
reverts 789112a31f

however, keeps the changes to the tests as they were an improvement
2020-05-05 12:00:36 +01:00
Leo Hemsted
f67625a9fa Merge pull request #2831 from alphagov/email-expiry
don't expire email sign in codes on use
2020-05-04 12:15:09 +01:00
Leo Hemsted
789112a31f don't expire email sign in codes on use
we're seeing issues with email clients sniffing links, and causing them
to expire before the user gets a chance to click on them. Temporarily
disable the expiry while we work on a more permanent solution.

The link will still expire after half an hour, and sms codes aren't
affected by this change
2020-05-04 12:01:57 +01:00
Katie Smith
a828787514 Merge pull request #2827 from alphagov/pending-notis-in-report
Include pending notifications in monthly data by service report
2020-05-04 09:02:05 +01:00
Chris Hill-Scott
85fc601886 Tell template preview to allow international letters
If a service has permission to send international letters then it should
tell template preview, so that template preview knows what rule to
apply when it’s validating the address of the letter.

Depends on:
- [ ] https://github.com/alphagov/notifications-template-preview/pull/445
2020-05-01 14:37:24 +01:00
Chris Hill-Scott
0c1373eeb5 Bump utils to 39.0.0
`allow_international_letters` is a new, required argument, so the tests
that make some `Row` objects need to provide that.

There are now 8 possible address columns (7 plus postcode) so the tests
need to expect that. But this won’t have any user-facing impact.
2020-05-01 14:37:24 +01:00
Chris Hill-Scott
ba0d330593 Allow countries in last line of addresses
For services that have permission to send international letters we
should not reject letters that are addressed to another country. We
should still reject letters that are badly-addressed.
2020-05-01 14:37:24 +01:00
Chris Hill-Scott
e366ad29ae Bump utils to 38.0.0
Brings in breaking change to how the `RecipientCSV` class should be
instantiated.
2020-05-01 14:37:23 +01:00
Rebecca Law
16734595e7 Merge pull request #2828 from alphagov/fix-date-for-inbound-sms
Fix provider date for inbound SMS
2020-05-01 09:11:57 +01:00
Rebecca Law
b8283c31d4 When we get an inbound message from MMG, the function format_mmg_datetime was converting the date to UTC, however, the provider date is already in UTC format. 2020-04-30 14:19:08 +01:00
Katie Smith
5b8b0dfc6e Include pending notifications in monthly data by service report
The '/service/monthly-data-by-service` endpoint which is used for the
'Monthly notification statuses for live services' Platform Admin report
did not including `pending` notifications. This updates the DAO function
that the endpoint calls to group `pending` and `sending` notifications together.
2020-04-30 13:53:01 +01:00
Chris Hill-Scott
1e20cb6be9 Stop searching the to field
We were doing this temporarily while the `normalised_to` field was not
populated for letters. Once we have a week of data in the
`normalised_to` field we can stop looking in the `to` field. This should
improve performance because:
- it’s one `WHERE` clause not two
- we had to do `ilike` on the `to` field, because we don’t lowercase its
  contents – even if the two where clauses are somehow paralleized it’s
  is slower than a `like` comparison, which is case-sensitive

Depends on:
- [ ] Tuesday 5 May (7 full days after deploying https://github.com/alphagov/notifications-api/pull/2814)
2020-04-30 10:51:30 +01:00
David McDonald
1d3f9589ea Merge pull request #2821 from alphagov/celery-settings-for-reporting
Reduce concurrency and optimise prefetching behaviour for the celery reporting app
2020-04-28 11:13:21 +01:00
David McDonald
466c920e60 Merge pull request #2822 from alphagov/extra-logging-reporting-tasks
Add in lots of logging to our reporting tasks
2020-04-28 11:00:48 +01:00
David McDonald
a237162106 Reduce concurrency and prefetch count of reporting celery app
We have seen the reporting app run out of memory multiple times when
dealing with overnight tasks. The app runs 11 worker threads and we
reduce this to 2 worker threads to put less pressure on a single
instance.

The number 2 was chosen as most of the tasks processed by the reporting
app only take a few minutes and only one or two usually take more than
an hour. This would mean with 2 processes across our current 2
instances, a long running task should hopefully only wait behind a few
short running tasks before being picked up and therefore we shouldn't
see large increase in overall time taken to run all our overnight
reporting tasks.

On top of reducing the concurrency for the reporting app, we also set
CELERYD_PREFETCH_MULTIPLIER=1. We do this as suggested by the celery
docs because this app deals with long running tasks.
https://docs.celeryproject.org/en/3.1/userguide/optimizing.html#optimizing-prefetch-limit

The chance in prefetch multiplier should again optimise the overall time
it takes to process our tasks by ensuring that tasks are given to
instances that have (or will soon have) spare workers to deal with them,
rather than committing to putting all the tasks on certain workers in
advance.

Note, another suggestion for improving suggested by the docs for
optimising is to start setting `ACKS_LATE` on the long running tasks.
This setting would effectively change us from prefetching 1 task per
worker to prefetching 0 tasks per worker and further optimise how we
distribute our tasks across instances. However, we decided not to try
this setting as we weren't sure whether it would conflict with our
visibility_timeout. We decided not to spend the time investigating but
it may be worth revisiting in the future, as long as tasks are
idempotent.

Overall, this commit takes us from potentially having all 18 of our
reporting tasks get fetched onto a single instance to now having a
process that will ensure tasks are distributed more fairly across
instances based on when they have available workers to process the
tasks.
2020-04-28 10:47:46 +01:00
David McDonald
5d88a1dbf4 Add -Ofair setting to reporting celery app
What this setting does is best described in
https://medium.com/@taylorhughes/three-quick-tips-from-two-years-with-celery-c05ff9d7f9eb#d7ec

This should be useful for the reporting app because tasks run by this
app are long running (many seconds). Ideally this code change will
mean that we are quicker to process the overnight reporting tasks,
so they all finish earlier in the morning (although are not individually quicker).

This is only being set on the reporting celery app because this
change trying to do the minimum possible to improve the reliability and speed
of our overnight reporting tasks. It may very well be useful to set this
flag on all our apps, but this should be done with some more
consideration as some of them will deal with much faster tasks (sub
0.5s) and so it may be still be appropriate or may not. Proper
investigation would be needed.

Note, the celery docs on this are also worth a read:
https://docs.celeryproject.org/en/3.1/userguide/optimizing.html#optimizing-prefetch-limit.
However, the language can confuse this with setting with the prefetch
limit. The distinction is that prefetch grabs items off the queue, whereas the
-Ofair behaviour is to do with when items have already been prefetched
and then whether the master celery process straight away gives them to
the child (worker) processes or not.

Note, this behaviour is default for celery version 4 and above but we
are still on version 3.1.26 so we have to enable it ourselves.
2020-04-28 10:42:42 +01:00
David McDonald
7a9c756117 Add in lots of logging to our reporting tasks
We've seen only some of these reporting tasks happen but with no log
messages to indicate what happened and no app crashes. This hopefully
will give us a better picture of a timeline.

Note, I've tried to make our message format very consistent and good for
searching for in kibana so I've changed that across this whole file for
consistency.
2020-04-28 10:38:53 +01:00
Chris Hill-Scott
26793899d4 Merge pull request #2814 from alphagov/search-letters
Let users search for letters
2020-04-27 15:06:32 +01:00
Leo Hemsted
bd345d0390 Merge pull request #2818 from alphagov/restart-periodic-worker
restart reporting worker after each task
2020-04-27 11:03:07 +01:00
Leo Hemsted
1b8c683194 Merge pull request #2816 from alphagov/autoscaling-fixes
make create_letters_pdf always use right queue
2020-04-27 11:03:00 +01:00
David McDonald
985dddcfdc Merge pull request #2820 from alphagov/six-fragments
Bring in utils to bump max fragments to 6 for SMS
2020-04-24 16:11:58 +01:00
David McDonald
44155d4e7c Bring in utils to bump max fragments to 6 for SMS 2020-04-24 16:01:59 +01:00
Pea M. Tyczynska
6724b2cd74 Merge pull request #2817 from alphagov/update_healthcheck_page
Update _status page - remove references to Travis
2020-04-24 14:09:44 +01:00
Pea Tyczynska
260fd7940d Update healthcheck page - remove travis references
Also remove travis references from the repository
2020-04-24 13:43:00 +01:00
Leo Hemsted
ad419f7592 restart reporting worker after each task
The reporting worker tasks fetch large amounts of data from the db, do
some processing then store back in the database. As the reporting worker
only processes the create nightly billing/stats table tasks, which
aren't high performance or high volume, we're fine with the performance
hit from restarting the worker between every task (which based on
limited local testing takes about a second or so).

This causes some real funky shit with the app_context (used for
accessing current_app.logger). To access flask's global state we use the
standard way of importing `from flask import current_app`. However,
processes after the first one don't have the current_app available on
shut down (they're fine during the actual task running), and are unable
to call `with current_app.app_context()` to create it. They _are_ able
to call `with app.app_context()` to create it, where `app` is the
initial app that we pass in to `NotifyCelery.init_app`.

NotifyCelery.init_app is only called once, in the master process - I
think the application state is then stored and passed to the celery
workers. But then it looks like the teardown might clear it, but it
never gets set up again for the new workers? Unsure.

To fix this, store a copy of the initial flask app on the NotifyCelery
object and then use that from within the shutdown signal logging
function.

Nothing's ever easy ¯\_(ツ)_/¯
2020-04-24 12:28:25 +01:00
Chris Hill-Scott
64674be817 Make allowed notification types explicit in search
By not having a catch-all else, it makes it clearer what we’re
expecting. And then we think it’s worth adding a comment explaining why
we normalise as we do for letters and the `None` case.
2020-04-23 16:06:34 +01:00
Chris Hill-Scott
11008af96a Remove outdated comment 2020-04-23 15:35:25 +01:00
Leo Hemsted
d59b2f4cc9 make create_letters_pdf always use right queue
create-letters-pdf-tasks is for contacting template preview.
letters-tasks is for doing other things around letters (including putting things on template preview queue)
2020-04-22 16:07:51 +01:00
Chris Hill-Scott
9047b273da Save whole letter address into the to field
At the moment we’re not consistent:

Precompiled (API and one-off):
`to` has the whole address
`normalised_to` has nothing

Templated (API, CSV and one off):
`to` has the first line of the address
`normalised_to` has nothing

This commit makes us consistently store the whole address in the `to`
field. We think that people might want to search by postcode, not just
first line of the address.

This commit also starts to populate the normalised_to field with the
address lowercased and with all spaces removed, to make it easier to
search on.
2020-04-22 10:06:25 +01:00
Chris Hill-Scott
7897ae70ce Let users search for letters
Like we have search by email address or phone number, finding an
individual letter is a common task. At the moment users are having to
click through pages and pages of letters to find the one they’re looking
for.

We have to search in the `to` and `normalised_to` fields for now because
we’re not populating the `normalised_to` column for letters at the
moment.
2020-04-21 16:25:37 +01:00
Pea M. Tyczynska
d88b20beec Merge pull request #2813 from alphagov/firetext-check-code-on-failure-only
Check failure code on failure only.
2020-04-21 15:23:26 +01:00
Pea Tyczynska
9782cbc5b5 Check failure code on failure only.
We should check error code on failure only, because if we get an
error code on pending code, we should still set the notification
to pending status.
2020-04-21 13:43:45 +01:00
Pea M. Tyczynska
750a573afc Merge pull request #2812 from alphagov/firetext-000-code
Take into consideration Firetext 000 code - no error.
2020-04-21 12:21:36 +01:00
Pea Tyczynska
385d249787 Take into consideration Firetext 000 code - no error. Also change logs from error to warning 2020-04-21 12:06:28 +01:00
David McDonald
5011b5b8f5 Merge pull request #2811 from alphagov/request-time-monitoring
Add GDSMetrics package
2020-04-21 11:20:43 +01:00
Pea M. Tyczynska
850a56ab04 Merge pull request #2803 from alphagov/firetext-response-codes
Use firetext response code to see if temporary or permanent failure if available
2020-04-21 10:50:46 +01:00
David McDonald
1ff52bbaad Add GDSMetrics package
As per instructions https://github.com/alphagov/gds_metrics_python

The celery workers don't have an HTTP endpoint so no point in trying to
get prometheus to scrape them.
2020-04-20 18:39:45 +01:00
Pea Tyczynska
b962dcac5d Check Firetext code first to determine status and log the reason
If there is no code, fall back to old way of determining status.
If the code is not recognised, log an Error so we are notified
and can look into it.
2020-04-20 18:33:19 +01:00
Pea M. Tyczynska
69864611ef Merge pull request #2810 from alphagov/fix_preview
Fix bug where preview for templated letters would not show
2020-04-20 15:49:36 +01:00
Pea Tyczynska
7a89b1b835 Fix bug where preview for templated letters would not show 2020-04-20 15:35:37 +01:00
Katie Smith
c0343324c4 Merge pull request #2808 from alphagov/reset-providers
Restore provider resting points
2020-04-20 10:56:44 +01:00
Katie Smith
92c979fcdb Restore provider resting points
We temporarily updated the provider resting points in
https://github.com/alphagov/notifications-api/pull/2804.
This puts the provider resting points back to their original value now
that both providers seem to be functioning well.
2020-04-20 10:45:19 +01:00
Pea M. Tyczynska
d15db5702c Merge pull request #2807 from alphagov/make-sure-letter-personalisation-is-string
Make sure letter personalisation is string - pull utils patch
2020-04-17 18:29:17 +01:00
Pea Tyczynska
f27e544911 Make sure letter personalisation is string - pull utils patch 2020-04-17 13:42:57 +01:00
Chris Hill-Scott
47b5ae9dc1 Merge pull request #2801 from alphagov/use-new-utils-template-properties
Use new properties of utils Templates
2020-04-17 13:41:45 +01:00
Pea Tyczynska
fef03920f0 Add test for a callback with a code 2020-04-17 10:58:26 +01:00