Commit Graph

330 Commits

Author SHA1 Message Date
David McDonald
52d3df49d4 Make ADMIN_CLIENT_SECRET a list of a single secret
And support this change across our code. Note, this is a halfway step
where it is not a list rather than a string but still only supports a
single secret, ie one item in the list.
2020-02-20 13:43:10 +00:00
David McDonald
a14d5f0225 Remove task that no longer runs
We no longer puts files in these s3 buckets (and have in fact deleted
the buckets) therefore this task is redundant and can be removed.
2020-02-06 10:57:43 +00:00
Rebecca Law
f4c0f70ba9 Send the alert for letters-still-sending an hour earlier.
These alerts are sent to our postal provider. And it usually arrives as they are getting ready to go home for the day or the weekend.
Which means they get missed/overlooked. They have agreed to get the alert an hour earlier, perhaps that will improved the response time.
2020-01-13 10:42:30 +00:00
Katie Smith
8f144be29c Add config for new template preview task
Added the queue and task names for the new template preview task to the
config. Also added the new bucket name that template preview will use
for the sanitised letters to the config for all environments.
2019-12-16 11:30:56 +00:00
Leo Hemsted
31d1abd6d1 add task to move sms providers back towards shared load
we generally aim to share the load between the two providers equally
(more or less). When one provider has struggled, we deprioritise them,
this commit adds a function that gradually restores balance. It checks
every five minutes, if it's been more than an hour since the providers
were last changed then it adjusts them towards a 50/50 split. Except
it's not quite 50/50 due to #reasons (we want to slightly favour MMG),
it's actually 60/40. That's defined in a new dict in config.py.
2019-12-13 10:02:39 +00:00
Pea M. Tyczynska
2019070536 Merge pull request #2667 from alphagov/warn-team-about-high-failure-rates
Warn team about high failure rates
2019-12-09 11:28:25 +00:00
Pea Tyczynska
d72ab4f4a6 Send zendesk ticket when services found with high failure rates 2019-12-06 16:57:04 +00:00
Leo Hemsted
4701e5d9af don't define MMG_URL and FIRETEXT_URL in manifest
these URLs never change, and it lead to surprising issues where an
updated default MMG_URL wasn't actually respected on PaaS. These urls
aren't private and don't need to be stored in credentials.

By not defining them in the manifest, we expect them to use the default
unless `cf set-env` has been specifically used to modify them in an app.
2019-12-04 15:26:49 +00:00
Leo Hemsted
6b9afa358f update utils to bring in full welsh diacritics range
note: this includes updating the MMG api url to their v2a api. Their
previous API doesn't include support for capital o with grave accent
(Ò)
2019-11-28 15:12:52 +00:00
Rebecca Law
9def176e7a Fix typo in config 2019-11-06 10:56:37 +00:00
Rebecca Law
74546a265e Added cron for check_for_missing_rows_in_completed_jobs
Run the task every 10 minutes. Does that seem reasonable? Maybe that is too often.
2019-11-06 10:49:46 +00:00
Leo Hemsted
e094dd4bfd remove loadtesting from providers
we don't use it since we wrote our own provider stubs for performance
tests.

this removes it from the api - it's still in the DB and will be
retrieved by queries, but is set to disabled on prod
2019-10-23 11:45:07 +01:00
Katie Smith
a241fe4a29 Add transient uploaded letters bucket to config 2019-09-12 09:56:10 +01:00
Leo Hemsted
1d9fd775d3 move delete tasks to 4am
just to make sure they definitely run after the create tasks
2019-08-21 11:15:49 +01:00
Leo Hemsted
8f13697cf1 Revert "trigger nightly delete tasks from the create notification status task"
This reverts commit 58f24a0a83.
2019-08-19 16:06:25 +01:00
Leo Hemsted
92d78956be Merge pull request #2592 from alphagov/reporting-worker
Add reporting worker
2019-08-15 17:22:27 +01:00
Leo Hemsted
3a0bf2b23e Add reporting worker
also remove references to unused statistics queue
2019-08-15 16:42:15 +01:00
Leo Hemsted
58f24a0a83 trigger nightly delete tasks from the create notification status task
the nightly tasks need to run after the create nightly notification
status task - so that test notifications are still there to record
stats for, and to stop the risk of deleting notificaitons part-way
through recording stats for them.
2019-08-14 18:04:45 +01:00
Leo Hemsted
7b8028d03f fix typo in config
was a `,`, not a `:`, so 'options' was a set rather than a dictionary.
2019-08-13 15:19:28 +01:00
Leo Hemsted
2b06e810c5 Lower the max dvla zip size from 500mb to 40mb
There's a bug in pysftp that appears to cause quadratic performance loss. See https://github.com/paramiko/paramiko/issues/1141 for more details.

As a temporary band-aid fix, lower the size of the files we're sending.
2019-07-29 17:23:22 +01:00
Katie Smith
cec87a9de0 Delete unused code
* The `_should_record_notification_in_history_table` function stopped being
used in this commit: c23ae15f32
* `NOTIFICATIONS_ALERT` stopped being used in this commit: 5aa37f09b6
2019-07-12 16:43:37 +01:00
Leo Hemsted
07bb0f0332 send emails when MOU is signed
we build up one personalisation dict, and then pass it in to all the
different templates - so be careful editing things. also of note, we
check if the agreement_signed_on_behalf_of is set, and send a different
template with slightly different wording to the person who clicked the
confirm button.
2019-07-12 15:08:55 +01:00
Katie Smith
c518f6ca76 Add scheduled task to find old letters which still have 'created' status
Added a scheduled task to run once a day and check if there were any
letters from before 17.30 that still have a status of 'created'. This
logs an exception instead of trying to fix the error because the fix
will be different depending on which bucket the letter is in.
2019-06-18 10:58:58 +01:00
Katie Smith
a2f324ad7e Add scheduled task to find precompiled letters in wrong state
Added a task which runs twice a day on weekdays and checks for letters that have
been in the state of `pending-virus-check` for over 90 minutes. This is
just logging an exception for now, not trying to fix things, since we
will need to manually check where the issue was.
2019-06-18 10:58:58 +01:00
Pea Tyczynska
5f1f688c7b Create template to verify service email reply-to addresses
So that template with the same ID is present on all environments
2019-05-28 15:14:09 +01:00
Pea Tyczynska
615ea6a98a Send verifcation email to a new reply-to email address 2019-05-23 15:36:09 +01:00
Alexey Bezhan
0138eb0cae Make statsd host configurable with an env variable
Setting `STATSD_HOST` for an env variable allows us to switch to a
local statsd_exporter on a per-app basis.

This also changes `STATSD_ENABLED` to be on when `STATSD_HOST` is set,
avoiding the need to set it separately.
2019-04-24 13:50:13 +01:00
Alexey Bezhan
330afab5e2 Make Firetext URL configurable through the application environment
Similar to MMG, there's a new env variable FIRETEXT_URL that can be
used to override the Firetext api URL.

This will be used to stub out both providers during the load test or
can be used to run a local API against a fake provider endpoint.
2019-04-12 12:03:58 +01:00
Rebecca Law
be9daf3454 Until we can fix it properly, changing the max number of files to 500. Hopefully the task will finish in less than 5 minutes. 2019-03-18 13:13:05 +00:00
Rebecca Law
94e0b8b4eb Reduce the number of files sent to the zip-and-send-letter-pdfs
The ftp application is struggling, running out of CPU. This is attempt to help with that.
2019-03-15 15:46:09 +00:00
Leo Hemsted
26243cd2b0 Merge pull request #2377 from alphagov/skip-antivirus
stub out antivirus in dev
2019-02-27 13:38:01 +00:00
Leo Hemsted
653f1ab6b9 stub out antivirus in dev
antivirus is sometimes tough to get running locally - now in dev
antivirus is skipped unless `ANTIVIRUS_ENABLED=1` is set on the command
line. on all other environments it is always enabled.
2019-02-27 10:59:31 +00:00
Pea Tyczynska
211d3741ba Send confirmation emails to users when team manager edits their
email address  or mobile number.
2019-02-26 16:30:29 +00:00
Leo Hemsted
754c65a6a2 create cronitor decorator that alerts if tasks fail
make a decorator that pings cronitor before and after each task run.
Designed for use with nightly tasks, so we have visibility if they
fail. We have a bunch of cronitor monitors set up - 5 character keys
that go into a URL that we then make a GET to with a self-explanatory
url path (run/fail/complete).

the cronitor URLs are defined in the credentials repo as a dictionary
of celery task names to URL slugs. If the name passed in to the
decorator  isn't in that dict, it won't run.

to use it, all you need to do is call `@cronitor(my_task_name)`
instead of `@notify_celery.task`, and make sure that the task name and
the matching slug are included in the credentials repo (or locally,
json dumped and stored in the CRONITOR_KEYS environment variable)
2019-01-18 15:36:53 +00:00
Leo Hemsted
d3d56a3224 separate nightly tasks and other scheduled tasks.
other tasks is anything that is run on a different frequency than
nightly
2019-01-18 15:36:53 +00:00
Rebecca Law
efad58edd8 There is no need to have a separate table to store template monthly statistics. It's easy enough to aggregate the stats from ft_notification_status.
This removes the nightly task, and all the dao methods.
The next PR will remove the table.
2019-01-14 16:30:36 +00:00
Alexey Bezhan
4a26ee1813 Set statement timeout on all DB connections
A recent issue with a long-running query (#2288) highlighted the
fact that even though the original HTTP connection might be closed
(for example after gorouter timeout of 15 minutes, which returns a
504 response to the client), the request worker will not be stopped.

This means that the worker is spending time and potentially DB
resources generating a response that will never be delivered.

Gunicorn's timeout setting only applies to sync workers and there
doesn't seem to be an option to interrupt individual requests in
gevent/eventlet deployments.

Since the most likely (and potentially most dangerous) scenario for
this is a long-running DB query, we can set a statement timeout on
our DB connections. This will raise a sqlalchemy.exc.OperationalError
(wrapping psycopg2.extensions.QueryCanceledError), interrupting the
request after the given timeout has been reached.

This is a Postgres client setting, so the database itself will abort
the transaction when it reaches the set timeout.

Since this will also apply to our celery tasks (including potentially
long-running nightly tasks) we set a timeout of 20 minutes to begin
with.

This can potentially be split in the future to set a different value
for each app, so that we could limit API requests even more.
2019-01-09 14:36:50 +00:00
Pea (Malgorzata Tyczynska)
d7fcd564e0 Merge pull request #2250 from alphagov/switch_providers_update
Update switch providers on slow delivery method and query
2018-12-11 10:27:29 +00:00
Pea Tyczynska
5ed7564066 Remove unused config variables
We don't use FUNCTIONAL_TEST_PROVIDER_SERVICE_ID or
UNCTIONAL_TEST_PROVIDER_SMS_TEMPLATE_ID anymore so we can safely
delete them from config and tests.
2018-12-10 17:25:53 +00:00
Alexey Bezhan
1d4e4eae94 Disable unused SQLAlchemy configuration flags
We don't seem to use recorded queries or modification tracking anywhere
in the app, and both features potentially increase memory usage.

This removes deprecated SQLALCHEMY_COMMIT_ON_TEARDOWN options. It's
been removed from the docs and the default matches the value we set
anyway.
2018-12-04 12:04:18 +00:00
Pea Tyczynska
e5fd027192 Move nightly tasks before introduction of archived flag on jobs 2018-11-28 14:38:59 +00:00
Katie Smith
ff06d120e8 Bump notifications-utils to 3.7.0
Bumped notifications-utils to 3.7.0. Version 3.7.0 includes the
`convert_utc_to_bst` and `convert_bst_to_utc` functions and the
`LETTER_PROCESSING_DEADLINE` constant, so these have been removed from
this repo and anywhere using these has now been updated to get these
from `notifications-utils`.

Also bumped pytest by a patch version to bring in a bug fix.
2018-11-26 12:53:39 +00:00
Alexey Bezhan
1b1b443c69 Rename staging CSV uploads bucket to match other environments 2018-11-21 14:07:04 +00:00
Rebecca Law
7a16ac35bd Remove letter-jobs api
When we first built letters you could only send them via a CSV upload, initially we needed a way to send those files to dvla per job.
We since stopped using this page. So let's delete it!
2018-11-15 17:24:37 +00:00
Pea Tyczynska
987445f1bf ft_notification_status now updates data for 4 days back
This was done so when notification is timed out from sending/pending
to temporary_failure, this change has to always be caught
in the ft_notification_status
2018-11-08 11:52:40 +00:00
Pea Tyczynska
992ef259b2 Rearrange nightly tasks to maintain the correct order
Timeout sending notifications updates up to 4 days of notifications
older than 72 hours to correct failure status. It needs to run before
we update ft_notifications_status table. Otherwise the changes
don't get picked up. Notifications deletion tasks have to run
after those jobs in case our users set short data retention
policy.
2018-11-07 16:43:28 +00:00
Leo Hemsted
162ed0ba5f remove dupes from cfg 2018-10-17 15:29:39 +01:00
Pea Tyczynska
e22e7245fe Use sanitised pdfs for sending and handle invalid pdfs, details below:
- pass new, sanitised pdf for sending
- move invalid pdfs to a newly created bucket
- set status fro notifications that failed pdf validation to a new status validation-failed
- adjust existing tests
2018-10-16 17:30:35 +01:00
Leo Hemsted
2ed50e760f Revert "Celery 4" 2018-10-09 13:27:49 +01:00
Leo Hemsted
e0551547d2 wait 20 seconds between long polls 2018-10-03 14:11:30 +01:00