Commit Graph

318 Commits

Author SHA1 Message Date
Katie Smith
a241fe4a29 Add transient uploaded letters bucket to config 2019-09-12 09:56:10 +01:00
Leo Hemsted
1d9fd775d3 move delete tasks to 4am
just to make sure they definitely run after the create tasks
2019-08-21 11:15:49 +01:00
Leo Hemsted
8f13697cf1 Revert "trigger nightly delete tasks from the create notification status task"
This reverts commit 58f24a0a83.
2019-08-19 16:06:25 +01:00
Leo Hemsted
92d78956be Merge pull request #2592 from alphagov/reporting-worker
Add reporting worker
2019-08-15 17:22:27 +01:00
Leo Hemsted
3a0bf2b23e Add reporting worker
also remove references to unused statistics queue
2019-08-15 16:42:15 +01:00
Leo Hemsted
58f24a0a83 trigger nightly delete tasks from the create notification status task
the nightly tasks need to run after the create nightly notification
status task - so that test notifications are still there to record
stats for, and to stop the risk of deleting notificaitons part-way
through recording stats for them.
2019-08-14 18:04:45 +01:00
Leo Hemsted
7b8028d03f fix typo in config
was a `,`, not a `:`, so 'options' was a set rather than a dictionary.
2019-08-13 15:19:28 +01:00
Leo Hemsted
2b06e810c5 Lower the max dvla zip size from 500mb to 40mb
There's a bug in pysftp that appears to cause quadratic performance loss. See https://github.com/paramiko/paramiko/issues/1141 for more details.

As a temporary band-aid fix, lower the size of the files we're sending.
2019-07-29 17:23:22 +01:00
Katie Smith
cec87a9de0 Delete unused code
* The `_should_record_notification_in_history_table` function stopped being
used in this commit: c23ae15f32
* `NOTIFICATIONS_ALERT` stopped being used in this commit: 5aa37f09b6
2019-07-12 16:43:37 +01:00
Leo Hemsted
07bb0f0332 send emails when MOU is signed
we build up one personalisation dict, and then pass it in to all the
different templates - so be careful editing things. also of note, we
check if the agreement_signed_on_behalf_of is set, and send a different
template with slightly different wording to the person who clicked the
confirm button.
2019-07-12 15:08:55 +01:00
Katie Smith
c518f6ca76 Add scheduled task to find old letters which still have 'created' status
Added a scheduled task to run once a day and check if there were any
letters from before 17.30 that still have a status of 'created'. This
logs an exception instead of trying to fix the error because the fix
will be different depending on which bucket the letter is in.
2019-06-18 10:58:58 +01:00
Katie Smith
a2f324ad7e Add scheduled task to find precompiled letters in wrong state
Added a task which runs twice a day on weekdays and checks for letters that have
been in the state of `pending-virus-check` for over 90 minutes. This is
just logging an exception for now, not trying to fix things, since we
will need to manually check where the issue was.
2019-06-18 10:58:58 +01:00
Pea Tyczynska
5f1f688c7b Create template to verify service email reply-to addresses
So that template with the same ID is present on all environments
2019-05-28 15:14:09 +01:00
Pea Tyczynska
615ea6a98a Send verifcation email to a new reply-to email address 2019-05-23 15:36:09 +01:00
Alexey Bezhan
0138eb0cae Make statsd host configurable with an env variable
Setting `STATSD_HOST` for an env variable allows us to switch to a
local statsd_exporter on a per-app basis.

This also changes `STATSD_ENABLED` to be on when `STATSD_HOST` is set,
avoiding the need to set it separately.
2019-04-24 13:50:13 +01:00
Alexey Bezhan
330afab5e2 Make Firetext URL configurable through the application environment
Similar to MMG, there's a new env variable FIRETEXT_URL that can be
used to override the Firetext api URL.

This will be used to stub out both providers during the load test or
can be used to run a local API against a fake provider endpoint.
2019-04-12 12:03:58 +01:00
Rebecca Law
be9daf3454 Until we can fix it properly, changing the max number of files to 500. Hopefully the task will finish in less than 5 minutes. 2019-03-18 13:13:05 +00:00
Rebecca Law
94e0b8b4eb Reduce the number of files sent to the zip-and-send-letter-pdfs
The ftp application is struggling, running out of CPU. This is attempt to help with that.
2019-03-15 15:46:09 +00:00
Leo Hemsted
26243cd2b0 Merge pull request #2377 from alphagov/skip-antivirus
stub out antivirus in dev
2019-02-27 13:38:01 +00:00
Leo Hemsted
653f1ab6b9 stub out antivirus in dev
antivirus is sometimes tough to get running locally - now in dev
antivirus is skipped unless `ANTIVIRUS_ENABLED=1` is set on the command
line. on all other environments it is always enabled.
2019-02-27 10:59:31 +00:00
Pea Tyczynska
211d3741ba Send confirmation emails to users when team manager edits their
email address  or mobile number.
2019-02-26 16:30:29 +00:00
Leo Hemsted
754c65a6a2 create cronitor decorator that alerts if tasks fail
make a decorator that pings cronitor before and after each task run.
Designed for use with nightly tasks, so we have visibility if they
fail. We have a bunch of cronitor monitors set up - 5 character keys
that go into a URL that we then make a GET to with a self-explanatory
url path (run/fail/complete).

the cronitor URLs are defined in the credentials repo as a dictionary
of celery task names to URL slugs. If the name passed in to the
decorator  isn't in that dict, it won't run.

to use it, all you need to do is call `@cronitor(my_task_name)`
instead of `@notify_celery.task`, and make sure that the task name and
the matching slug are included in the credentials repo (or locally,
json dumped and stored in the CRONITOR_KEYS environment variable)
2019-01-18 15:36:53 +00:00
Leo Hemsted
d3d56a3224 separate nightly tasks and other scheduled tasks.
other tasks is anything that is run on a different frequency than
nightly
2019-01-18 15:36:53 +00:00
Rebecca Law
efad58edd8 There is no need to have a separate table to store template monthly statistics. It's easy enough to aggregate the stats from ft_notification_status.
This removes the nightly task, and all the dao methods.
The next PR will remove the table.
2019-01-14 16:30:36 +00:00
Alexey Bezhan
4a26ee1813 Set statement timeout on all DB connections
A recent issue with a long-running query (#2288) highlighted the
fact that even though the original HTTP connection might be closed
(for example after gorouter timeout of 15 minutes, which returns a
504 response to the client), the request worker will not be stopped.

This means that the worker is spending time and potentially DB
resources generating a response that will never be delivered.

Gunicorn's timeout setting only applies to sync workers and there
doesn't seem to be an option to interrupt individual requests in
gevent/eventlet deployments.

Since the most likely (and potentially most dangerous) scenario for
this is a long-running DB query, we can set a statement timeout on
our DB connections. This will raise a sqlalchemy.exc.OperationalError
(wrapping psycopg2.extensions.QueryCanceledError), interrupting the
request after the given timeout has been reached.

This is a Postgres client setting, so the database itself will abort
the transaction when it reaches the set timeout.

Since this will also apply to our celery tasks (including potentially
long-running nightly tasks) we set a timeout of 20 minutes to begin
with.

This can potentially be split in the future to set a different value
for each app, so that we could limit API requests even more.
2019-01-09 14:36:50 +00:00
Pea (Malgorzata Tyczynska)
d7fcd564e0 Merge pull request #2250 from alphagov/switch_providers_update
Update switch providers on slow delivery method and query
2018-12-11 10:27:29 +00:00
Pea Tyczynska
5ed7564066 Remove unused config variables
We don't use FUNCTIONAL_TEST_PROVIDER_SERVICE_ID or
UNCTIONAL_TEST_PROVIDER_SMS_TEMPLATE_ID anymore so we can safely
delete them from config and tests.
2018-12-10 17:25:53 +00:00
Alexey Bezhan
1d4e4eae94 Disable unused SQLAlchemy configuration flags
We don't seem to use recorded queries or modification tracking anywhere
in the app, and both features potentially increase memory usage.

This removes deprecated SQLALCHEMY_COMMIT_ON_TEARDOWN options. It's
been removed from the docs and the default matches the value we set
anyway.
2018-12-04 12:04:18 +00:00
Pea Tyczynska
e5fd027192 Move nightly tasks before introduction of archived flag on jobs 2018-11-28 14:38:59 +00:00
Katie Smith
ff06d120e8 Bump notifications-utils to 3.7.0
Bumped notifications-utils to 3.7.0. Version 3.7.0 includes the
`convert_utc_to_bst` and `convert_bst_to_utc` functions and the
`LETTER_PROCESSING_DEADLINE` constant, so these have been removed from
this repo and anywhere using these has now been updated to get these
from `notifications-utils`.

Also bumped pytest by a patch version to bring in a bug fix.
2018-11-26 12:53:39 +00:00
Alexey Bezhan
1b1b443c69 Rename staging CSV uploads bucket to match other environments 2018-11-21 14:07:04 +00:00
Rebecca Law
7a16ac35bd Remove letter-jobs api
When we first built letters you could only send them via a CSV upload, initially we needed a way to send those files to dvla per job.
We since stopped using this page. So let's delete it!
2018-11-15 17:24:37 +00:00
Pea Tyczynska
987445f1bf ft_notification_status now updates data for 4 days back
This was done so when notification is timed out from sending/pending
to temporary_failure, this change has to always be caught
in the ft_notification_status
2018-11-08 11:52:40 +00:00
Pea Tyczynska
992ef259b2 Rearrange nightly tasks to maintain the correct order
Timeout sending notifications updates up to 4 days of notifications
older than 72 hours to correct failure status. It needs to run before
we update ft_notifications_status table. Otherwise the changes
don't get picked up. Notifications deletion tasks have to run
after those jobs in case our users set short data retention
policy.
2018-11-07 16:43:28 +00:00
Leo Hemsted
162ed0ba5f remove dupes from cfg 2018-10-17 15:29:39 +01:00
Pea Tyczynska
e22e7245fe Use sanitised pdfs for sending and handle invalid pdfs, details below:
- pass new, sanitised pdf for sending
- move invalid pdfs to a newly created bucket
- set status fro notifications that failed pdf validation to a new status validation-failed
- adjust existing tests
2018-10-16 17:30:35 +01:00
Leo Hemsted
2ed50e760f Revert "Celery 4" 2018-10-09 13:27:49 +01:00
Leo Hemsted
e0551547d2 wait 20 seconds between long polls 2018-10-03 14:11:30 +01:00
Leo Hemsted
1786f99400 construct celery queues once in the base config
previously, we were confusing things by appending to CELERY_QUEUES in
both dev and test configs - these are executed at import time, so the
list contained all queues twice, regardless of what config you're
actually using.

Fortunately, the -Q command that we supply the workers with overrides
this config option, so other environments weren't affected. Given that,
we can tidy up this code by just declaring it in the base config every
time
2018-10-03 14:11:30 +01:00
Pea Tyczynska
85e57025b2 Define MMG inbound credentials for local in Development environment 2018-09-25 15:10:16 +01:00
Rebecca Law
f1b04193ca In this PR we remove trigger-letter-pdfs-for-day scheduled task and just call collate_letter_pdfs_for_day instead.
There was a datetime bug in the query which resulted in files not being sent to the postal provider.
The trigger-letter-pdfs-for-day task is no longer needed, so rather than fix the query just call collate_letter_pdfs_for_day directly.
Less code is always better.

Deployment considerations: I realized this is strictly not backwards compatible if the scheduled job is in progress and a task is on the queue that no longer exists. This is ok since we will deploy this well before 17:50.
2018-09-12 17:16:34 +01:00
Athanasios Voutsadakis
1e4c2ee796 Disable header check on preview
This header was introduced to ensure that no traffic was being
directed straight to the .cloudapps.digital domain. This was especially
useful for the non-production environments where access to the proper
domains is allowed for specific IP addresses while the cloudapps.digital
ones are open.

Moving to paas custom domains [1] will allow us to stop using the paas
proxies and, as a result, unbind the cloudapps.digital domain from our
apps.

This means that the X-Custom-Forwarder will become obsolete since all
our requests will be coming directly to our domain (albeit through
cloudfront) so any IP restriction can be implemented with a route
service [2].

1: https://docs.cloud.service.gov.uk/deploying_services.html#set-up-a-custom-domain-using-the-cdn-route-service
2: https://docs.cloud.service.gov.uk/deploying_services.html#route-services
2018-08-23 13:23:14 +01:00
Katie Smith
a87be9b74a Use new value of SMS_CHAR_COUNT_LIMIT from utils
Admin, API and utils were all defining a value for SMS_CHAR_COUNT_LIMIT.
This value has been updated in notifications-utils to allow text
messages to be 4 fragments long and notifications-api now gets the value of
SMS_CHAR_COUNT_LIMIT from notifications-utils instead of defining it in
config.

Also updated some tests to check for the higher limit.
2018-08-16 16:34:34 +01:00
Pea Tyczynska
c0f309a2a6 Delete scheduled task to populate monthly_billing 2018-07-30 11:06:04 +01:00
Rebecca Law
317ab149f4 The periodic task to populate ft_notification_status was calling the wrong task, this fixes that. 2018-07-04 14:12:47 +01:00
Rebecca Law
1363244a2b Added the scheduling for the task 2018-06-20 16:48:03 +01:00
Leo Hemsted
897ab93148 zendesk instead of deskpro 2018-04-27 16:36:39 +01:00
Alexey Bezhan
204aaf172d Add a document download client
Allows uploading documents to the Document Download API.
The client is configured with an API host and auth token. There's
no need for a flag to disable the client in the test environments
at the moment since the upload is only triggered by a specific
payload which would only be sent with an explicit goal of using
document download.
2018-04-09 16:30:16 +01:00
Leo Hemsted
8e73961f65 add new redis template usage per day key
We've run into issues with redis expiring keys while we try and write
to them - short lived redis TTLs aren't really sustainable for keys
where we mutate the state. Template usage is a hash contained in redis
where we increment a count keyed by template_id each time a message is
sent for that template. But if the key expires, hincrby (redis command
for incrementing a value in a hash) will re-create an empty hash.

This is no good, as we need the hash to be populated with the last
seven days worth of data, which we then increment further. We can't
tell whether the hincrby created the key, so a different approach
entirely was needed:

* New redis key: <service_id>-template-usage-<YYYY-MM-DD>. Note: This
  YYYY-MM-DD is BTC time so it lines up nicely with ft_billing table
* Incremented to from process_notification - if it doesn't exist yet,
  it'll be created then.
* Expiry set to 8 days every time it's incremented to.

Then, at read time, we'll just read the last eight days of keys from
Redis, and sum them up. This works because we're only ever incrementing
from that one place - never setting wholesale, never recreating the
data from scratch. So we know that if the data is in redis, then it is
good and accurate data.

One thing we *don't* know and *cannot* reason about is what no key in
redis means. It could be either of:

* This is the first message that the service has sent today.
* The key was deleted from redis for some reason.

Since we set the TTL to so long, we'll never be writing to a key that
previously expired. But if there is a redis (or operator) error and the
key is deleted, then we'll have bad data - after any data loss we'll
have to rebuild the data.
2018-04-03 16:12:54 +01:00
Rebecca Law
0701b2546d Remove test crontab minute 2018-03-26 10:30:08 +01:00