Commit Graph

477 Commits

Author SHA1 Message Date
Rebecca Law
891a80addf Added notification id to the log message for upload pdf to make it easier to search for the letter notification in the logs. 2018-03-01 10:37:07 +00:00
Rebecca Law
7faf375375 Merge pull request #1695 from alphagov/org-user-endpoints
Organisation user endpoints
2018-02-26 16:27:01 +00:00
Alexey Bezhan
8971a5adce Upload pre-compiled letter PDF to S3
Pre-compiled letter endpoint uploads PDF contents to S3 directly
instead of creating a letter task to generate PDF using template
preview.

This moves some of the utility functions used by existing letter
celery tasks to app.letters.utils, so that they can be reused by
the API endpoint.
2018-02-23 17:52:25 +00:00
Leo Hemsted
5b71d2f36e add org invite template to db 2018-02-23 10:45:18 +00:00
Leo Hemsted
a2a1c5e9af add organisation invite rest and dao 2018-02-23 10:45:18 +00:00
Rebecca Law
385653af44 Added a new exception type for DVLAException.
The Notify team needs to investigate when a notification is marked as failed.
We will process the whole file and mark the notifications with the appropriate status, if any are failed an exception is raised.
The exception will trigger a cloud watch error for the team to investigate.
2018-02-22 15:05:37 +00:00
Alexey Bezhan
8a67b714e2 Merge pull request #1651 from alphagov/eventlet-celery-workers-service-callbacks
Switch service callback workers to use eventlet pool implementation
2018-02-15 11:08:54 +00:00
Rebecca Law
e736c90d00 Switch to using the pdf letter flow.
When sending letters always use the pdf letter flow regardless of service permissions.
2018-02-13 18:38:32 +00:00
Alexey Bezhan
9eada23392 Release DB connection before executing service API callback
Flask-SQLAlchemy sets up a connection pool with 5 connections and will
create up to 10 additional connections if all the pool ones are in use.

If all connections in the pool and all overflow connections are in
use, SQLAlchemy will block new DB sessions until a connection becomes
available. If a session can't acquire a connections for a specified
time (we set it to 30s) then a TimeoutError is raised.

By default db.session is deleted with the related context object
(so when the request is finished or app context is discarded).

This effectively limits the number of concurrent requests/tasks with
multithreaded gunicorn/celery workers to the maximum DB connection pool
size. Most of the time these limits are fine since the API requests are
relatively quick and are mainly interacting with the database anyway.

Service callbacks however have to make an HTTP request to a third party.
If these requests start taking a long time and the number of threads is
larger than the number of DB connections then remaining threads will
start blocking and potentially failing if it takes more than 30s to
acquire a connection. For example if a 100 threads start running tasks
that take 20s each with a max DB connection pool size of 10 then first 10
threads will acquire a connection right away, next 10 tasks will block for
20 seconds before the initial connections are released and all other tasks
will raise a TimeoutError after 30 seconds.

To avoid this, we perform all database operations at the beginning of
the task and then explicitly close the DB session before sending the
HTTP request to the service callback URL. Closing the session ends
the transaction and frees up the connection, making it available for
other tasks. Making calls to the DB after calling `close` will acquire
a new connection. This means that tasks are still limited to running
at most 15 queries at the same time, but can have a lot more concurrent
HTTP requests in progress.
2018-02-13 16:44:30 +00:00
Alexey Bezhan
599e611278 Create an app context for each celery task run
Celery tasks require an active app context in order to use app logger,
database or app configuration values. Since there's no builtin support
we create this context by using a custom celery task class NotifyTask.

However, NotifyTask was using `current_app`, which itself is only
available within an app context so the code was pushing an initial
context in run_celery.py.

This works with the default prefork pool implementation, but raises
a "working outside of application context" error with an eventlet
pool since that initial  application context is local to a thread
and eventlet celery worker pool will create multiple green threads.

To avoid this, we bind NotifyTask to the app variable with a closure
and use that variable instead of `current_app` to create the context
for executing the task. This avoids any issues caused by shared
initial app context being lost when spawning additional worker threads.

We still need to keep the context push in run_celery.py for prefork
workers since it's required for logging events outside of tasks
(eg logging during `worker_process_shutdown` signal processing).
2018-02-13 16:44:30 +00:00
Leo Hemsted
1ef6718918 Merge pull request #1646 from alphagov/add-tags-to-pdf-letters
upload letter pdfs with retention tag
2018-02-12 15:03:53 +00:00
Leo Hemsted
093e8083e0 upload letter pdfs with retention tag
so we can delete them automatically with s3's lifecycle policy
2018-02-09 17:13:37 +00:00
Richard Chapman
632364633b Fixed typo in call, should be self.name not Task.name.
Task.name always returned None.
2018-02-08 17:47:59 +00:00
Rebecca Law
8aa829e93a Merge pull request #1622 from alphagov/reduce-logging
Reducing the logging for the life cycle of a notification
2018-02-08 16:57:21 +00:00
Richard Chapman
d855b4e4ec Removed statsd from the api and use the statsd in the utils library.
The statsd code was added to the utils library a while ago, uses the
statsd from the util library and therefore consolidates the code into
once place.
2018-02-06 09:52:15 +00:00
Richard Chapman
2d670e8cf0 Updated utils to the latest version. This version of utils has less
logging at info level and as such no longer prints out the celery task
timing which are found to be use to find out if a tasks has been called
but also the timing for the task. Added an extra timing message for
celery tasks so that it can be determined if the these are less frequent
than the API calls and provide more useful information
2018-02-05 14:58:02 +00:00
kentsanggds
4773e1b51b Merge pull request #1617 from alphagov/ken-set-page-count-0-fake-dvla-response
Update the dvla response data to 0 page count
2018-02-05 10:48:28 +00:00
Rebecca Law
dce79832ff As Notify matures we probably need less logging, especially to report happy path events.
This PR is a proposal to reduce the average messages we see for a single notification from about 7 messages to 2.

Messaging would change to something like this:
February 2nd 2018, 15:39:05.885	Full delivery response from Firetext for notification: 8eda51d5-cd82-4569-bfc9-d5570cdf2126
{'status': ['0'], 'reference': ['8eda51d5-cd82-4569-bfc9-d5570cdf2126'], 'time': ['2018-02-02 15:39:01'], 'code': ['000']}
February 2nd 2018, 15:39:05.885	Firetext callback return status of 0 for reference: 8eda51d5-cd82-4569-bfc9-d5570cdf2126
February 2nd 2018, 15:38:57.727	SMS 8eda51d5-cd82-4569-bfc9-d5570cdf2126 sent to provider firetext at 2018-02-02 15:38:56.716814
February 2nd 2018, 15:38:56.727	Starting sending SMS 8eda51d5-cd82-4569-bfc9-d5570cdf2126 to provider at 2018-02-02 15:38:56.408181
February 2nd 2018, 15:38:56.727	Firetext request for 8eda51d5-cd82-4569-bfc9-d5570cdf2126 finished in 0.30376038211397827
February 2nd 2018, 15:38:49.449	sms 8eda51d5-cd82-4569-bfc9-d5570cdf2126 created at 2018-02-02 15:38:48.439113
February 2nd 2018, 15:38:49.449	sms 8eda51d5-cd82-4569-bfc9-d5570cdf2126 sent to the priority-tasks queue for delivery

To somthing like this:
February 2nd 2018, 15:39:05.885	Firetext callback return status of 0 for reference: 8eda51d5-cd82-4569-bfc9-d5570cdf2126
February 2nd 2018, 15:38:49.449	sms 8eda51d5-cd82-4569-bfc9-d5570cdf2126 created at 2018-02-02 15:38:48.439113
2018-02-02 15:55:25 +00:00
Rebecca Law
cf8be03c5e Fix bug with sending deskpro tickets in production.
The NOTIFY_ENVIRONMENT variable is set to `production` from the run_paas_app script, but that is overwritten with `live` in the create_app function when starting an application.
Although this is confusing and it would be good to resolve that. It is a larger piece of work. For now I have included booth strings in the if condition, that way when we do migrate the code we will not have an issue with these two methods.
2018-02-02 11:27:58 +00:00
Ken Tsang
9122efc45e Update the dvla response data to 0 page count
- as the response is fake, the notifications billable_unit is left at 0, the fake dvla response should also be 0. Otherwise there will be confusing logs reporting mismatched page count and billable units which are just research ones.
2018-01-29 16:39:53 +00:00
Ken Tsang
377afc3ed4 Update letter processing for letter pdfs in research mode
- call create fake response file task if in research mode only on preview and development environments to not impact response files on staging and live
2018-01-26 15:04:56 +00:00
Ken Tsang
c14452d8b1 Add create fake letter response file
- used by letter functional tests to ensure that we can create the response file necessary to trigger the update to a delivered state
2018-01-26 14:45:36 +00:00
Ken Tsang
04beff7d05 Return sorted ACK files
- also fixes unit test failure due to random order of filenames
2018-01-26 14:45:36 +00:00
Ken Tsang
edd030e5dd Refactor code to move business logic
- DVLA specific logic is now moved to letters_pdf_tasks
2018-01-26 14:45:36 +00:00
Venus Bailey
82e34f7699 Merge branch 'master' into letter-deskpro-ack-mod 2018-01-23 11:32:49 +00:00
venusbb
d93b0d12d1 Minor change to raising deskpro ticket and errors when letter acknowledgement fails. 2018-01-23 09:51:43 +00:00
Katie Smith
84e25d6b98 Compare letter page count with billable units in DVLA response file
We compare the page_count in the response file we receive from the DVLA
with the billable_units of the letter. If these don't match, we log an
error.
2018-01-23 08:59:01 +00:00
Katie Smith
f1c75c5c5d Change notification status of failed letters
- Changed the notification status of letters for letters that DVLA marks
as 'failed' from NOTIFICATION_TECHNICAL_FAILURE to
NOTIFICATION_TEMPORARY_FAILURE.
2018-01-23 08:59:01 +00:00
venusbb
2edb0c0883 Grammar change to despro ticket
Grammar change to despro ticket
Strip sets of empty elements
Strip subfolder in returned zip file list
ar change to despro ticket
2018-01-22 14:51:06 +00:00
venusbb
cf30e69e8c Processing ack files only send alerts in production environment.
Processing ack files only send alerts in production environment.
Deskpro alerts include bucket names for debugging purpose.
2018-01-22 12:44:03 +00:00
venusbb
4ffb84de36 Comparison of date needs to use AWS S3 format 2018-01-19 09:24:03 +00:00
venusbb
6a790b59aa Merge branch 'master' of https://github.com/alphagov/notifications-api into letter-S3zipchange-deskpro 2018-01-18 14:45:04 +00:00
venusbb
99dda99890 Use set rather than list to compare ack file and zip files difference 2018-01-18 14:44:23 +00:00
venusbb
357ec3a7d5 Call Deskpro ticket when there is an error 2018-01-18 11:06:07 +00:00
venusbb
8f5a5f8105 Parse acknowledgement files against .ZIP.TXT created by ftp app.
- Also convert the files info to upper() for comparison rather than lower
because original file names are in upper case. The unit tests contain examples of the returned lists.
2018-01-18 10:44:36 +00:00
Alexey Bezhan
70e3b07308 Disable Deskpro letters still sending alert in preview and staging
Since preview and staging environments don't have a full DVLA
integration they're likely to contain letter notifications in
a 'sending' state. To avoid spamming Deskpro we skip the check
unless we're in a production or test environment.
2018-01-17 17:14:20 +00:00
Alexey Bezhan
29777c3123 Add a celery scheduled task to alert if letters are still sending
We should receive a response file from DVLA by 4pm the next working
day (next Monday for letters created on Friday, Saturday or Sunday).

Response file triggers a task to update the letters status from
'sending' to either 'failed' or 'delivered', at which point there
should be no letter notifications in the 'sending' state for that day.

To catch any errors in the process (eg a missing response file from
DVLA) we add a scheduled task that checks letter notifications for
previous day (or Friday when run on Monday) and raises a Deskpro
ticket if it finds any in a 'sending' state. We're checking letter
notifications based on the `sent_at` date, which is set when the
letter PDF is sent to DVLA (so for letters created after 5:30pm it
will be the next day).

The task runs at 4:30pm, which should give the response file processing
task enough time to finish if the file was uploaded at 4pm.
2018-01-17 15:35:16 +00:00
venusbb
9179802717 Fix a typo error on task argument
Modify unit test to be more robust
2018-01-17 12:21:56 +00:00
venusbb
a90596fa1b Merge branch 'master' of https://github.com/alphagov/notifications-api into raise-alert-when-no-ack-file 2018-01-16 16:20:40 +00:00
venusbb
cd2e98c388 Change datetime to use utc 2018-01-16 16:06:08 +00:00
venusbb
f273b23c25 Get ack files only from day before the ack file is received.
Take care of upper and lower case of file names and contents
Add a test for s3 get_list_of_files_by_suffix
2018-01-16 09:34:09 +00:00
Leo Hemsted
218fc5e14d only send letters in created state to ftp app for zipping
this means if we end up with some notifications sending and others not,
due to problems with the ftp connectivity for example, we don't re-send
those that worked.

As a reminder, letter pdf notifications start as created and stay that
way until we have sent the zip file to DVLA, at which point they are
updated to sending
 #
2018-01-15 17:00:00 +00:00
venusbb
24b785e7e0 Added process for dvla acknowledgement file
Daily schedule task to check ack file against zip file lists
if we haven't receive ack for a zip file, raise a 500 exception
2018-01-12 15:44:00 +00:00
Rebecca Law
9c4e43bfac Some pseudo code and notes of how to implement a check for the letter acknowledgement file. 2018-01-11 16:37:39 +00:00
Katie Smith
83711a9a0d Remove the delivery receipt callback for letters
This has been removed because the services only see one status.

Pivotal story: https://www.pivotaltracker.com/story/show/153674962
2018-01-03 16:41:10 +00:00
Katie Smith
644b110a8d Group letters into a max number of files for sending to DVLA
Grouping the letters into a maximum number of files is necessary because
the SQS task needs to be under a certain size. We also compress the task
when sending.
2018-01-03 11:31:22 +00:00
Leo Hemsted
9debc96c8c exclude non-pdfs from collate task
and fix celery kwargsg
2018-01-02 10:39:21 +00:00
Leo Hemsted
309b4d7d33 add collate-letter-pdfs task
add collate-letter-pdfs task (name pending). This retrieves a list of
letter pdf files (just the metadata, not the actual data) from s3, and
loops through them, calling the ftp task zip-and-send-letter-pdfs. It
groups them up by adding them to lists while counting the total
filesize, if it gets over a certain filesize (currently set to 500mb)
it breaks at that chunk, sends off that list of files to the ftp app,
and then starts building up a new list.

DVLA have a hard 2gb limit on how big the zip files we can send is -
however we're going to be limited by the amount of memory on the ftp
app well before we get around to handling 2gb of pdf data - so the
limit is 500mb for now. We'll adjust it after we see how ftp performs.
2018-01-02 10:39:21 +00:00
Ken Tsang
8103540261 Renamed run-letter-pdfs to trigger-letter-pdfs-for-day
- also set optional date_to_process argument for dao_get_count_of_letters_to_process_for_date to None, so it's set in the code instead
2017-12-19 13:23:55 +00:00
Ken Tsang
88f3c17463 Trigger collate-letter-pdfs-for-day task if notis to process 2017-12-19 13:23:55 +00:00