Admin, API and utils were all defining a value for SMS_CHAR_COUNT_LIMIT.
This value has been updated in notifications-utils to allow text
messages to be 4 fragments long and notifications-api now gets the value of
SMS_CHAR_COUNT_LIMIT from notifications-utils instead of defining it in
config.
Also updated some tests to check for the higher limit.
Allows uploading documents to the Document Download API.
The client is configured with an API host and auth token. There's
no need for a flag to disable the client in the test environments
at the moment since the upload is only triggered by a specific
payload which would only be sent with an explicit goal of using
document download.
We've run into issues with redis expiring keys while we try and write
to them - short lived redis TTLs aren't really sustainable for keys
where we mutate the state. Template usage is a hash contained in redis
where we increment a count keyed by template_id each time a message is
sent for that template. But if the key expires, hincrby (redis command
for incrementing a value in a hash) will re-create an empty hash.
This is no good, as we need the hash to be populated with the last
seven days worth of data, which we then increment further. We can't
tell whether the hincrby created the key, so a different approach
entirely was needed:
* New redis key: <service_id>-template-usage-<YYYY-MM-DD>. Note: This
YYYY-MM-DD is BTC time so it lines up nicely with ft_billing table
* Incremented to from process_notification - if it doesn't exist yet,
it'll be created then.
* Expiry set to 8 days every time it's incremented to.
Then, at read time, we'll just read the last eight days of keys from
Redis, and sum them up. This works because we're only ever incrementing
from that one place - never setting wholesale, never recreating the
data from scratch. So we know that if the data is in redis, then it is
good and accurate data.
One thing we *don't* know and *cannot* reason about is what no key in
redis means. It could be either of:
* This is the first message that the service has sent today.
* The key was deleted from redis for some reason.
Since we set the TTL to so long, we'll never be writing to a key that
previously expired. But if there is a redis (or operator) error and the
key is deleted, then we'll have bad data - after any data loss we'll
have to rebuild the data.
The JobStatistics table is going to be deleted. There are currently
3 tasks which use the JobStatistics model via the Statistics DAO, so we
need to make sure that these tasks aren't being used before they are
deleted in a separate PR.
This commit deletes:
* The `create_initial_notification_statistic_tasks` function which gets
used to call the `record_initial_job_statistics` task.
* The `create_outcome_notification_statistic_tasks` function which gets
used to call the `record_outcome_job_statistics` task.
* And the scheduling of the `timeout-job-statistics` scheduled task.
* Added is_precompiled_letter method to letter/utils.py
* Added tests for letter/utils.py
* Added tests for the rest endpoint
* Moved the Precompiled name to a central location
* Added hidden field to the test method to create a template
This will continue to update the notification history for letter notifications.
We currently have an issue where the responses to letters from the provider is taking a long time.
This is due to the manual nature of their process.
Updating the status of the letter will still work if the notification has been purged.
Also turned back on the purge letter notification scheduled task.
There's no reason to have things that never change in environment.sh.
you'll want to update your environment.sh, then restart your shells
(`exec bash` or `exec zsh` etc)
This also changes the database to be set statically in the config, but
overridable from the command line if you need to - for example, jenkins
will override it with the dockerised postgres uri.
This is to address some errors we saw yesterday such as:
`sqlalchemy.exc.TimeoutError: QueuePool limit of size 5 overflow 10
reached, connection timed out, timeout 30`
Related flask-sqlalchemy docs:
http://flask-sqlalchemy.pocoo.org/2.3/config/#configuration-keys
PR #1550 added the rate_limit column to the Service table.
This PR removes the rate limits from the config and uses rate_limit from
the Service model instead. Rate limits are still separated into 'team',
'normal' and 'test', but these values are the same for a service.
Pivotal story https://www.pivotaltracker.com/story/show/153992529
During database upgrades and database fail overs there has been errors
because the database connection stays open, when a query is run the
query fails and the connection is re-established. To avoid these errors
shorter timeouts have been used to keep the connections from getting
stale.
- SQLALCHEMY_POOL_TIMEOUT timeout idle connections after 30 secs
- Updated SQLALCHEMY_POOL_RECYCLE to recycle the connection every 5 mins
See guide on optimistic disconnect handling - using the pool recycle
as a way to manage this:
http://docs.sqlalchemy.org/en/latest/core/pooling.html#disconnect-handling-optimistic
Grouping the letters into a maximum number of files is necessary because
the SQS task needs to be under a certain size. We also compress the task
when sending.
add collate-letter-pdfs task (name pending). This retrieves a list of
letter pdf files (just the metadata, not the actual data) from s3, and
loops through them, calling the ftp task zip-and-send-letter-pdfs. It
groups them up by adding them to lists while counting the total
filesize, if it gets over a certain filesize (currently set to 500mb)
it breaks at that chunk, sends off that list of files to the ftp app,
and then starts building up a new list.
DVLA have a hard 2gb limit on how big the zip files we can send is -
however we're going to be limited by the amount of memory on the ftp
app well before we get around to handling 2gb of pdf data - so the
limit is 500mb for now. We'll adjust it after we see how ftp performs.
SQL Alchemy config changes were made to decrease the downtime of the
application. The last test only had 1 min of downtime in the upgrade
period i.e. 40 mins. Tested without the config changes to double
check the change had the desired effect. Adding back in so we can test
the changes under load and performance test outside of upgrade.
SQL Alchemy config changes were made to decrease the downtime of the
application. The last test only had 1 min of downtime in the upgrade
period i.e. 40 mins. Reverting the changes so that the same process
can be followed to ensure the changes had the desired effect.