Commit Graph

240 Commits

Author SHA1 Message Date
Chris Hill-Scott
61c9e50ed9 Add public API endpoint to create emergency alerts
We know there is at least one system which wants to integrate with
Notify to send out emergency alerts, rather than creating them manually.

This commit adds an endpoint to the public API to let them do that.

To start with we’ll just let the system create them in a single call,
meaning they still have to be approved manually. This reduces the risk
of an attacker being able to broadcast an alert via the API, should the
other system be compromised.

We’ve worked with the owners of the other system to define which fields
we should care about initially.
2021-01-26 16:24:44 +00:00
Pea Tyczynska
95deb5a52f Move DATETIME_FORMAT from app to app.utils
To avoid cyclical import issues
2020-12-18 17:39:35 +00:00
Leo Hemsted
087cc5053d separate cbc proxy into separate clients
this is a pretty big and convoluted refactor unfortunately.

Previously:

There was one global `cbc_proxy_client` object in the app. This class had
the information about how to invoke the bt-ee lambda, and handled all
calls to lambda, including calls to the canary (which is a separate
lambda).

The future:

There's one global `cbc_proxy_client`. This knows about the different
provider functions and lambdas, and you'll need to ask this client for a
proxy for your chosen provider. Call `cbc_proxy_client.get_proxy('ee')`
and it'll return you a proxy that knows what ee's lambda function is,
how to transform any content in a way that is exclusive to ee, and in
future how to parse any response from ee.

The present:

I also cleaned up some duplicate tests.
I'm really not sure about the names of some of these variables - in
particular `cbc_proxy_client` isn't a client - it's more of a Java-style
factory, where you call a function on it to get the client of your
choice.
2020-11-19 15:50:37 +00:00
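A minimal sketch of the factory shape described in the commit above — the class names, lambda function name and boto3-style `invoke` call are illustrative assumptions, not the actual Notify implementation:

```python
# Hypothetical sketch of the get_proxy() factory pattern; names here are
# illustrative, not the real notifications-api classes.
class CBCProxyEE:
    lambda_name = "ee-cbc-proxy"  # assumed function name, for illustration

    def __init__(self, lambda_client):
        self._lambda_client = lambda_client

    def create_and_send_broadcast(self, payload):
        # transform the payload in whatever way is specific to ee,
        # then invoke ee's lambda function
        self._lambda_client.invoke(FunctionName=self.lambda_name, Payload=payload)


class CBCProxyClient:
    # more of a factory than a client: ask it for the proxy for your provider
    _proxies = {"ee": CBCProxyEE}

    def __init__(self, lambda_client):
        self._lambda_client = lambda_client

    def get_proxy(self, provider):
        return self._proxies[provider](self._lambda_client)
```

Usage would then look something like `cbc_proxy_client.get_proxy('ee').create_and_send_broadcast(payload)`.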
Leo Hemsted
732c203d3e rename clients to notification_provider_clients
i think it's causing havoc with my attempts to mock stuff in the
`app.clients` directory because it's also accessible at that path. the
name's super vague and doesn't explain what it is anyway
2020-11-17 13:34:58 +00:00
Toby Lorne
efa6fb0bd9 clients: explicitly declare cbc_proxy_client
it is global rather than local, but Python cannot infer this and we get
an UnboundLocalError

Signed-off-by: Toby Lorne <toby.lornewelch-richards@digital.cabinet-office.gov.uk>
2020-10-20 13:01:17 +01:00
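For context, this is the kind of pattern the commit is about (a generic sketch, not the actual Notify code): assigning to a module-level name inside a function makes Python treat the name as local unless it is declared `global`.

```python
# Generic illustration of the UnboundLocalError this commit fixes;
# not the actual notifications-api code.
cbc_proxy_client = None  # module-level (global) name


def init_app(make_client):
    # Without this declaration, the assignment below would make
    # cbc_proxy_client a local variable, so the read in the `if`
    # would raise UnboundLocalError.
    global cbc_proxy_client
    if cbc_proxy_client is None:
        cbc_proxy_client = make_client()
```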
Toby Lorne
33ea75930a clients: add cbc proxy clients
We are going to invoke a lambda to send a message to the CBC

We need a CBC Proxy Client to do this

The Client will be able to send/update/cancel broadcasts in the CBC

Unless we have configured the app with AWS credentials for the
CBCProxyClient, we just want to use a client that does nothing: the noop
client

The AWS access keys are separate for the CBC Proxy vs other Notify AWS
things because the CBC Proxy lives in another AWS account

Signed-off-by: Toby Lorne <toby.lornewelch-richards@digital.cabinet-office.gov.uk>
Co-authored-by: Pea <pea.tyczynska@digital.cabinet-office.gov.uk>
Co-authored-by: Katie <katie.smith@digital.cabinet-office.gov.uk>
2020-10-20 11:23:16 +01:00
Rebecca Law
da4efbbb82 Catch and log any exception thrown in the checkin event method.
We don't want an exception while recording metrics to affect a user action. A KeyError was thrown today, which meant that a user saw a 500; the action being performed was downloading a document from the document-download-frontend app. By catching the error we prevent the user from seeing a 500 when recording the connection metric fails.

Also catch the exception in the checkout event.
2020-09-10 12:25:35 +01:00
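The shape of the fix is roughly the following (a hedged sketch, not the actual Notify code): wrap the metric recording so a failure there is logged rather than surfacing as a 500.

```python
import logging

logger = logging.getLogger(__name__)


def record_checkin_metric(record, *args):
    # Sketch only: `record` is whatever callable actually records the metric.
    # Never let a failure in it (e.g. a KeyError) propagate into the user's
    # request; log it instead.
    try:
        record(*args)
    except Exception:
        logger.exception("Recording connection metric failed")
```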
David McDonald
34538bcad8 Fix risk of uncaught exceptions due to gds metrics
Similar to https://github.com/alphagov/notifications-admin/pull/3510

Because of https://github.com/alphagov/gds_metrics_python/pull/8
2020-07-10 10:37:09 +01:00
Leo Hemsted
67f6dcae45 add broadcast message crud
new blueprint `/service/<id>/broadcast-message` with the following
endpoints:

* GET / - get all broadcast messages for a service
* GET /<id> - get a single broadcast message
* POST / - create a new broadcast message
* POST /<id> - update an existing broadcast message's data
* POST /<id>/status - move a broadcast message to a new status

I've kept the regular data update (eg personalisation, start and end
times) separate from the status update, just to keep separation of
concerns a bit more rigid, especially around who can update.

I've included schemas for the three POSTs, they're pretty
straightforward.
2020-07-09 14:19:56 +01:00
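A minimal Flask sketch of that route layout (view bodies are placeholders; the real Notify views, schemas and dao calls are more involved):

```python
from flask import Blueprint, jsonify, request

broadcast_message_blueprint = Blueprint(
    "broadcast_message",
    __name__,
    url_prefix="/service/<uuid:service_id>/broadcast-message",
)


@broadcast_message_blueprint.route("/", methods=["GET"])
def get_broadcast_messages_for_service(service_id):
    return jsonify(broadcast_messages=[])  # placeholder


@broadcast_message_blueprint.route("/<uuid:broadcast_message_id>", methods=["GET"])
def get_broadcast_message(service_id, broadcast_message_id):
    return jsonify({})  # placeholder


@broadcast_message_blueprint.route("/", methods=["POST"])
def create_broadcast_message(service_id):
    data = request.get_json()  # validated against a schema in the real code
    return jsonify(data), 201


@broadcast_message_blueprint.route("/<uuid:broadcast_message_id>", methods=["POST"])
def update_broadcast_message(service_id, broadcast_message_id):
    return jsonify(request.get_json())


@broadcast_message_blueprint.route("/<uuid:broadcast_message_id>/status", methods=["POST"])
def update_broadcast_message_status(service_id, broadcast_message_id):
    return jsonify(request.get_json())
```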
Rebecca Law
1be479a15a Merge branch 'master' into clean-up 2020-06-30 09:05:12 +01:00
Rebecca Law
7fb3b3db18 Small changes to tidy up the code 2020-06-30 09:04:24 +01:00
David McDonald
12f460adc5 Turn off statsd wrapper for synchronous statsd calls during POSTs
This commit turns off StatsD metrics for the following
- the `dao_create_notification` function
- the `user-agent` metric
- the response times and response codes per flask endpoint

This has been done so that creating text messages or emails does not call
out to StatsD during the request process. These are the three metrics
currently called while processing one of those requests, so they have been
removed from the API.

The reason for removing the calls out to StatsD when processing a
request to send a notification is that we have seen two incidents
recently affected by DNS resolution for StatsD (either by a slowdown in
resolution time or a failure to resolve). These POST requests are our
most critical code path and we don't want them to be affected by any
potential unforeseen trouble with StatsD DNS resolution.

We are not going to miss these metrics:
- the `dao_create_notification` metric is rarely/never looked at
- the `user-agent` metric is rarely/never looked at and can be obtained from
  our logs if we want it
- the response times and response codes per flask endpoint are already
  exposed using the gds metrics python library

I did not remove the statsd metrics from any other parts of the API
because
- As the POST notification endpoints are the main source of web traffic,
  we will have already removed most calls to StatsD, which should
  greatly reduce the chance of there being any further issues with
  DNS resolution
- Some of the other metrics still provide value, so there is no point in
  deleting them if we don't need to
- The metrics on celery tasks will not affect any HTTP requests from
  users as they are async. Also, we do not currently have the
  infrastructure set up to replace them with a Prometheus HTTP endpoint
  that can be scraped, so this would require more work
2020-06-29 12:40:22 +01:00
Rebecca Law
ce32e577b7 Remove the use of schedule_for in post_notifications.
Years ago we started to implement a way to schedule a notification. We hit a problem but we never came up with a good solution and the feature never made it back to the top of the priority list.

This PR removes the code for scheduled_for. There will be another PR to drop the scheduled_notifications table and remove the schedule_notifications service permission

Unfortunately, I don't think we can remove the `scheduled_for` attribute from the notification.serialized method because our clients might fail if something is missing. For now I have left it in but defaulted the value to None.
2020-06-24 14:54:40 +01:00
David McDonald
8b4a424df1 Tidy up 2020-06-12 16:51:44 +01:00
Leo Hemsted
cd9b80f415 set test_errors app fixture to session scope
we have one global metrics variable `metrics = GDSMetrics()`, and we
then call `metrics.init_app` from within the flask application set up.
The v2/test_errors.py app_for_test fixture calls create_app, which would call
metrics.init_app multiple times for the same metrics instance. This
causes errors, so change the fixture to session scope so it only calls
it once per test run.
2020-06-12 14:52:22 +01:00
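A hedged sketch of the fixture change (the import path and create_app arguments are assumptions):

```python
import pytest


# Sketch: creating the app once per test session means metrics.init_app is
# only ever called once for the global GDSMetrics() instance.
@pytest.fixture(scope="session")
def app_for_test():
    from app import create_app  # assumed import path, for illustration

    app = create_app()  # real arguments omitted
    yield app
```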
Leo Hemsted
c4dc0f64c5 add sqlalchemy connection metrics for celery tasks
grab the worker app name and task name rather than the web host and
endpoint. also add a fallback for when we're not in a web request or a
celery task. I think that'll probably happen when we use alembic, or if
we do things from within flask shell
2020-06-12 14:52:22 +01:00
Leo Hemsted
6e32ca5996 add prometheus metrics for connection pools (both web and sql)
add the following prometheus metrics to keep track of general app
instance health.

 # sqs_apply_async_duration

how long does the actual SQS call (a standard web request to AWS) take.
a histogram with default bucket sizes, split up per task that was
created.

 # concurrent_web_request_count

how many web requests is this app currently serving. this is split up
per process, so we'd expect multiple responses per app instance

 # db_connection_total_connected

how many connections does this app (process) have open to the database.
They might be idle.

 # db_connection_total_checked_out

how many connections does this app (process) have open that are
currently in use by a web worker

 # db_connection_open_duration_seconds

a histogram per endpoint of how long the db connection was taken from
the pool for. won't have any data if a connection was never opened.
2020-06-12 14:52:22 +01:00
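A sketch of how metrics like these can be declared with `prometheus_client` — the metric names come from the list above, but the label names and help strings here are assumptions, not the real Notify code:

```python
from prometheus_client import Gauge, Histogram

# Sketch only: labels, buckets and help strings are illustrative.
SQS_APPLY_ASYNC_DURATION = Histogram(
    "sqs_apply_async_duration", "Time taken by the SQS call", ["task_name"]
)
CONCURRENT_WEB_REQUEST_COUNT = Gauge(
    "concurrent_web_request_count", "Web requests this process is currently serving"
)
DB_CONNECTION_TOTAL_CONNECTED = Gauge(
    "db_connection_total_connected", "Open DB connections for this process"
)
DB_CONNECTION_TOTAL_CHECKED_OUT = Gauge(
    "db_connection_total_checked_out", "DB connections currently checked out of the pool"
)
DB_CONNECTION_OPEN_DURATION_SECONDS = Histogram(
    "db_connection_open_duration_seconds",
    "How long a DB connection was checked out for",
    ["endpoint"],
)
```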
Pea Tyczynska
8b7a0b88cb Ensure that aws ses stub client is not run in production 2020-06-04 15:43:47 +01:00
Pea Tyczynska
6422a88c8c Stub SES email client to avoid hitting SES during load testing
If we set an environment variable, we can stub out calls to SES and send
them to our own stub app. If the environment variable is not set, things
work as normal.

To be used alongside
https://github.com/alphagov/notifications-email-provider-stub
2020-06-03 11:11:43 +01:00
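A hedged sketch of the mechanism (the environment variable name and region are placeholders, not necessarily what Notify uses): when the variable is set, point the boto3 SES client at the stub app's URL instead of real SES.

```python
import os

import boto3


def get_ses_client():
    # Placeholder variable name for illustration; if set, SES API calls go
    # to the stub app rather than to real SES.
    stub_url = os.environ.get("SES_STUB_URL")
    if stub_url:
        return boto3.client("ses", region_name="eu-west-1", endpoint_url=stub_url)
    return boto3.client("ses", region_name="eu-west-1")
```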
David McDonald
1ff52bbaad Add GDSMetrics package
As per instructions https://github.com/alphagov/gds_metrics_python

The celery workers don't have an HTTP endpoint so no point in trying to
get prometheus to scrape them.
2020-04-20 18:39:45 +01:00
Katie Smith
64c2061baa Use encryption module from utils
Now that the encryption module has been moved from this app to utils, we
can remove it from here (along with its tests) and import it from utils
instead. This also renames the `encryption.py` file to `hashing.py`,
since it no longer contains the encryption class.
2020-01-24 13:18:37 +00:00
Rebecca Law
63bdb0208b Create a new constant for datetime formats without timezone 2019-12-27 10:27:59 +00:00
David McDonald
203e19bef3 Add uploads blueprint; the endpoint returns a combination of uploaded letters and jobs. The endpoint returns data about each uploaded letter or job, including notification statistics for the upload. The data is ordered by scheduled_for and created_at.
It is likely this endpoint will need additional data for the UI to display; for the first iteration this will enable the /uploads page to show both letters and jobs. Only letters uploaded by the UI are included in the resultset.

Add file name to resultset.
2019-12-06 09:54:51 +00:00
Leo Hemsted
e094dd4bfd remove loadtesting from providers
we don't use it since we wrote our own provider stubs for performance
tests.

this removes it from the api - it's still in the DB and will be
retrieved by queries, but is set to disabled on prod
2019-10-23 11:45:07 +01:00
Rebecca Law
8fdf700b90 Remove print statement 2019-08-19 14:51:21 +01:00
Rebecca Law
b37de7785c Remove the organisation_to_service table.
This table is no longer used or referenced in the code.
We can remove this mapping table now that the organisation to service relationship is handled by the foreign key in Services.
2019-08-15 15:17:53 +01:00
Katie Smith
3b670a0dbe Delete unused delivery blueprint 2019-08-12 10:54:11 +01:00
Leo Hemsted
2c2eb2abad handle 405s
Handle werkzeug http errors separately. Also, improve the generic
`Exception` handler to be more flexible when exceptions don't look
like http errors.

note: because they're 405s the route isn't matched, so the v2 error
handlers don't trip properly. this might be an issue because the
internal endpoints expect a different return format. Might have to
think about how to handle this.
2019-07-26 13:26:20 +01:00
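Roughly the pattern described (a generic Flask sketch, not the actual Notify handlers): handle werkzeug's `HTTPException` subclasses (404, 405, ...) explicitly, and keep a defensive catch-all for exceptions that don't look like HTTP errors.

```python
from flask import jsonify
from werkzeug.exceptions import HTTPException


def register_errors(blueprint):
    @blueprint.app_errorhandler(HTTPException)
    def handle_http_error(error):
        # werkzeug errors carry a status code and description we can reuse
        return jsonify(result="error", message=error.description), error.code

    @blueprint.app_errorhandler(Exception)
    def handle_unexpected_error(error):
        # anything else: don't assume it has a code or description
        return jsonify(result="error", message="Internal server error"), 500
```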
Leo Hemsted
53ecaa3230 remove dvla_organisation
it's been superseded by letter branding
2019-02-13 15:02:18 +00:00
Rebecca Law
7ee1d67df7 Added endpoints for letter-branding. 2019-01-24 17:39:48 +00:00
Rebecca Law
f8eb72a537 Adding rest endpoints for letter-branding 2019-01-24 16:38:52 +00:00
Alexey Bezhan
4a26ee1813 Set statement timeout on all DB connections
A recent issue with a long-running query (#2288) highlighted the
fact that even though the original HTTP connection might be closed
(for example after gorouter timeout of 15 minutes, which returns a
504 response to the client), the request worker will not be stopped.

This means that the worker is spending time and potentially DB
resources generating a response that will never be delivered.

Gunicorn's timeout setting only applies to sync workers and there
doesn't seem to be an option to interrupt individual requests in
gevent/eventlet deployments.

Since the most likely (and potentially most dangerous) scenario for
this is a long-running DB query, we can set a statement timeout on
our DB connections. This will raise a sqlalchemy.exc.OperationalError
(wrapping psycopg2.extensions.QueryCanceledError), interrupting the
request after the given timeout has been reached.

This is a Postgres client setting, so the database itself will abort
the transaction when it reaches the set timeout.

Since this will also apply to our celery tasks (including potentially
long-running nightly tasks) we set a timeout of 20 minutes to begin
with.

This can potentially be split in the future to set a different value
for each app, so that we could limit API requests even more.
2019-01-09 14:36:50 +00:00
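With psycopg2 this can be done by passing a Postgres client option on every connection. A sketch, assuming the engine options are driven from Flask config (the 20-minute value comes from the commit message; the key names are illustrative):

```python
# Sketch only: key names are illustrative. statement_timeout is set in
# milliseconds; 20 minutes = 1,200,000 ms.
SQLALCHEMY_STATEMENT_TIMEOUT_MS = 20 * 60 * 1000

SQLALCHEMY_ENGINE_OPTIONS = {
    "connect_args": {
        "options": "-c statement_timeout={}".format(SQLALCHEMY_STATEMENT_TIMEOUT_MS)
    }
}
# When the timeout is hit, Postgres cancels the query; SQLAlchemy raises
# sqlalchemy.exc.OperationalError wrapping psycopg2's QueryCanceledError.
```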
Alexey Bezhan
b17fd21bb8 Revert "Enable pessimistic DB connection disconnect handling"
Once the DB upgrade is complete we no longer want the added
overhead of the "pre-ping" connection check.
2018-11-30 16:20:08 +00:00
Alexey Bezhan
614a2dae2c Enable pessimistic DB connection disconnect handling
By default, SQLAlchemy will start a transaction with an
existing connection without checking that the connection
is still valid.

Enabling "pre-ping" makes the ORM send a `SELECT 1` when
acquiring a connection, which should help avoid some errors
caused by connections breaking during a DB failover.

The added statement has a constant overhead for all transactions,
so we should only keep it enabled when we're planning to switch
or upgrade the database server.

https://docs.sqlalchemy.org/en/latest/core/pooling.html#disconnect-handling-pessimistic
2018-11-30 15:41:58 +00:00
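In SQLAlchemy this is a single engine flag; a minimal sketch (the connection URL is a placeholder):

```python
from sqlalchemy import create_engine

# pool_pre_ping makes the pool test each connection (a lightweight
# "SELECT 1"-style ping) when it is checked out, replacing it if it is stale.
engine = create_engine(
    "postgresql://localhost/notification_api",  # placeholder URL
    pool_pre_ping=True,
)
```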
Leo Hemsted
fbe34041d6 add template folder CRUD
* create template folder
* rename template folder
* get list of template folders for service (not nested/presented in any
  particular way)
* delete template folder

Also removed `lazy=dynamic` from the `template_folder.templates`
relationship. lazy=dynamic returns a query object (which you can then
filter further). We just want to return the entire fetched list, at
least for now.
2018-10-31 14:28:16 +00:00
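A minimal sketch of the relationship change (models simplified; not the exact Notify definitions):

```python
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()


class TemplateFolder(Base):
    __tablename__ = "template_folder"
    id = Column(Integer, primary_key=True)

    # Previously: relationship("Template", lazy="dynamic"), which returns a
    # query object that can be filtered further. Without it, accessing
    # folder.templates simply loads and returns the whole list.
    templates = relationship("Template", backref="folder")


class Template(Base):
    __tablename__ = "template"
    id = Column(Integer, primary_key=True)
    folder_id = Column(Integer, ForeignKey("template_folder.id"))
```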
Katie Smith
4b030b1583 Create platform-stats blueprint
Created a platform-stats blueprint and moved the new platform stats
endpoint to the new blueprint (it was previously in the service
blueprint). Since the original platform stats route and the new platform
stats route are now in different blueprints, their view functions can
have the same name without any issues.
2018-06-29 16:01:45 +01:00
Rebecca Law
64f077f2f4 New endpoint to return data for complaints. 2018-06-05 14:25:24 +01:00
Chris Hill-Scott
cefa253578 Remove monotonic
> On Python 3.3 or newer, monotonic will be an alias of time.monotonic
> from the standard library. On older versions, it will fall back to an
> equivalent implementation.

– https://pypi.org/project/monotonic/
2018-05-04 10:56:51 +01:00
Leo Hemsted
897ab93148 zendesk instead of deskpro 2018-04-27 16:36:39 +01:00
Alexey Bezhan
204aaf172d Add a document download client
Allows uploading documents to the Document Download API.
The client is configured with an API host and auth token. There's
no need for a flag to disable the client in the test environments
at the moment since the upload is only triggered by a specific
payload which would only be sent with an explicit goal of using
document download.
2018-04-09 16:30:16 +01:00
Athanasios Voutsadakis
463f1eefaf Move proxy header check to auth-requiring endpoints
The main driver behind this is to allow us to enable HTTP healthchecks on
the `/_status` endpoint. The healthcheck requests are happening directly
on the instances without going to the proxy to get the header properly
set.

In any case, endpoints like `/_status` should be generally accessible by
anything without requiring any form of authorization.
2018-03-27 17:37:09 +01:00
Rebecca Law
12046ee85a The endpoints to check the token of an invitation for services and organisations have been merged.
This PR deletes the old endpoints.
2018-02-27 13:46:23 +00:00
Rebecca Law
13ef2d7bae - new endpoint to check the token for an org invitation.
- new endpoint to add user to organisation
- new endpoint to return users for an organisation
2018-02-23 10:45:18 +00:00
Leo Hemsted
0d9aa5c531 add schema and hook up blueprint 2018-02-23 10:45:18 +00:00
Katie Smith
5eee0cf78b Merge pull request #1639 from alphagov/create-new-organisation-model
Create new organisation model
2018-02-12 12:02:58 +00:00
Katie Smith
269923ba28 Add Organisations endpoints
As part of this we also needed to add:
- schemas for validation
- serialize method for Organisation model
2018-02-08 15:22:21 +00:00
Leo Hemsted
08c35b3c72 downgrade lots of routine logging from error/exception to info
most of them are 400s for badly inputted phone numbers etc
2018-02-08 13:38:32 +00:00
Leo Hemsted
ba20010f27 remove organisation from api 2018-02-07 11:39:33 +00:00
Leo Hemsted
2f79da8702 create, edit and use email branding instead of organisation
notable things that have been kept until migration is complete:

* passing in `organisation` to update_service will update email branding
* both `/email-branding` and `/organisation` hit the same code
* service endpoints still return organisation as well as email branding
2018-02-06 11:23:34 +00:00
Richard Chapman
2d670e8cf0 Updated utils to the latest version. This version of utils has less
logging at info level and as such no longer prints out the celery task
timings, which were useful for finding out whether a task had been called
and how long it took. Added an extra timing message for celery tasks so
that it can be determined whether these are less frequent than the API
calls, and to provide more useful information.
2018-02-05 14:58:02 +00:00