This was after we saw an instance of the API failing its healthcheck
even though it was still healthy enough to serve requests to users.
This follows the change we also made to template-preview and admin to
increase the health check timeout. Unlike those apps, where we set it to
10 seconds, we have been less lenient here and chosen only 2.5 seconds.
This was at the suggestion of Toby from PaaS: the api should generally
have quicker response times, and letting an instance stick around for 10
seconds while it was unable to serve requests successfully would cause
more annoyance for users.
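A minimal sketch of what this might look like in a Cloud Foundry manifest,
assuming the attribute being tuned is `health-check-invocation-timeout`; the
app name is illustrative and whether CF accepts a fractional value here
hasn't been checked:

```
applications:
  - name: notify-api                      # illustrative name
    health-check-type: http
    health-check-http-endpoint: /status
    # fail the check if a single request takes longer than 2.5 seconds
    health-check-invocation-timeout: 2.5
```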
these URLs never change, and it led to surprising issues where an
updated default MMG_URL wasn't actually respected on PaaS. These URLs
aren't private and don't need to be stored in credentials.
By not defining them in the manifest, we expect apps to use the defaults
unless `cf set-env` has been specifically used to override them on an app.
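For illustration only (the app name and the one variable shown are made up),
the idea is an env block that simply doesn't mention the provider URLs, with
`cf set-env` as the per-app escape hatch:

```
applications:
  - name: notify-api                  # illustrative
    env:
      # provider URLs such as MMG_URL are deliberately not set here, so the
      # defaults in the app's own config apply. To override on a single app:
      #   cf set-env notify-api MMG_URL https://example.com
      #   cf restage notify-api
      NOTIFY_ENVIRONMENT: preview     # made-up example of a var we do set
```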
we don't use it since we wrote our own provider stubs for performance
tests.
this removes it from the api - it's still in the DB and will be
retrieved by queries, but is set to disabled on prod
deploys take up to five minutes, during which notify-paas-autoscaler
can't scale the app. We saw 502s due to a large volume of traffic
coming in during that time, and we couldn't react because we were
deploying.
If we scale up to 25 instances, the autoscaler won't be able to scale
back down until after the deploy has finished.
- We are running the statsd-exporter on the PaaS now so we can use the
internal UDP route to talk to it
- Only updating in preview and staging for now, so that we can get the
  dashboards fully up to date before switching prod
- We are running a statsd exporter on tools to collect all our statsd
metrics for scraping by Prometheus
- Update preview to point there instead of at the local one, which has
  issues with redeployment and DNS changes
- We've been seeing an issue where, when traffic spikes, the http health
  checks take over 1s and PaaS kills the app
- Port health checks don't care about being stuck in a queue, so they
  should continue to work even under high load
- We have functional tests to catch if a deployment brings up the app
(and so passes port health check) but then doesn't work
- We are running statsd exporter as an app with a public route for
Prometheus to scrape
- This updates preview to send statsd metrics over the CF internal
  networking to the statsd exporter (see the sketch after this list)
- Removes the sidecar statsd exporters too
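A hypothetical env fragment for a worker showing what talking to the exporter
over the internal network could look like; the hostname, port and variable
names are assumptions rather than the real values (`apps.internal` is just
CF's default internal domain):

```
applications:
  - name: notify-delivery-worker                       # illustrative
    env:
      # internal route to the exporter app, reached over the container network (UDP)
      STATSD_HOST: notify-statsd-exporter.apps.internal
      STATSD_PORT: "8125"                              # port is a guess
```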
all apps get a route assigned when using v3-zdt-push.
> By default, the web process has a route and one instance. Other processes have zero instances by default.
([source](https://docs.cloudfoundry.org/devguide/multiple-processes.html))
When we push apps to multiple environments they need different routes
or the second push will fail, so this means that we need to define
routes ourselves for every app.
We're also manually flagging the health-check as either "http" or
"process" - http for the api, process for all others.
If not specified, the healthcheck is set to `port` by cloudfoundry - we've
seen some issues when upgrading the deployment from v2 to v3 using
port: it adds apps to the load balancer when they're not ready, which can
result in 404s. By setting the healthcheck to http it'll wait for the
/status endpoint to return 200, which in turn waits for flask to get
everything up and running properly.
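Roughly what that looks like per app, with a made-up hostname; only the
explicit route and the health check settings come from the text above:

```
applications:
  - name: notify-api                              # illustrative
    routes:
      - route: api.notify-preview.example.com     # must differ per environment
    health-check-type: http
    health-check-http-endpoint: /status           # only routed once this returns 200
```

Worker apps instead get `health-check-type: process`, since they have no
endpoint to poll.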
This is so that the retry-tasks queue, which can have quite a lot of
load, has its own worker, and other queues are paired with queues
that flow similarly (see the sketch after this list):
- letter-tasks with create-letters-pdf-tasks
- job-tasks with database-tasks
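A hypothetical sketch of the resulting worker definitions; the app names and
the celery command line are assumptions, only the queue groupings come from
the list above:

```
applications:
  - name: notify-delivery-worker-retry            # names are made up
    health-check-type: process
    command: celery -A run_celery worker -Q retry-tasks
  - name: notify-delivery-worker-letters
    health-check-type: process
    command: celery -A run_celery worker -Q letter-tasks,create-letters-pdf-tasks
  - name: notify-delivery-worker-jobs
    health-check-type: process
    command: celery -A run_celery worker -Q job-tasks,database-tasks
```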
Running `statsd_exporter` alongside the app process allows us to get
StatsD metrics pushed by workers to Prometheus.
This requires adding a route to the worker instances and binding the
RE prometheus discovery service. So this approach won't work for API
and admin since they already have `gunicorn` bound to the `$PORT`.
Since we're not ready to switch all apps to Prometheus metrics at once,
and we don't currently have a way to push statsd metrics to multiple
destinations, we're using a configuration setting in the manifest template
to switch individual workers over in specific environments.
`local_statsd` contains a list of environments where the app should
use local `statsd_exporter` for pushing statsd metrics instead of
HostedGraphite.
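A hypothetical fragment of the manifest template showing how the setting
might be consumed; apart from `local_statsd` itself, the variable names and
hostnames are made up:

```
    env:
{% if environment in local_statsd %}
      STATSD_HOST: localhost                      # the statsd_exporter running alongside the worker
{% else %}
      STATSD_HOST: statsd.hostedgraphite.com      # hostname is a guess
{% endif %}
```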
NOTIFY_APP_NAME follows precedent and just tries to strip 'notify-'
from the beginning of the string.
`instances` is left out of the manifest entirely if not defined - the app
will come up with the same number of instances as are currently present,
and then the autoscaler will take over anyway
newer versions of the cf api don't allow you to have multiple apps per
manifest file. So, instead of our current inheritance-based model, move
to the newer jinja-based model already used by doc-dl/antivirus/
template-preview.
the new single manifest.yml.j2 file sets a bunch of variables based on
the CF_APP variable - things like NOTIFY_APP_NAME, default instances,
etc. Then the manifest is built up to define all of the app options
based on these defaults. Things default to sensible values, which can
vary based on environment.
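As a rough sketch of the shape of such a template - names, numbers and
structure here are illustrative guesses, not the real manifest.yml.j2:

```
{# per-app defaults derived from CF_APP; everything below is illustrative #}
{% set NOTIFY_APP_NAME = CF_APP | replace('notify-', '', 1) %}
{% if CF_APP == 'notify-api' %}
  {% set default_instances = {'production': 20, 'preview': 2} %}
{% else %}
  {% set default_instances = {} %}
{% endif %}

applications:
  - name: {{ CF_APP }}
{% if environment in default_instances %}
    instances: {{ default_instances[environment] }}
{% endif %}
    env:
      NOTIFY_APP_NAME: {{ NOTIFY_APP_NAME }}
      NOTIFY_ENVIRONMENT: {{ environment }}
```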
When adding new environment variables, you'll need to add them to the
manifest file. If they're JSON-encoded lists, you'll need to pass them
back through the `tojson` filter, or jinja2 will print them as python
lists, with single quotes around strings.
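For example (the variable name is made up; the single quotes keep YAML
treating the rendered JSON as a string):

```
    env:
      # without tojson this renders as ['foo', 'bar'] - a python repr, not valid JSON
      ALLOWED_DOMAINS: '{{ allowed_domains | tojson }}'
```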