Celery's apply_async function accepts 'kwargs' as (get ready to be
confused) either a positional argument, or a keyword argument:
Positional: apply_async(['args'], {'kw': 'args'})
Keyword: apply_async(args=['args'], kwargs={'kw': 'args'})
We rely on the positional form in at least one place [1]. This fixes
our override of apply_async to cope with both forms, and to continue
passing any other (confusion time again) keyword args through to
super(), such as queue="queue".
Note that we've also decided to stop accepting other positional args,
since this is unnecessarily confusing, and we don't currently rely on
it in our code. This stops it creeping in in future.
[1]: fde927e00e/app/job/rest.py (L186)
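A minimal sketch of the shape of the fix (the base class name here is illustrative, not the real one in our code):
```
from celery import Task


class NotifyTask(Task):
    def apply_async(self, args=None, kwargs=None, **other_kwargs):
        # 'args' and 'kwargs' may arrive positionally or as keyword arguments;
        # any other positional argument is now a TypeError, and everything
        # else (e.g. queue="queue") is passed through to Celery as keywords
        return super().apply_async(args=args, kwargs=kwargs, **other_kwargs)
```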
We only want to send a broadcast if the broadcast message is not stubbed
and the service is live at the point at which the broadcast event should
be created. This is to prevent the situation where a broadcast service is
switched to live / trial mode in between the message being created and
approved (we log an error if this happens).
A stubbed broadcast message with a trial mode service at the point of
approval is not an issue - trial mode services can approve their own
broadcasts. In this situation, we don't create the broadcast event but
also don't need to log an error.
If we're not going to send a broadcast, we don't need to create the
BroadcastEvent in the database. The BroadcastMessage contains all the
data we need - the BroadcastEvent is not used.
Not creating the event when we won't send the broadcast (e.g. when the
broadcast message was created when the service was in trial mode) adds
an extra layer of security.
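A rough sketch of the check at approval time (none of these names are the real functions in our code; `restricted` stands in for trial mode, and the two helpers at the end are hypothetical):
```
from flask import current_app


def approve_broadcast_message(broadcast_message, service):
    # service.restricted is True when the service is in trial mode
    if broadcast_message.stubbed or service.restricted:
        if broadcast_message.stubbed != service.restricted:
            # the service was switched between live and trial mode after the
            # message was created but before it was approved
            current_app.logger.error(
                f'Broadcast message {broadcast_message.id}: stubbed flag does not match service mode'
            )
        # no BroadcastEvent row is created and nothing is sent
        return

    broadcast_event = create_broadcast_event(broadcast_message)  # hypothetical helper
    send_broadcast_event.apply_async(  # hypothetical task
        kwargs={'broadcast_event_id': str(broadcast_event.id)}
    )
```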
This change will make our development environments closer to production
even if they aren't hooked up to the CBC proxy lambda functions.
Now in development, we will create the broadcast event and create tasks
for each broadcast provider event. We will still not create actual
broadcast provider message rows in the DB, or talk to the CBC proxies.
This should be helpful in development for catching any issues we
introduce with sending broadcast messages. In time we may wish to have some
fake CBC proxies in the AWS tools account that we can interact with to
make it even more realistic.
Previously we looked at whether an environment was given AWS access keys
to decide if the `CBC_PROXY_ENABLED` setting was true. Given that all
environments (apart from development) are currently hooked up to our AWS
cell broadcast accounts, it doesn't feel too useful to have a dynamic
switch when we can just hardcode it.
On top of that, this lays the groundwork for `CBC_PROXY_ENABLED` to be
True even if an individual application doesn't have the CBC proxy AWS
access keys. In future only the broadcasts worker will have the AWS
keys, but all the other apps will still know that cell broadcasting is
indeed turned on for that environment.
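Roughly, in config terms (class names and values here are illustrative):
```
class Config:
    # hardcoded per environment rather than derived from whether the app has
    # CBC proxy AWS access keys
    CBC_PROXY_ENABLED = True


class Development(Config):
    # development isn't hooked up to our AWS cell broadcast accounts
    CBC_PROXY_ENABLED = False
```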
Previously we used a '@statsd' decorator to time and count Celery
tasks [1]. Using a decorator isn't ideal since we need to remember
to add it to every task we define. In addition, it's not possible
to include data like the task name and queue in the metric.
In order to avoid breaking existing stats, this duplicates them as
new StatsD metrics until we have sufficient data to update dashboards
using the old ones. Using the CeleryTask superclass to send metrics
avoids a future maintenance overhead, and means we can include more
useful data in the StatsD metric. Note that the new metrics will sit
in StatsD until we add a mapping for them [2].
StatsD automatically produces a 'count' stat for timing metrics, so
we don't need to increment a separate counter for successful tasks.
[1]: dea5828d0e/app/celery/tasks.py (L65)
[2]: https://github.com/alphagov/notifications-aws/blob/master/paas/statsd/statsd-mapping.yml
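A minimal sketch of the idea, assuming tasks run inside a Flask app context with an app-level StatsD client (metric names and units here are illustrative, not the real ones):
```
from time import monotonic

from celery import Task
from flask import current_app


class CeleryTask(Task):
    def __call__(self, *args, **kwargs):
        start = monotonic()
        try:
            result = super().__call__(*args, **kwargs)
        except Exception:
            current_app.statsd_client.incr(f'celery.task.{self.name}.failure')
            raise

        # StatsD derives a count from timing metrics, so no separate counter
        # is needed for successful tasks
        current_app.statsd_client.timing(f'celery.task.{self.name}.success', monotonic() - start)
        return result
```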
This is mainly so we can use it in the new metrics we send to StatsD
in the following commits, but it should also be useful in the logs.
I've taken the opportunity to make the log format consistent between
success / failure, and with our Template Preview app [1].
[1]: f456433a5a/app/celery/celery.py (L19)
config['NOTIFY_ENVIRONMENT'] is hardcoded to `'live'` in the Live config
class. The values as seen on the environment from which we send real
messages:
```
>>> json.loads(os.environ['VCAP_APPLICATION'])['space_name'] # what cloudfoundry sets
'production'
>>> os.environ['NOTIFY_ENVIRONMENT'] # we set this from cloudfoundry
'production'
>>> current_app.config['NOTIFY_ENVIRONMENT'] # hardcoded in the Live config
'live'
>>> current_app.config['NOTIFICATION_QUEUE_PREFIX'] # pulled from env var of same name
'live'
>>> current_app.config['ENV'] # this is an unrelated flask variable
'production'
```
We previously allowed MNOs to approve a broadcast themselves in training
mode and have it go out to their integration environment as per
https://github.com/alphagov/notifications-api/pull/3114
However, we want to remove this use case, as it means we have to
support configuration for training mode services to do things like pick
a channel and send out alerts, which we definitely don't want to do in
production.
By making this change, we reduce the chance of a single bug meaning an
alert will go out in prod that shouldn't.
Note, this will also make development environment testing harder, but I
think it is still worth it, as https://www.pivotaltracker.com/story/show/177584959
will make it much harder in our code to allow some environments to send
alerts whilst in training mode.
It's important to keep tabs on when these things leave our system.
Sending a zendesk ticket that triggers a P1 is probably our simplest way
of notifying the team when this happens (it's what we do with out of
hours emergencies on the admin app too). We don't have any direct
pagerduty integrations from the api app, but we already have the zendesk
client hooked up.
After broadcasts go live, we may want to change this to a P2 (but even
then, there are arguments for keeping it P1 to start with, I think).
Don't cause a P1 if it goes out on staging, as that might be MNOs testing.
Now that https://github.com/alphagov/notifications-api/pull/3184 has
been deployed for a while, the `send_delivery_status_to_service` task will
always have `template_id` and `template_version` passed in. This
means we don't need to check if those fields are there.
The new default free allowances apply from 1 April 2021.
In this PR there is a command to set annual_billing for all active
services with the new defaults.
The new method `set_default_free_allowance_for_service` will also be
called in a follow-up PR, which will set a service's free allowance to
the default if the organisation for the service is changed.
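A sketch of what the one-off command might look like (the command name, query and the signature of `set_default_free_allowance_for_service` are assumptions):
```
@notify_command(name='populate-annual-billing-with-the-new-allowances')
def populate_annual_billing_with_the_new_allowances():
    # apply the new default free allowance to every active service
    for service in Service.query.filter_by(active=True):
        set_default_free_allowance_for_service(service, year_start=2021)
```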
The letter rates for crown and non-crown are the same. It would be nice
to remove the need for the crown distinction, but for now this is a
quick fix.
This is mostly useful for letters.
For templated letters sent via the interface, whether one-offs
or CSV uploads, we do not give our users a way to set the client
reference. Still, they often have a placeholder containing a reference
that we could use to set the client_reference field.
Why is this helpful?
When a letter is returned, or when we experience printing issues, it is
often difficult to identify letters after the retention period.
This change will make it easier for some users to identify letters.
It will have more impact if we inform our users of this in the template
editing guidance.
This adds the `template_id` and `template_version` fields to the data
sent to services from the `send_delivery_status_to_service` task.
We need to account for the task not being passed these fields at first
since there might be tasks retrying which don't have that data. Once all
tasks have been called with the new fields we can then update the code
to assume they are always there.
Since we only send delivery status callbacks for SMS and emails, I've
removed the tests where we call that task with letters.
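For illustration, the callback payload could be built along these lines (the function and variable names are assumptions), until every queued task includes the new fields:
```
def build_delivery_status_payload(notification_id, reference, status,
                                  template_id=None, template_version=None):
    return {
        'id': str(notification_id),
        'reference': reference,
        'status': status,
        # tasks queued before this change may still be retrying without these,
        # so fall back to None for now
        'template_id': str(template_id) if template_id else None,
        'template_version': template_version,
    }
```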
This is not required by DVLA and, since [1], we no longer care about
the end of letter filenames when collating them, so it is safe to
remove. Note that the name of the ZIP files of collated letters
is based on a hash of the filenames, which needed updating in tests.
Before merging this we need to do a test run in Staging, so DVLA can
check that a mixture of the old / new filenames won't cause issues.
[1]: https://github.com/alphagov/notifications-api/pull/3172
We have a scheduled task that checks for jobs still in progress.
We saw a case where a scheduled job was stuck in a `pending` status as a
result of an app shutting down. This changes the `check_job_status` task
so that it also checks for scheduled jobs which are still pending after
30 minutes.
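The extra check could look something like this (the Job model and status value are assumptions, not the real code):
```
from datetime import datetime, timedelta


def find_stuck_scheduled_jobs():
    thirty_minutes_ago = datetime.utcnow() - timedelta(minutes=30)
    # scheduled jobs that should have started by now but never moved on
    # from pending
    return Job.query.filter(
        Job.job_status == 'pending',
        Job.scheduled_for < thirty_minutes_ago,
    ).all()
```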
Previously we did some unnecessary work:
- Collate task. This had one S3 request to get a summary of the object,
which was then used in another request to get the full object. We only
need the size of the object, which is included in the summary [1].
- Archive task. This had one S3 request to get a summary of the object,
which was then used to make another request to delete it. We still need
both requests, but we can remove the S3.Object in the middle.
[1]: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#objectsummary
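For example, with boto3's resource API (the bucket name and key prefix are placeholders):
```
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('letters-pdf-bucket')

# collate task: the ObjectSummary returned by a list already includes the
# size, so there's no need to fetch the full S3.Object just to read it
summary = next(iter(bucket.objects.filter(Prefix='2021-03-01/NOTIFY.REF')))
file_size = summary.size

# archive task: the summary can be deleted directly, without constructing
# an S3.Object in between
summary.delete()
```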
Previously we made a call to S3 to list objects for a letter, even
though we already had the precise key of the single object to hand.
This removes the one usage of "get_s3_bucket_objects" and uses the
filename directly in the call to remove the object.
Previously, the function would just return a presumed filename. Now that
it actually checks S3, if the file doesn't exist it'll raise an
exception. By default that's a StopIteration at the end of the bucket
iterator, which isn't ideal, as this will get suppressed if the function
is called within a generator loop further up, or similar.
There are a couple of places where we expect the file may not exist, so
we define a custom exception to rescue specifically here. I did consider
subclassing boto's ClientError, but this wasn't straightforward as the
constructor expects to know the operation that failed, which for me is a
signal that it's not an appropriate (re-)use of the class.
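A minimal sketch of the pattern (the exception name and helper are illustrative):
```
class LetterPDFNotFound(Exception):
    pass


def find_letter_pdf(bucket, prefix):
    try:
        return next(iter(bucket.objects.filter(Prefix=prefix)))
    except StopIteration:
        # raise something meaningful instead of letting StopIteration escape,
        # where it could be silently swallowed by a calling generator
        raise LetterPDFNotFound(f'No letter PDF found in S3 with prefix {prefix}') from None
```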
Previously we generated the filename we expected a letter PDF to be
stored at in S3, and used that to retrieve it. However, the generated
filename can change over the course of a notification's lifetime e.g.
if the service changes from crown ('.C.') to non-crown ('.N.').
The prefix of the filename is stable: it's based on properties of the
notification - reference and creation - that don't change. This commit
changes the way we interact with letter PDFs in S3:
- Uploading uses the original method to generate the full file name.
The method is renamed to 'generate_' to distinguish it from the new one.
- Downloading uses a new 'find_' method to get the filename using just
its prefix, which makes it agnostic to changes in the filename suffix.
Making this change helps to decouple our code from the requirements DVLA
have on the filenames. While it means more traffic to S3, we rely on S3
in any case to download the files. From experience, we know S3 is highly
reliable and performant, so don't anticipate any issues.
In the tests we favour using moto to mock S3, so that the behaviour is
realistic. There are a couple of places where we just mock the method,
since what it returns isn't important for the test.
Note that, since the new method requires a notification object, we need
to change a query in one place, the columns of which were only selected
to appease the original method to generate a filename.
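Roughly, the split looks like this (the filename format shown is illustrative, not the real DVLA spec):
```
def generate_letter_pdf_filename(notification):
    # used when uploading: builds the full filename, including parts that
    # can change later, such as the crown marker
    crown_marker = 'C' if notification.service.crown else 'N'
    return f'{notification.created_at:%Y-%m-%d}/NOTIFY.{notification.reference}.{crown_marker}.pdf'


def find_letter_pdf_filename(bucket, notification):
    # used when downloading: only relies on the stable prefix (reference and
    # creation date), so it still works if the suffix has changed
    prefix = f'{notification.created_at:%Y-%m-%d}/NOTIFY.{notification.reference}'
    return next(iter(bucket.objects.filter(Prefix=prefix))).key
```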
We will no longer send them any stats, so we don't need:
- the code to work out the nightly stats
- the performance platform client
- any configuration for the client
- any nightly tasks that kick off the sending of the stats
We will also require a change in Cronitor: as we will no longer have
this task run, we need to delete the Cronitor check.
We currently do this as part of send-daily-performance-platform-stats,
but this moves it into its own separate task. This is for two reasons:
- we will shortly get rid of the send-daily-performance-platform-stats
task as we no longer will need to send anything to performance
platform
- even if we did decide to keep the task
send-daily-performance-platform-stats and remove the specific bits
that relate to the performance platform, it's probably nicer to
rewrite the new task from scratch to make sure it's all clear and easy
to understand
Having `/invite/service/<token>` and `/invite/service/<id>` as two
separate routes (the first to validate an invite token, the second to
retrieve invite metadata) technically works. Routes are matched from
first to last until a match is found. The metadata endpoint only accepts
UUIDs, so requests with a UUID will be picked up by the correct
endpoint, while requests that don't look like a UUID will carry on
searching for an endpoint, and will find the token validation endpoint.
So while this works correctly for our normal expected input, it only
does so _because the UUID endpoint is first in the file_. This isn't
great, and it makes it harder to reason about the URLs when looking at
them.
To solve this, create the new `invite/service/check/<token>` endpoint.
For backwards compatibility, register it in parallel with the existing
route. Once the admin uses the new route, we can remove the old route
and make better guarantees about which endpoint is being hit.
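A sketch of how the new route could be registered alongside the old one (the blueprint, view and helper names are assumptions):
```
from flask import Blueprint, jsonify

service_invite_blueprint = Blueprint('service_invite', __name__)


@service_invite_blueprint.route('/invite/service/check/<token>', methods=['GET'])
@service_invite_blueprint.route('/invite/service/<token>', methods=['GET'])  # legacy route, kept until admin migrates
def validate_service_invitation_token(token):
    invited_user = get_invited_user_from_token(token)  # hypothetical helper
    return jsonify(data=invited_user.serialize()), 200
```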