Run Book
Policies and Procedures needed before and during US Notify Operations
- Alerts, Notifications, Monitoring
- Restaging Apps
- Smoke-testing the App
- Configuration Management
- Known Gotchas
- User Account Management
- SMS Phone Number Management
Alerts, Notifications, Monitoring
Operational alerts are posted to the #pb-notify-alerts Slack channel. Please join this channel and enable push notifications for all messages whenever you are on call.
New Relic is used for monitoring the application.
Cloud.gov Logging is used to view and search application and platform logs.
Restaging Apps
Our apps must be restaged whenever cloud.gov releases updates to buildpacks. Cloud.gov will send email notifications whenever buildpack updates affect a deployed app.
Restaging the apps rebuilds them with the new buildpack, enabling us to take advantage of whatever bugfixes or security updates are present in the new buildpack.
There are two GitHub Actions that automate this process. Each is run manually and must be run once per environment, so that any changes can be tested in staging before running in the demo and production environments.
When notify-api-<env>, notify-admin-<env>, egress-proxy-notify-api-<env>, and/or egress-proxy-notify-admin-<env> need to be restaged:
- Navigate to the Restage apps GitHub Action
- Click the `Run workflow` button to open a popup
- Leave `Use workflow from` on its default of `Branch: main`
- Select the environment you need to restage from the dropdown
- Click `Run workflow` within the popup
- Repeat for other environments
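For reference, the app names above follow a simple per-environment pattern. The sketch below expands that pattern and prints the `cf restage` command one would run by hand for each app. It is illustrative only: `apps_for_env` and `restage_commands` are hypothetical helper names, and whether the GitHub Action runs exactly these commands is an assumption.

```python
# App name patterns taken from this run book; {env} is filled in per environment.
APP_PATTERNS = [
    "notify-api-{env}",
    "notify-admin-{env}",
    "egress-proxy-notify-api-{env}",
    "egress-proxy-notify-admin-{env}",
]

# Environments named in this run book for the api and admin apps.
ENVIRONMENTS = ["staging", "demo", "production"]


def apps_for_env(env: str) -> list[str]:
    """Return the api/admin app names for one environment."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    return [pattern.format(env=env) for pattern in APP_PATTERNS]


def restage_commands(env: str) -> list[str]:
    """Hypothetical helper: the cf CLI command for each app, for inspection only."""
    return [f"cf restage {app}" for app in apps_for_env(env)]


if __name__ == "__main__":
    # Staging comes first so changes can be tested before demo and production.
    for env in ENVIRONMENTS:
        for command in restage_commands(env):
            print(command)
```

The script only prints commands; it does not call the cf CLI or change any app.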
When ssb-sms and/or ssb-smtp need to be restaged:
- Navigate to the SSB Restage apps GitHub Action
- Click the `Run workflow` button to open a popup
- Leave `Use workflow from` on its default of `Branch: main`
- Select the environment (either `staging` or `production`) you need to restage from the dropdown
- Click `Run workflow` within the popup
- Repeat for other environments
When ssb-devel-sms and/or ssb-devel-smtp need to be restaged:
- Navigate to the SSB Restage apps GitHub Action
- Click the `Run workflow` button to open a popup
- Leave `Use workflow from` on its default of `Branch: main`
- Select the `development` environment from the dropdown
- Click `Run workflow` within the popup
Smoke-testing the App
To verify that notifications are passing through the application properly, take the following steps:
- Send yourself a password reset email. This will verify SES integration. The email can be deleted once received if you don't wish to change your password.
- Log into the app. This will verify SNS integration for a one-off message.
- Upload a CSV and schedule the send for the soonest time after "Now". This will verify S3 connections and confirm that the scheduler and worker processes are running properly.
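For the CSV step, a one-row file addressed to your own phone is enough. The sketch below builds such a file as a string. It assumes the template being tested is an SMS template with no extra placeholders and that the recipient column is named `phone number`; both are assumptions to check against the template in use, and `smoke_test_csv` is a hypothetical helper name.

```python
import csv
import io


def smoke_test_csv(phone_number: str) -> str:
    """Return a one-recipient CSV: a header row plus a single phone number."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["phone number"])  # assumed recipient column name
    writer.writerow([phone_number])
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder number; replace with your own before uploading.
    print(smoke_test_csv("+15555550100"), end="")
```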
Configuration Management
Also known as: How to move code from my machine to production
Common Policies and Procedures
- All changes must be made in a feature branch and opened as a PR targeting the `main` branch.
- All PRs must be approved by another developer
- PRs to `main` and `production` branches must be merged by someone with the `Administrator` role.
- PR documentation includes a Security Impact Analysis
- PRs that will impact the Security Posture must be approved by the US Notify ISSO.
- Any PRs waiting for approval should be talked about during daily Standup meetings.
notifications-api & notifications-admin
- Changes are deployed to the `staging` environment after a successful `checks.yml` run on `main` branch. Branch Protections prevent pushing directly to `main`
- Changes are deployed to the `demo` and `production` environments after merging `main` into `production`. Branch Protections prevent pushing directly to `production`
usnotify-ssb
- Changes are deployed to `staging` and `production` environments after merging to the `main` branch. The `staging` deployment must be successful before `production` is attempted. Branch Protections prevent pushing directly to `main`
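The branch-to-environment rules for notifications-api, notifications-admin, and usnotify-ssb can be summarized as a lookup table. The sketch below is a reading aid, not something the deployment pipelines consume; `DEPLOY_TARGETS` and `deploy_targets` are hypothetical names, and the environments are listed in the order this run book gives them.

```python
# (repository, branch) -> environments deployed after a merge, per this run book.
DEPLOY_TARGETS = {
    ("notifications-api", "main"): ["staging"],
    ("notifications-api", "production"): ["demo", "production"],
    ("notifications-admin", "main"): ["staging"],
    ("notifications-admin", "production"): ["demo", "production"],
    # For usnotify-ssb, staging must succeed before production is attempted.
    ("usnotify-ssb", "main"): ["staging", "production"],
}


def deploy_targets(repo: str, branch: str) -> list[str]:
    """Return the environments a merge to `branch` deploys to; empty if none."""
    return list(DEPLOY_TARGETS.get((repo, branch), []))


if __name__ == "__main__":
    for (repo, branch), envs in DEPLOY_TARGETS.items():
        print(f"{repo}@{branch} -> {', '.join(envs)}")
```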
ttsnotify-brokerpak-sms
- A new release is created by pushing a tag to the repository on the `main` branch.
- To include the new version in released SSB code, create a PR in the `usnotify-ssb` repo updating the version in use in `app-setup-sms.sh`
datagov-brokerpak-smtp
- To include new versions of the SMTP brokerpak in released SSB code, create a PR in the `usnotify-ssb` repo updating the version in use in `app-setup-smtp.sh`
Vulnerability Mitigation Changes
US Notify Administrators are responsible for ensuring that remediations for vulnerabilities are implemented. Response times vary with the severity of the vulnerability, as follows:
- Critical (Very High) - 15 days
- High - 30 days
- Medium - 90 days
- Low - 180 days
- Informational - 365 days (depending on the analysis of the issue)
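To turn the response times above into a calendar date, a small helper can add the allowed number of days to the date a finding was reported. This is a minimal sketch: it assumes the clock starts on the reporting date, which this run book does not state, and `remediation_due` is a hypothetical name.

```python
from datetime import date, timedelta

# Response times in days, as listed in this run book.
REMEDIATION_DAYS = {
    "critical": 15,
    "high": 30,
    "medium": 90,
    "low": 180,
    "informational": 365,
}


def remediation_due(severity: str, reported_on: date) -> date:
    """Return the latest remediation date for a finding of the given severity."""
    days = REMEDIATION_DAYS[severity.strip().lower()]
    return reported_on + timedelta(days=days)


if __name__ == "__main__":
    # A High finding reported on 2023-03-01 is due 30 days later.
    print(remediation_due("High", date(2023, 3, 1)))
```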
Known Gotchas
SSB Service Bindings are failing
- Problem:
- Creating or deleting service keys is failing. SSB Logs reference failing to verify certificate/certificate valid for `GUID A` but not for `GUID B`
- Solution:
- Restage SSB apps using the restage apps action
SNS Topic Subscriptions Don't Succeed
- Problem:
- When deploying a new environment, a race condition prevents SNS topic subscriptions from being successfully verified on the AWS side
- Solution:
- Manually re-request subscription confirmation from the AWS Console.
User Account Management
Important policies:
- Infrastructure Accounts and Application Platform Administrators must be approved by the System Owner (Amy) before creation, but people with `Administrator` role can actually do the creation and role assignments.
- At least one agency partner must act as the `User Manager` for their service, with permissions to manage their team according to their agency's policies and procedures.
- All users must utilize `.gov` email addresses.
- Users who leave the team or otherwise have role changes must have their accounts updated to reflect the new roles required (or disabled) within 14 days.
- SpaceDeployer credentials must be rotated within 14 days of anyone with SpaceDeveloper cloud.gov access leaving the team.
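Two of these policies are easy to check mechanically: the `.gov` email requirement and the 14-day window. The sketch below shows one way to do so. Both function names are hypothetical, and counting the 14 days from the date of the departure or role change is an assumption.

```python
from datetime import date, timedelta

# Window for account updates and credential rotation, from the policies above.
POLICY_WINDOW = timedelta(days=14)


def has_gov_email(address: str) -> bool:
    """True when the address has a local part and a domain ending in .gov."""
    local, sep, domain = address.strip().rpartition("@")
    return bool(local) and sep == "@" and domain.lower().endswith(".gov")


def action_deadline(change_date: date) -> date:
    """Latest date to update accounts or rotate credentials after a change."""
    return change_date + POLICY_WINDOW


if __name__ == "__main__":
    print(has_gov_email("someone@gsa.gov"))
    print(action_deadline(date(2023, 6, 1)))
```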
Types of Infrastructure Users
| Role Name | System | Permissions | Who | Responsibilities |
|---|---|---|---|---|
| Administrator | GitHub | Admin | PBS Fed | Approve & Merge PRs into main and production |
| Administrator | AWS | NotifyAdministrators IAM UserGroup | PBS Fed | Read audit logs, verify & fix any AWS service issues within Production AWS account |
| Administrator | Cloud.gov | OrgManager | PBS Fed | Manage cloud.gov roles and permissions. Access to production spaces |
| DevOps Engineer | Cloud.gov | SpaceManager | PBS Fed or Contractor | Access to non-production spaces |
| DevOps Engineer | AWS | NotifyAdministrators IAM UserGroup | PBS Fed or Contractor | Access to non-production AWS accounts to verify & fix any AWS issues in the lower environments |
| Engineer | GitHub | Write | PBS Fed or Contractor | Write code & issues, submit PRs |
Types of Application Users
| Role Name | Permissions | Who | Responsibilities |
|---|---|---|---|
| Platform Administrator | platform_admin | PBS Fed | Administer system settings within US Notify across Services |
| User Manager | MANAGE_USERS | Agency Partner | Manage service team members |
| User | any except MANAGE_USERS | Agency Partner | Use US Notify |
Service Accounts
| Role Name | System | Permissions | Notes |
|---|---|---|---|
| Cloud.gov Service Account | Cloud.gov | OrgManager and SpaceDeveloper | Creds stored in GitHub Environment secrets within api and admin app repos |
| SSB Deployment Account | AWS | IAMFullAccess | Creds stored in GitHub Environment secrets within usnotify-ssb repo |
| SSB Cloud.gov Service Account | Cloud.gov | SpaceDeveloper | Creds stored in GitHub Environment secrets within usnotify-ssb repo |
| SSB AWS Accounts | AWS | sms_broker or smtp_broker IAM role | Creds created and maintained by usnotify-ssb terraform |
SMS Phone Number Management
See Infrastructure Overview for information about SMS phone numbers in AWS.
Once you have a number, it must be set in the app in one of two ways:
- For the default phone number, to be used by Notify itself for OTP codes and the default from number for services, set the phone number as the `AWS_US_TOLL_FREE_NUMBER` ENV variable in the environment you are creating
- For service-specific phone numbers, set the phone number in the Service's `Text message senders` in the settings tab.
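Before deploying a new environment, it can help to confirm that the default number is set and well formed. The sketch below reads `AWS_US_TOLL_FREE_NUMBER` and checks it against the `+1` plus ten digits shape used by the production numbers listed in this run book. Treating that shape as required is an assumption, and `default_from_number` is a hypothetical helper, not a function in the app.

```python
import os
import re

# +1 followed by ten digits, matching the numbers listed in this run book (assumed format).
US_NUMBER = re.compile(r"^\+1\d{10}$")


def default_from_number(environ=os.environ) -> str:
    """Return the default from number, or raise if it is missing or malformed."""
    number = environ.get("AWS_US_TOLL_FREE_NUMBER", "")
    if not US_NUMBER.match(number):
        raise ValueError(
            "AWS_US_TOLL_FREE_NUMBER is unset or not in +1XXXXXXXXXX form"
        )
    return number


if __name__ == "__main__":
    try:
        print(default_from_number())
    except ValueError as error:
        print(f"configuration problem: {error}")
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise without touching the real environment.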
Current Production Phone Numbers
- +18447952263 - in use as default number. Notify's OTP messages and trial service messages are sent from this number
- +18447891134 - to be used by Pilot Partner 1
- +18888402596 - to be used by Pilot Partner 2