
Terraform

This directory holds the Terraform modules for maintaining Notify.gov's API infrastructure.

The Admin app repo has its own terraform directory, but most of the instructions below apply to both apps.

Retrieving existing bucket credentials

📗 New developers start here!

Assuming initial setup is complete — which it should be if Notify.gov is online — Terraform state is stored in a shared remote backend. If you are going to be writing Terraform for any of our deployment environments you'll need to hook up to this backend. (You don't need to do this if you are just writing code for the development module, because it stores state locally on your laptop.)

  1. Enter the bootstrap module with cd bootstrap
  2. Run ./import.sh to import the bucket containing remote terraform state into your local state
  3. Follow instructions under Use bootstrap credentials
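
Condensed, that hookup looks something like this, starting from the terraform directory (you may also need to be logged in to cloud.gov via cf login for the scripts to work):

cd bootstrap
./import.sh            # step 2: import the remote-state bucket into your local state
./run.sh show -json    # step 3 / next section: print the bucket credentials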

Use bootstrap credentials

  1. Run ./run.sh show -json.
  2. In the output, locate access_key_id and secret_access_key within the bucket_creds resource. These values are secret, so don't share them with anyone or copy them anywhere online.
  3. Add the following to ~/.aws/credentials:
    [notify-terraform-backend]
    aws_access_key_id = <access_key_id>
    aws_secret_access_key = <secret_access_key>
    
  4. Check which AWS profile you are using with aws configure list. If needed, use export AWS_PROFILE=notify-terraform-backend to change to the profile and credentials you just added.

These credentials will allow Terraform to access the AWS/Cloud.gov bucket in which developers share Terraform state files. Now you are ready to develop Terraform using the Workflow for deployed environments.
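
For example, to switch to the new profile and confirm that the AWS CLI (and therefore Terraform) will pick it up:

export AWS_PROFILE=notify-terraform-backend
aws configure list     # the profile row should read notify-terraform-backend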

Initial setup

These instructions were used for deploying the project for the first time, years ago. We should not have to perform these steps again. They are provided here for reference.

  1. Manually run the bootstrap module following the instructions under Terraform state credentials
  2. Set up the CI/CD pipeline to run Terraform
    1. Copy bootstrap credentials to your CI/CD secrets using the instructions in the base README
    2. Create a cloud.gov SpaceDeployer by following the instructions under SpaceDeployers
    3. Copy SpaceDeployer credentials to your CI/CD secrets using the instructions in the base README
  3. Manually run Terraform
    1. Follow the instructions under Workflow for deployed environments to create your infrastructure

Terraform state credentials

The bootstrap module creates an s3 bucket in which later Terraform runs store their state. (If the bucket already exists, follow Use bootstrap credentials instead.)

Bootstrapping the state storage s3 bucket for the first time

  1. Within the bootstrap directory, run terraform init
  2. Run ./run.sh plan to verify that the changes are what you expect
  3. Run ./run.sh apply to set up the bucket
  4. Follow instructions under Use bootstrap credentials
  5. Ensure that import.sh includes a line, with the correct IDs, for any resources created
  6. Run ./teardown_creds.sh to remove the space deployer account used to create the s3 bucket
  7. Copy the bucket value from the bucket_credentials output to the backend block of staging/providers.tf and production/providers.tf
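
For reference, the backend block mentioned in step 7 looks roughly like this; the bucket, key, and region values below are placeholders, so use whatever the bucket_credentials output and the existing providers.tf files actually specify:

terraform {
  backend "s3" {
    bucket = "<bucket from the bucket_credentials output>"
    key    = "terraform.tfstate"    # per-environment key; placeholder
    region = "us-gov-west-1"        # placeholder; match the existing configuration
  }
}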

To make changes to the bootstrap module

This should not be necessary in most cases

  1. Run terraform init
  2. If you don't have terraform state locally:
    1. run ./import.sh
    2. optionally run ./run.sh apply to include the existing outputs in the state file
  3. Make your changes
  4. Continue from step 2 of the bootstrapping instructions
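
For illustration, import.sh is essentially a series of terraform import commands along these lines; the resource address and ID here are hypothetical, so check the script for the real ones:

# hypothetical example; the real resource addresses and GUIDs live in import.sh
terraform import cloudfoundry_service_instance.terraform-state <service-instance-guid>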

SpaceDeployers

A SpaceDeployer account is required to run terraform or deploy the application from the CI/CD pipeline. Create a new account by running:

./create_service_account.sh -s <SPACE_NAME> -u <ACCOUNT_NAME>

SpaceDeployers are also needed to run Terraform locally — they fill user and password input variables (via deployers within main.tf) that some of our Terraform modules require when they start running. Using a SpaceDeployer account locally is covered in the next section.
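
As a rough sketch of that wiring (the provider arguments are standard for the Cloud Foundry provider, but the exact variable and local names this repo uses may differ):

# variables filled from secrets.auto.tfvars locally, or from CI/CD secrets
variable "cf_user" {
  type = string
}

variable "cf_password" {
  type      = string
  sensitive = true
}

provider "cloudfoundry" {
  api_url  = "https://api.fr.cloud.gov"
  user     = var.cf_user
  password = var.cf_password
}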

Workflow for deployed environments

These are the steps for developing Terraform code for our deployed environment modules (sandbox, demo, staging, and production) locally on your laptop, for setting up a new deployment environment, or for running Terraform manually in any module that uses remote state. (A condensed example session appears after these steps.) You don't need to do all this to run code in the development module, because it is not a deployed environment and does not use remote state.

Caution

There is one risky step below (apply) which is safe only in the sandbox environment and should not be run in any other deployed environment.

These steps assume shared Terraform state credentials already exist in s3 and that you are using them (see Use bootstrap credentials).

  1. cd to the environment you plan to work in. When developing new features/resources, try out your code in sandbox. Only once the code is proven should you copy-and-paste it to each higher environment.

  2. Run cf spaces and, from the output, copy the space name for the environment you are working in, such as notify-sandbox.

  3. Next you will set up a SpaceDeployer service account instance. This is something like a stub user account, just for deployment. Note these two values which you will use both to create and destroy the account:

    1. <SPACE_NAME> will be the string you copied from the prior step
    2. <ACCOUNT_NAME> can be anything, although we recommend something that communicates the purpose of the deployer. For example: "circleci-deployer" for the credentials CircleCI uses to deploy the application, or "sandbox-<your_name>" for credentials to run terraform manually.

    Put those two values into this command:

    ../create_service_account.sh -s <SPACE_NAME> -u <ACCOUNT_NAME> > secrets.auto.tfvars
    

    The script will output the username (as cf_user) and password (as cf_password) for your <ACCOUNT_NAME>. The cloud.gov service account documentation has more information.

    Some resources you might work on require a SpaceDeployer account with higher permissions. Add the -m flag to the command to get this.

    The command uses the redirection operator (>) to write that output to the secrets.auto.tfvars file. Terraform will find the username and password there, and use them as input variables.

  4. While still in an environment directory, initialize Terraform:

    terraform init
    

    If this command fails, you may need to run terraform init -upgrade to make sure new module versions are picked up, or terraform init -migrate-state to migrate existing state after a change to the remote backend configuration.

  5. Then, run Terraform in a non-destructive way:

    terraform plan
    

    This will show you any pending changes that Terraform is ready to make.

    📝 Now is the time to write any HCL code you are planning to write, re-running terraform plan to confirm that the code works as you develop. Keep in mind that any changes to the codebase that you commit will be run by the CI/CD pipeline.

  6. Only if it is safe to do so, apply your changes.

    💀 Applying changes in the wrong directory can mess up a deployed environment that people are relying on

    Double-check what directory you are in, like with the pwd command. You should probably only apply while in the sandbox directory / environment.

    Once you are sure it is safe, run:

    terraform apply
    

    This command will deploy your changes to the cloud. This is a healthy part of testing your code in the sandbox, or if you are creating a new environment (a new directory). Do not apply in environments that people are relying upon.

  7. Remove the space deployer service instance when you are done manually running Terraform.

    # <SPACE_NAME> and <ACCOUNT_NAME> have the same values as used above.
    ./destroy_service_account.sh -s <SPACE_NAME> -u <ACCOUNT_NAME>
    

    Run cf services if you are unsure which SpaceDeployer service instances still exist

    Optionally, you can also rm secrets.auto.tfvars
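
Putting the steps above together, a typical manual sandbox session looks roughly like this. Values in angle brackets are placeholders, and the relative script paths assume both helper scripts sit one directory up, next to the environment directories:

cd sandbox
cf spaces                     # confirm the space name, e.g. notify-sandbox
../create_service_account.sh -s notify-sandbox -u sandbox-<your_name> > secrets.auto.tfvars
terraform init
terraform plan                # iterate on your HCL here
terraform apply               # sandbox only!
../destroy_service_account.sh -s notify-sandbox -u sandbox-<your_name>
rm secrets.auto.tfvars        # optional cleanup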

Structure

The terraform directory contains sub-directories (staging, production, etc.) named for deployment environments. Each of these is a module, which is just Terraform's word for a directory with some .tf files in it. Each module governs the infrastructure of the environment for which it is named. This directory structure forms "bulkheads" which isolate Terraform commands to a single environment, limiting accidental damage.

The development module is rather different from the other environment modules. While the other environments can be used to create (or destroy) cloud resources, the development module mostly just sets up access to pre-existing resources needed for local software development.

The bootstrap directory is not an environment module. Instead, it sets up infrastructure needed to deploy Terraform in any of the environments. If you are new to the project, this is where you should start.

Similarly, shared is not an environment. It is a module that lends code to all the environments. Please note that changes to the shared codebase will be applied to every environment the next time CI/CD (or a user) runs Terraform in that environment.
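
Schematically, each environment pulls in shared with a module block along these lines (the module label and input names here are illustrative, not the repo's exact ones):

# in <env>/main.tf; the label and inputs are illustrative
module "app" {
  source        = "../shared"
  cf_space_name = var.cf_space_name    # declared in the environment's variables.tf
}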

Warning

Editing shared code is risky because it will be applied to production

Files within these directories look like this:

- bootstrap/
  |- main.tf
  |- providers.tf
  |- variables.tf
  |- run.sh
  |- teardown_creds.sh
  |- import.sh
- <env>/
  |- main.tf
  |- providers.tf
  |- secrets.auto.tfvars
  |- variables.tf

In the environment-specific modules:

  • providers.tf lists the required providers
  • main.tf calls the shared Terraform code, but it is also the place to add any other services, resources, etc., that you would like to set up for that environment
  • variables.tf lists the variables that will be needed, either to pass through to the child module or for use in this module
  • secrets.auto.tfvars contains the service-key credentials and other secrets that should not be shared
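
For example, after running create_service_account.sh the file ends up containing something like this (the values are placeholders; keep the real ones out of version control and off of shared channels):

# produced by create_service_account.sh; values are placeholders
cf_user     = "<service account username>"
cf_password = "<service account password>"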

In the bootstrap module:

  • providers.tf lists the required providers
  • main.tf sets up the s3 bucket to be shared across all environments. It lives in prod to communicate that it should not be deleted
  • variables.tf lists the variables that will be needed. Most values are hard-coded in this module
  • run.sh Helper script to set up a space deployer and run terraform. The terraform action (show/plan/apply/destroy) is passed as an argument
  • teardown_creds.sh Helper script to remove the space deployer setup as part of run.sh
  • import.sh Helper script to create a new local state file in case terraform changes are needed

Troubleshooting

Expired token

The token expired, was revoked, or the token ID is incorrect. Please log back in to re-authenticate.

You need to re-authenticate with the Cloud Foundry CLI:

cf login -a api.fr.cloud.gov --sso

You may also need to log in again to the Cloud.gov website.

CF account not authorized

Error: You are not authorized to perform the requested action

This error indicates that the Cloud Foundry user account (or service account) needs OrgManager permissions to take the action.

  • When you create a SpaceDeployer service account, use the -m flag when running the ./create_service_account.sh script
  • Your own CF user may also require OrgManager permissions to run the script

Services limit

You have exceeded your organization's services limit.

Too many Cloud Foundry services have been created without being destroyed. Perhaps Terraform developers have forgotten to delete their SpaceDeployers after they finish with them. Run cf services to see which service instances still exist.

Unknown error

Error: Service Instance xx-name-xx failed xx-UUID-xx, reason: [Job (xx-UUID-xx) failed: An unknown error occurred.]

This unhelpful message may be clarified by looking in the Cloud.gov web UI. Among the list of service instances (Cloud Foundry → Organizations → gsa-tts-benefits-studio → Spaces → your-space-name → Service instances) check for pending or erroring items. Refer below if you discover a domain identity verification error.

The audit event logs may also provide insight. They are visible in the web UI or in the terminal.
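
For example, one way to pull recent audit events from the terminal is through the v3 API via cf curl (the query parameters here are just an example):

cf curl "/v3/audit_events?per_page=10"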

Domain identity verification

Error: Error creating SES domain identity verification: Expected domain verification Success, but was in state Pending

This error comes via the Supplementary Service Broker and originates from the SMTP Brokerpak it uses. You can run the broker provisioning locally to tinker with the error.