Building Your Serverless Sandbox: A Detailed Guide to Multi-Environment Deployments (or How I Learned to Stop Worrying and Love the Cloud)

Introduction

Welcome, intrepid serverless adventurers! In the wild world of cloud computing, creating a robust, multi-environment deployment pipeline is crucial for maintaining code quality and ensuring smooth transitions from development to production.
Here are part 1 and part 2 of this series. Feel free to read them before continuing.

This guide will walk you through the process of setting up a serverless sandbox using GitLab CI/CD, Terraform, and AWS. By the end, you’ll have a flexible, scalable system that allows developers to work independently while maintaining a clear path to production.

Here is a diagram of what we are building. The first two posts in this series offer a refresher on the overall design.

Prerequisites

Before we dive in, make sure you have:

An AWS account with appropriate permissions
A GitLab account (self-hosted or GitLab.com)
Terraform (version 0.14 or later) installed locally
Git installed locally (because time travel is essential in software development)
Basic familiarity with YAML, Terraform, and AWS services (or a high tolerance for learning curves)
A sense of humor (trust me, you’ll need it)

Setting up OIDC Integration between GitLab and AWS IAM

Before we set up our GitLab repository, we need to configure the OIDC integration between GitLab and AWS IAM. If you feel like reading the details, here is the documentation.

Create an Identity Provider in AWS IAM

Log in to your AWS Management Console and navigate to the IAM service.
In the left sidebar, click on “Identity providers” under “Access management”.
Click the “Add provider” button.
Select “OpenID Connect” as the provider type.
For the provider URL, enter: https://gitlab.com
For the Audience, enter: https://gitlab.com
Click “Get thumbprint” to retrieve the server certificate thumbprint.
Click “Add provider” to create the Identity Provider.

Create an IAM Role for GitLab

Log in to your AWS Management Console.
Navigate to the IAM service:
- Click on “Services” at the top of the page.
- Under “Security, Identity, & Compliance”, click on “IAM”.
In the left sidebar of the IAM dashboard, click on “Roles”.
Click the “Create role” button.
Under “Trusted entity type”, select “Web identity”.
In the “Web identity” section:
- For “Identity provider”, select the GitLab provider you created earlier (it should be listed as “gitlab.com”).
- For “Audience”, select “https://gitlab.com” from the dropdown.
Click “Next” to proceed to the permissions page.
In the “Add permissions” page:
- Use the search bar to find and select the policies needed for your Terraform operations. Common policies might include:
  - AmazonS3FullAccess
  - AWSLambda_FullAccess
  - AmazonDynamoDBFullAccess
  - AmazonAPIGatewayAdministrator
  - CloudWatchLogsFullAccess
- Select any other policies required for your specific infrastructure needs.
After selecting all necessary policies, click “Next”.
On the “Name, review, and create” page:
- Enter a role name (e.g., “GitLabOIDCRole”).
- (Optional) Enter a description for the role.
- Review the trusted entities and permissions to ensure they’re correct.
Click “Create role” at the bottom of the page.
You’ll be redirected to the Roles page. Find and click on the role you just created.
On the role’s summary page, note the “Role ARN” at the top. You’ll need this ARN when configuring your GitLab CI/CD variables.

Remember, the exact permissions you attach to this role should align with the principle of least privilege. Only grant the permissions necessary for your specific Terraform operations and AWS resource management needs.
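
The console wizard generates the role’s trust policy for you. As a rough sketch of what a tighter version could look like, the snippet below prints a trust policy that also limits which GitLab project may assume the role. The account ID 123456789012, the project path my-group/my-project, and the render_trust_policy helper are placeholders invented for illustration; the gitlab.com:aud and gitlab.com:sub condition keys follow the claim format GitLab documents for its OIDC tokens.

```shell
#!/bin/sh
# Hypothetical helper: prints an IAM trust policy for the GitLab OIDC provider.
# $1 = AWS account ID, $2 = GitLab project path (group/project).
render_trust_policy() {
  account_id=$1
  project_path=$2
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${account_id}:oidc-provider/gitlab.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": { "gitlab.com:aud": "https://gitlab.com" },
        "StringLike": { "gitlab.com:sub": "project_path:${project_path}:ref_type:branch:ref:*" }
      }
    }
  ]
}
EOF
}

# Placeholder values; substitute your own account ID and project path.
render_trust_policy 123456789012 my-group/my-project
```

The printed JSON can be pasted into the role’s “Trust relationships” tab. The wildcard at the end of the sub condition allows every branch of the project, which suits a setup where sandboxes deploy from unprotected branches.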

Setting Up Your GitLab Repository

Feel free to fork my repo. You will get better training value if you do everything from scratch though.

Log into your GitLab account and create a new project. Try to resist the urge to name it “yet-another-project-that-will-change-the-world”. Clone the repository to your local machine:
git clone https://gitlab.com/your-username/your-project.git
cd your-project
In the project settings, navigate to Settings > CI/CD > Variables.
Add the following variable:
AWS_GITLAB_ROLE_ARN: The ARN of the IAM role that GitLab will assume. Mask the variable, but do not mark it as protected. Protected variables are only exposed to pipelines running on protected branches or tags, so a deployment from an unprotected branch would fail because the job could not read the ARN.

Crafting Your Terraform Code

Create the following directory structure in your project. Or feel free to fork my GitLab repo if you want to make it quick.

.
├── .gitignore               # Where we hide our secrets and shame
├── .gitlab-ci.yml           # The conductor of our CI/CD orchestra
├── .terraform.lock.hcl      # Terraform's version of a bouncer
├── main.tf                  # The star of our Terraform show
├── resources.tf             # Where we define our cloud real estate
├── variables.tf             # Because hardcoding is so last century
└── envs/
    ├── common.tfvars        # Shared variables, like that one pizza everyone agrees on
    ├── pre-prod-hotfix.tfvars       # For when prod is on fire, but we're still cautious
    ├── pre-prod-minor-release.tfvars # For those "it's not a bug, it's a feature" moments
    ├── pre-prod.tfvars              # Almost ready for the big leagues
    ├── production.tfvars            # Where the magic happens (and the stress levels peak)
    ├── sandbox-a.tfvars             # A's sandbox. No B's allowed!
    ├── sandbox-b.tfvars             # B's revenge
    ├── sandbox-c.tfvars             # C what we did there?
    ├── sandbox-d.tfvars             # D-lightful
    ├── sandbox-hotfix.tfvars        # For those "oops" moments
    └── sandbox-minor-release.tfvars # For when you want to release, but not too much

In main.tf, define your AWS provider. For testing, we will simply create an S3 bucket in resources.tf. Feel free to declare additional Terraform resources, but the IAM role you created above must have permission to create every resource you declare; otherwise the apply will fail.
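
For readers building from scratch, here is a minimal sketch of those files, written out by a small shell script so it can be run as-is. Everything in it is an assumption rather than the contents of the original repository: the scaffold_terraform helper, the aws_region and environment variable names, the us-east-1 region, and the bucket name prefix are all placeholders.

```shell
#!/bin/sh
# Hypothetical scaffold: writes a minimal Terraform layout into the directory given as $1.
scaffold_terraform() {
  target_dir=$1
  mkdir -p "${target_dir}/envs"

  # main.tf: AWS provider plus an empty http backend. The pipeline supplies the
  # backend address and credentials through the TF_HTTP_* environment variables.
  cat > "${target_dir}/main.tf" <<'EOF'
terraform {
  required_version = ">= 0.14"
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
  backend "http" {}
}

provider "aws" {
  region = var.aws_region
}
EOF

  # variables.tf: the two inputs used by this sketch.
  cat > "${target_dir}/variables.tf" <<'EOF'
variable "aws_region" {
  type = string
}

variable "environment" {
  type = string
}
EOF

  # resources.tf: one S3 bucket whose name includes the environment.
  cat > "${target_dir}/resources.tf" <<'EOF'
resource "aws_s3_bucket" "sandbox" {
  bucket = "my-serverless-demo-${var.environment}"
}
EOF

  # Shared and per-environment values.
  echo 'aws_region = "us-east-1"' > "${target_dir}/envs/common.tfvars"
  echo 'environment = "sandbox-a"' > "${target_dir}/envs/sandbox-a.tfvars"
}

# Write the files to a throwaway directory and list them.
demo_dir=$(mktemp -d)
scaffold_terraform "${demo_dir}"
ls "${demo_dir}" "${demo_dir}/envs"
```

S3 bucket names are globally unique, so the placeholder prefix needs to be changed before a real apply.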

Configuring the GitLab CI/CD Pipeline

Understanding Pipeline Variables

At the top of your .gitlab-ci.yml file, you’ll find several variables defined.

variables:
  TF_ROOT: ${CI_PROJECT_DIR}
  TF_ADDRESS: ${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/terraform/state/${CI_ENVIRONMENT_NAME}
  TF_IN_AUTOMATION: "true"
  TF_VAR_CI_JOB_ID: ${CI_JOB_ID}
  TF_VAR_CI_COMMIT_SHA: ${CI_COMMIT_SHA}
  TF_VAR_CI_JOB_STAGE: ${CI_JOB_STAGE}
  TF_VAR_CI_PROJECT_ID: ${CI_PROJECT_ID}
  TF_VAR_CI_PROJECT_NAME: ${CI_PROJECT_NAME}
  TF_VAR_CI_PROJECT_NAMESPACE: ${CI_PROJECT_NAMESPACE}
  TF_VAR_CI_PROJECT_PATH: ${CI_PROJECT_PATH}
  TF_VAR_CI_PROJECT_URL: ${CI_PROJECT_URL}
  TF_HTTP_ADDRESS: ${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/terraform/state/${CI_ENVIRONMENT_NAME}
  TF_HTTP_USERNAME: "gitlab-ci-token"
  TF_HTTP_PASSWORD: ${CI_JOB_TOKEN}

These variables serve various purposes:

TF_ROOT: Defines the root directory for Terraform operations.
TF_ADDRESS and TF_HTTP_ADDRESS: Specify the address of the Terraform HTTP backend, which stores the state files. In our case, that is the GitLab-managed Terraform state. Terraform’s http backend reads TF_HTTP_ADDRESS.
TF_IN_AUTOMATION: Tells Terraform it’s running in an automated environment, so it leaves out hints about which command to run next.
TF_VAR_*: These variables are passed to Terraform as input variables, allowing you to use GitLab CI/CD information in your Terraform configurations. For example, TF_VAR_CI_JOB_ID populates the Terraform variable CI_JOB_ID.
TF_HTTP_USERNAME and TF_HTTP_PASSWORD: Provide authentication for the Terraform HTTP backend.
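
To make the TF_VAR_ convention concrete, the sketch below lists which Terraform variable names a shell’s environment would feed. The list_tf_var_names helper and the example values are made up for illustration; in a real job, GitLab sets the values.

```shell
#!/bin/sh
# Hypothetical helper: print the Terraform variable names implied by
# TF_VAR_* environment variables (the part after the prefix).
list_tf_var_names() {
  env | sed -n 's/^TF_VAR_\([^=]*\)=.*/\1/p' | sort
}

# Example values standing in for what GitLab would inject in a job.
export TF_VAR_CI_JOB_ID="1001"
export TF_VAR_CI_PROJECT_NAME="demo"

list_tf_var_names
```

Terraform matches the name after the prefix exactly, including case, so the configuration needs a matching declaration such as variable "CI_JOB_ID". An environment variable with no matching declaration is ignored.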

Cache and Before Script Blocks

# Cache Terraform files to speed up subsequent runs
cache:
  key: tf-pipeline-$CI_COMMIT_REF_SLUG
  paths:
    - ${TF_ROOT}/.terraform
    - ${TF_ROOT}/.terraform.d/plugin-cache

# Set up the working directory and install necessary tools
before_script:
  - cd ${TF_ROOT}
  - apk add --no-cache curl jq

The cache block speeds up subsequent pipeline runs by caching the .terraform directory, which contains downloaded providers and modules. The before_script block runs at the start of every job: it changes to the Terraform root directory and installs curl and jq with apk, which assumes an Alpine-based job image.
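
The cache block also lists a .terraform.d/plugin-cache path. Terraform only writes provider plugins there if it is told to, and the pipeline shown here does not include that step, so the following is an assumption about how it could be wired up, for example as extra before_script lines.

```shell
#!/bin/sh
# TF_ROOT is set by the pipeline; fall back to the current directory for local runs.
TF_ROOT=${TF_ROOT:-$(pwd)}

# Terraform uses a plugin cache only when TF_PLUGIN_CACHE_DIR (or the CLI
# config file) points at one, and it does not create the directory itself.
export TF_PLUGIN_CACHE_DIR="${TF_ROOT}/.terraform.d/plugin-cache"
mkdir -p "${TF_PLUGIN_CACHE_DIR}"

echo "Provider plugins will be cached in ${TF_PLUGIN_CACHE_DIR}"
```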

Pipeline Stages

# Define the stages of the pipeline
stages:
  - prepare
  - build
  - deploy
  - promote
  - destroy

These stages are like the circle of life for your infrastructure: from nothing, to something, to something better, and then back to nothing. It’s poetic, really. They also define the order of operations in your pipeline:

prepare: Initial setup and validation
build: Creating the Terraform plan
deploy: Applying the Terraform plan
promote: Moving changes between environments
destroy: Tearing down the infrastructure (usually manual)

Reusable Templates

The pipeline uses several reusable templates:

setup_aws_config: Configures AWS credentials using OIDC.
validate_script: Runs terraform validate to check for configuration errors.
plan_script: Creates a Terraform plan.
apply_script: Applies the Terraform plan.
destroy_script: Destroys the Terraform-managed infrastructure.
promotion_script: Handles promoting changes between environments.

These templates encapsulate common operations, making the pipeline more maintainable and reducing duplication. Read the optional section at the end for details of these scripts.

YAML Anchors and Aliases

YAML anchors and aliases are used to avoid repetition in the pipeline file. For example:

.setup_aws_config: &setup_aws_config
  # Configuration here

job_name:
  script:
    - *setup_aws_config
    # Other script steps

The &setup_aws_config defines an anchor, and *setup_aws_config references it, allowing you to reuse the configuration without duplication.

Using ‘extends’ for Job Templates

The extends keyword is used to inherit configuration from job templates:

.sandbox_plan:
  # Template configuration

sandbox-a-plan:
  extends: .sandbox_plan
  # Additional or overridden configuration

This allows you to define common job configurations once and reuse them across multiple jobs.

ID Token Usage for IAM Role Credentials

id_tokens:
  GITLAB_OIDC_TOKEN:
    aud: https://gitlab.com

This configuration requests an OIDC token from GitLab, which is then used to assume an AWS IAM role. This provides secure, temporary AWS credentials without needing to store long-lived access keys.
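
When AWS rejects the token, the usual cause is a mismatch between the token’s claims and the role’s trust policy. The sketch below decodes the token payload so the aud and sub claims can be compared against the policy. The decode_jwt_payload helper is hypothetical, and the snippet only decodes; it does not verify the signature.

```shell
#!/bin/sh
# Hypothetical helper: print the JSON payload of a JWT passed as $1.
decode_jwt_payload() {
  # The payload is the second dot-separated segment, base64url-encoded.
  payload=$(printf '%s' "$1" | cut -d '.' -f 2 | tr '_-' '/+')
  # Restore the '=' padding that base64url encoding strips.
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "${payload}" | base64 -d
}

# Inside a job that defines the GITLAB_OIDC_TOKEN id_token, show the two
# claims an AWS trust policy typically checks. Skipped when the token is unset.
if [ -n "${GITLAB_OIDC_TOKEN:-}" ]; then
  decode_jwt_payload "${GITLAB_OIDC_TOKEN}" | jq '{aud, sub}'
fi
```

Treat this as a debugging aid and remove it once the role assumption works, since job logs are not a great place for token details.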

Rules for Conditional Job Execution

Rules are used to control when jobs run:

rules:
  - if: $CI_COMMIT_BRANCH != "main"

This example rule ensures the job only runs for branches other than “main”.

Artifacts Passing from Plan to Apply Jobs

artifacts:
  name: plan
  paths:
    - ${TF_ROOT}/${CI_ENVIRONMENT_NAME}

This configuration saves the Terraform plan as an artifact, which can then be used by the apply job. This ensures that what’s applied is exactly what was planned.

Environments, Deployment Tiers, and Actions

In our GitLab CI/CD pipeline, we use environments, deployment tiers, and actions to organize and manage our deployments effectively. These elements work together to provide a clear structure for our multi-environment deployment strategy.

Environments

Environments in GitLab represent different stages in your software development lifecycle. A job is tied to an environment with the environment keyword:

environment:
  name: sandbox-a
  deployment_tier: development
  action: start

Our pipeline defines the following environments:

sandbox-a, sandbox-b, sandbox-c, sandbox-d: Individual developer environments
sandbox-hotfix, sandbox-minor-release: Special sandboxes for hotfixes and minor releases
pre-prod, pre-prod-hotfix, pre-prod-minor-release: Pre-production environments
production: The live production environment

Each environment is isolated, allowing developers to work independently without affecting others or the production system.

Deployment Tiers

Deployment tiers categorize your environments based on their purpose in the development lifecycle. In our pipeline, we use three tiers:

development: For sandbox environments where initial development and testing occur
testing: For pre-production environments where integration testing and final checks are performed
production: For the live production environment

Using deployment tiers helps in visualizing the progression of changes through your pipeline and can be used to apply different rules or approvals based on the tier.

Actions

Actions define what operation is being performed on the environment. Common actions include:

start: Indicates that the job is deploying to or starting up the environment
stop: Used when tearing down or stopping an environment
prepare: For jobs that set up an environment but don’t deploy to it
verify: For jobs that run tests or checks on an environment

In our pipeline, we primarily use the start action as we’re deploying to our environments:

environment:
  name: sandbox-a
  deployment_tier: development
  action: start

How They Work Together in the Pipeline

Sandbox Deployments: When a developer pushes code to a branch, it triggers a deployment to their sandbox environment.
Promotion to Pre-production: When code is ready for further testing, it’s promoted to a pre-production environment.
Production Deployment: Finally, when code is fully tested and ready for release, it’s deployed to production.

By using these elements, our pipeline provides:

Clear visualization of where code is deployed
Isolated environments for development and testing
A structured promotion process from development to production
Ability to apply different rules and approvals based on environment and tier
Easy tracking of deployment history for each environment

Promotion Paths and Script

The promotion script is used to promote changes from a lower environment to a higher one, ensuring that exactly what was deployed in the lower environment is replicated in the higher one.

There are three main promotion paths:

sandbox-a -> pre-prod -> production
sandbox-hotfix -> pre-prod-hotfix -> production
sandbox-minor-release -> pre-prod-minor-release -> production

Each path serves a different purpose:

The first is for regular feature development and releases.
The second is for urgent fixes that need to bypass the regular release cycle.
The third is for planned minor releases that may include multiple features or fixes.
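
One way to keep these paths in a single place is a small lookup function that a promotion job could call to find its target. This is only a sketch: the next_environment helper is hypothetical, and it uses the environment names from the envs/ directory, where the final environment is called production.

```shell
#!/bin/sh
# Hypothetical helper: print the environment that $1 promotes to.
# Returns non-zero when there is no promotion path from $1.
next_environment() {
  case "$1" in
    sandbox-a)              echo "pre-prod" ;;
    sandbox-hotfix)         echo "pre-prod-hotfix" ;;
    sandbox-minor-release)  echo "pre-prod-minor-release" ;;
    pre-prod|pre-prod-hotfix|pre-prod-minor-release)
                            echo "production" ;;
    *)
      echo "no promotion path from '$1'" >&2
      return 1
      ;;
  esac
}

# Walk the regular release path from the sandbox up to production.
current="sandbox-a"
while target=$(next_environment "${current}" 2>/dev/null); do
  echo "${current} -> ${target}"
  current="${target}"
done
```

Only sandbox-a appears as a starting point here because that is the path the guide lists; the other developer sandboxes would need their own entries if they promote too.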

Managing Multiple Terraform States

GitLab’s HTTP backend is used to manage multiple Terraform states. The state file name is based on the environment name:

TF_ADDRESS: ${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/terraform/state/${CI_ENVIRONMENT_NAME}
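
As a quick illustration of how one address per environment falls out of that pattern, the sketch below expands it for a few environment names. The state_address helper and the project ID 12345 are placeholders; in a job, GitLab supplies CI_API_V4_URL and CI_PROJECT_ID.

```shell
#!/bin/sh
# Hypothetical helper: build the state URL from API base ($1),
# project ID ($2), and environment name ($3).
state_address() {
  printf '%s/projects/%s/terraform/state/%s\n' "$1" "$2" "$3"
}

# Each environment name yields its own state, so sandboxes never share state
# with each other or with production.
for env_name in sandbox-a pre-prod production; do
  state_address "https://gitlab.com/api/v4" "12345" "${env_name}"
done
```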

Testing Your Setup

To test your setup:

Make changes to your Terraform configuration.
Commit and push your changes to GitLab.
Observe the pipeline execution in the GitLab CI/CD interface.
Verify that resources are created in AWS as expected.
Test the promotion process by manually triggering the promotion jobs.

Best Practices and Tips

Use clear, consistent naming conventions for all resources, and interpolate the environment name into each resource name so that multiple environments deployed to the same AWS account do not run into naming conflicts.
Implement proper IAM policies to restrict access between environments.
Use GitLab’s environment-scoped variables for sensitive information.
Regularly clean up unused resources to avoid unnecessary costs.

Conclusion

This guide has walked you through setting up a sophisticated multi-environment deployment pipeline for serverless applications using GitLab CI/CD, Terraform, and AWS. By leveraging GitLab environments, Terraform state management, and AWS OIDC integration, we’ve created a system that allows for isolated development while maintaining a clear path to production.

This setup provides great flexibility for development teams, supporting various workflows including feature development, hotfixes, and minor releases. However, it’s important to maintain discipline in your development and deployment processes to fully benefit from this setup.

Remember, while this guide provides a solid foundation, you may need to adjust and expand upon it to fit your specific needs and workflows. Happy serverless coding!

Optional Deep Dive into Reusable Templates

For those who want to understand the inner workings of our pipeline, let’s break down each of our reusable templates and explain how their commands function.

setup_aws_config: Configuring AWS Credentials Using OIDC

.setup_aws_config: &setup_aws_config |
  mkdir -p ~/.aws
  echo "${GITLAB_OIDC_TOKEN}" > /tmp/gitlab-oidc-token
  echo "[profile web-identity]" > ~/.aws/config
  echo "role_arn = ${AWS_GITLAB_ROLE_ARN}" >> ~/.aws/config
  echo "web_identity_token_file = /tmp/gitlab-oidc-token" >> ~/.aws/config
  echo "role_session_name = GitLabOIDCSession-${CI_PROJECT_ID}-${CI_PIPELINE_ID}" >> ~/.aws/config
  export AWS_PROFILE="web-identity"

mkdir -p ~/.aws: Creates the AWS configuration directory if it doesn’t exist.
echo "${GITLAB_OIDC_TOKEN}" > /tmp/gitlab-oidc-token: Saves the GitLab-provided OIDC token to a temporary file.
The next four echo commands create an AWS config file with a web-identity profile:
- Declares the profile
- Specifies the IAM role to assume
- Points to the OIDC token file
- Sets a session name that identifies the project and pipeline
export AWS_PROFILE="web-identity": Tells the AWS CLI and SDK-based tools, including Terraform’s AWS provider, to use this profile.

This setup allows the pipeline to authenticate with AWS using the OIDC token, enhancing security by avoiding long-lived access keys.
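
To see what the generated file looks like without touching a real ~/.aws directory, the sketch below writes the same four lines into a throwaway directory and prints the result. The write_web_identity_profile helper, the role ARN, and the project and pipeline IDs are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: write the web-identity profile into directory $1
# instead of ~/.aws, using the same lines as the pipeline template.
write_web_identity_profile() {
  aws_dir=$1
  mkdir -p "${aws_dir}"
  {
    echo "[profile web-identity]"
    echo "role_arn = ${AWS_GITLAB_ROLE_ARN}"
    echo "web_identity_token_file = /tmp/gitlab-oidc-token"
    echo "role_session_name = GitLabOIDCSession-${CI_PROJECT_ID}-${CI_PIPELINE_ID}"
  } > "${aws_dir}/config"
}

# Placeholder values; in a job these come from GitLab.
AWS_GITLAB_ROLE_ARN="arn:aws:iam::123456789012:role/GitLabOIDCRole"
CI_PROJECT_ID="12345"
CI_PIPELINE_ID="67890"

demo_aws_dir=$(mktemp -d)
write_web_identity_profile "${demo_aws_dir}"
cat "${demo_aws_dir}/config"
```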

validate_script: Running Terraform Validate

.validate_script:
  script:
    - |
      terraform_validate() {
        local chdir_opt=""
        if [ -n "${TF_ROOT}" ]; then
          chdir_opt="-chdir=${TF_ROOT}"
        fi
        # -backend=false skips state backend setup; validate only needs providers and modules
        terraform ${chdir_opt} init -backend=false -input=false
        terraform ${chdir_opt} validate
      }
    - terraform_validate

Defines a terraform_validate function.
If TF_ROOT is set, it prepares a -chdir option to run Terraform in a specific directory.
Runs terraform init with -backend=false to initialize Terraform without configuring a backend.
Runs terraform validate to check the configuration for errors.

This script ensures that your Terraform configurations are syntactically correct and internally consistent.

plan_script: Creating a Terraform Plan

.plan_script: &plan_script
  script:
    - *setup_aws_config
    - |
      terraform_plan() {
        local chdir_opt=""
        if [ -n "${TF_ROOT}" ]; then
          chdir_opt="-chdir=${TF_ROOT}"
        fi
        local plan_out="${CI_ENVIRONMENT_NAME}"
        if [ "${TF_IMPLICIT_INIT:-true}" = "true" ]; then
          terraform ${chdir_opt} init -input=false ${TF_INIT_FLAGS}
        fi
        terraform ${chdir_opt} plan -input=false -out="${plan_out}" -var-file envs/${CI_ENVIRONMENT_NAME}.tfvars -var-file envs/common.tfvars
      }
    - terraform_plan

Calls setup_aws_config to set up AWS credentials.
Defines a terraform_plan function.
Sets up the -chdir option if TF_ROOT is defined.
If TF_IMPLICIT_INIT is true, runs terraform init.
Runs terraform plan, outputting the plan to a file named after the current environment.
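Uses environment-specific and common variable files. Later -var-file arguments take precedence, so a value in common.tfvars overrides the same variable set in the environment file.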

This script creates a Terraform plan, showing what changes would be made to your infrastructure.
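
Because the plan depends on envs/${CI_ENVIRONMENT_NAME}.tfvars existing, a typo in an environment name only shows up when Terraform fails to find the file. A pre-flight check along these lines could give a clearer message. It is not part of the original pipeline, and the require_var_files helper is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: confirm the variable files for environment $1 exist
# under the Terraform root $2 (defaults to the current directory).
require_var_files() {
  env_name=$1
  root_dir=${2:-.}
  missing=0
  for var_file in "envs/${env_name}.tfvars" "envs/common.tfvars"; do
    if [ ! -f "${root_dir}/${var_file}" ]; then
      echo "Missing variable file: ${var_file}" >&2
      missing=1
    fi
  done
  return "${missing}"
}

# In a job, CI_ENVIRONMENT_NAME and TF_ROOT are set by GitLab; the fallbacks
# below only exist so the snippet can be tried locally.
if require_var_files "${CI_ENVIRONMENT_NAME:-sandbox-a}" "${TF_ROOT:-.}"; then
  echo "Variable files found"
else
  echo "Plan would fail: add the missing files first"
fi
```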

apply_script: Applying the Terraform Plan

.apply_script: &apply_script
  script:
    - *setup_aws_config
    - |
      terraform_apply() {
        local chdir_opt=""
        if [ -n "${TF_ROOT}" ]; then
          chdir_opt="-chdir=${TF_ROOT}"
        fi
        local plan_in="${CI_ENVIRONMENT_NAME}"
        if [ "${TF_IMPLICIT_INIT:-true}" = "true" ]; then
          terraform ${chdir_opt} init -input=false ${TF_INIT_FLAGS}
        fi
        terraform ${chdir_opt} apply -input=false -auto-approve "${plan_in}"
      }
    - terraform_apply

Calls setup_aws_config to set up AWS credentials.
Defines a terraform_apply function.
Sets up the -chdir option if TF_ROOT is defined.
If TF_IMPLICIT_INIT is true, runs terraform init.
Runs terraform apply using the plan file created in the plan stage.

This script applies the Terraform plan, making the specified changes to your infrastructure.

destroy_script: Destroying Terraform-Managed Infrastructure

.destroy_script: &destroy_script
  script:
    - *setup_aws_config
    - |
      terraform_destroy() {
        local chdir_opt=""
        if [ -n "${TF_ROOT}" ]; then
          chdir_opt="-chdir=${TF_ROOT}"
        fi
        if [ "${TF_IMPLICIT_INIT:-true}" = "true" ]; then
          terraform ${chdir_opt} init -input=false ${TF_INIT_FLAGS}
        fi
        terraform ${chdir_opt} destroy -input=false -auto-approve -var-file envs/${CI_ENVIRONMENT_NAME}.tfvars -var-file envs/common.tfvars
      }
    - terraform_destroy

Calls setup_aws_config to set up AWS credentials.
Defines a terraform_destroy function.
Sets up the -chdir option if TF_ROOT is defined.
If TF_IMPLICIT_INIT is true, runs terraform init.
Runs terraform destroy, using environment-specific and common variable files.

This script destroys all Terraform-managed infrastructure in the specified environment.

promotion_script: Promoting Changes Between Environments

.promotion_script: &promotion_script |
  # Function to apply configuration to target environment
  apply_to_target() {
    local source_env=$1
    local target_env=$2
    
    echo "Promoting ${source_env} to ${target_env}"

    # Create the directory if it doesn't exist
    mkdir -p /tmp/source_repo

    # Extract the artifact from the source environment
    tar -xzf repo-${source_env}.tar.gz -C /tmp/source_repo

    # Change to the extracted directory
    cd /tmp/source_repo

    # Point the HTTP backend at the target environment's state before initializing.
    # Terraform's http backend reads TF_HTTP_ADDRESS, so export it alongside TF_ADDRESS.
    echo "target environment is ${target_env}"
    export TF_ADDRESS="${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/terraform/state/${target_env}"
    export TF_HTTP_ADDRESS="${TF_ADDRESS}"

    # Initialize Terraform
    terraform init

    # Apply the configuration to the target environment; fail the job if it does not succeed
    if ! terraform apply -auto-approve -var-file envs/${target_env}.tfvars -var-file envs/common.tfvars; then
      echo "Promotion failed. Please check the logs and try again."
      exit 1
    fi
  }

Defines an apply_to_target function that takes source and target environments as parameters.
Creates a temporary directory and extracts the source environment’s artifact into it.
Changes to the extracted directory.
Exports the Terraform state address for the target environment, then initializes Terraform against it.
Applies the Terraform configuration to the target environment, using the target’s variable files.
Fails the job if the apply does not succeed.

This script handles the promotion of changes from one environment to another, ensuring that the exact configuration from the source environment is applied to the target environment.
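
The function expects an archive named repo-${source_env}.tar.gz, but the step that produces it is not shown in this guide. One plausible way to create it, offered purely as an assumption, is to package the repository after a successful deploy and publish the archive as a job artifact. The package_repo helper below is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: archive the Terraform code in $2 as
# repo-<environment>.tar.gz inside $3, leaving out local Terraform and Git data.
package_repo() {
  env_name=$1
  src_dir=$2
  out_dir=$3
  tar -czf "${out_dir}/repo-${env_name}.tar.gz" \
    --exclude='.terraform' --exclude='.git' \
    -C "${src_dir}" .
}

# Round trip with throwaway directories: package a tiny fake repository,
# then list what a promotion job would extract.
demo_src=$(mktemp -d)
demo_out=$(mktemp -d)
mkdir -p "${demo_src}/envs" "${demo_src}/.terraform"
echo '# placeholder' > "${demo_src}/main.tf"
echo 'environment = "pre-prod"' > "${demo_src}/envs/pre-prod.tfvars"
echo 'local data' > "${demo_src}/.terraform/leftover"

package_repo sandbox-a "${demo_src}" "${demo_out}"
tar -tzf "${demo_out}/repo-sandbox-a.tar.gz"
```

A promotion job that downloads this artifact would then call apply_to_target sandbox-a pre-prod. Writing the archive outside the directory being packaged avoids tar trying to include its own output.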

These reusable templates form the backbone of our CI/CD pipeline, handling everything from AWS authentication to infrastructure deployment and promotion between environments. By understanding these scripts, you can better customize and troubleshoot your pipeline as needed.

Code is Life