Infrastructure as Code with Terraform on AWS: Patterns for Teams

Module structure, remote state management, workspace strategy, and CI/CD integration patterns that scale from a 3-person startup to a 50-engineer platform team.

9 min read · May 20, 2026 · Netvionix Team
Why Terraform Discipline Matters

Terraform is straightforward when one person manages one environment. It becomes a coordination problem — and a source of outages — when five engineers are managing eight environments across three AWS accounts.

These patterns come from hard lessons on teams where "just run terraform apply" caused production incidents.


Project Structure

The most important decision: one state file per logical unit of isolation.

infrastructure/
├── modules/
│   ├── networking/          # VPC, subnets, NAT, SGs
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── ecs-service/         # ECS task + service + ALB target group
│   ├── rds-postgres/        # RDS instance + parameter group + SGs
│   └── s3-bucket/           # S3 + policy + lifecycle rules
│
├── environments/
│   ├── prod/
│   │   ├── networking/      # Separate state per layer
│   │   │   └── main.tf
│   │   ├── databases/
│   │   │   └── main.tf
│   │   └── services/
│   │       └── main.tf
│   ├── staging/
│   └── dev/
│
└── .github/
    └── workflows/
        └── terraform.yml

Never put networking, databases, and application services in the same state file. A failed app deploy should never be able to destroy your VPC.


Remote State with S3 and DynamoDB Locking

# environments/prod/services/main.tf

terraform {
  backend "s3" {
    bucket         = "your-company-tfstate-prod"
    key            = "prod/services/terraform.tfstate" # this layer's own state
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-locks"
  }

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

# Reference outputs from another state file
data "terraform_remote_state" "networking" {
  backend = "s3"
  config = {
    bucket = "your-company-tfstate-prod"
    key    = "prod/networking/terraform.tfstate"
    region = "us-east-1"
  }
}

module "api_service" {
  source = "../../../modules/ecs-service"

  vpc_id          = data.terraform_remote_state.networking.outputs.vpc_id
  private_subnets = data.terraform_remote_state.networking.outputs.private_subnet_ids
  # ...
}
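
The remote state lookup only resolves if the networking layer exports those values as root-level outputs. A minimal sketch of what that might look like, assuming the networking layer calls the shared module under the name networking and that the module exposes vpc_id and private_subnet_ids (the file name and the module's output names are assumptions, not taken from the original setup):

```hcl
# environments/prod/networking/outputs.tf (hypothetical file)

# Assumed module call, normally in this layer's main.tf; inputs are elided.
module "networking" {
  source = "../../../modules/networking"
  # ...
}

# Only root-level outputs are visible through terraform_remote_state,
# so module outputs have to be re-exported here.
output "vpc_id" {
  description = "ID of the VPC, consumed by the databases and services layers"
  value       = module.networking.vpc_id
}

output "private_subnet_ids" {
  description = "IDs of the private subnets, consumed by downstream layers"
  value       = module.networking.private_subnet_ids
}
```

Keeping this list of outputs small and deliberate makes it the contract between layers: downstream states can read these values but cannot change anything in the networking state.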

Create the S3 bucket (with versioning enabled) and the DynamoDB lock table before anything else. The lock table needs a string partition key named LockID. This bootstrap is the one thing you manage manually.
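
One way to keep that manual step reproducible is a tiny bootstrap configuration that is applied once by hand with local state. The sketch below is an illustration under assumptions: the bootstrap/ path and the resource labels are hypothetical, while the bucket and table names match the backend block above.

```hcl
# bootstrap/main.tf (hypothetical path), applied once by hand with local state

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

resource "aws_s3_bucket" "tfstate" {
  bucket = "your-company-tfstate-prod"
}

# Versioning lets you recover an earlier state file after a bad write.
resource "aws_s3_bucket_versioning" "tfstate" {
  bucket = aws_s3_bucket.tfstate.id

  versioning_configuration {
    status = "Enabled"
  }
}

# State files can contain secrets, so block all public access.
resource "aws_s3_bucket_public_access_block" "tfstate" {
  bucket                  = aws_s3_bucket.tfstate.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# The S3 backend expects a string partition key named LockID.
resource "aws_dynamodb_table" "locks" {
  name         = "terraform-state-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

Creating the same two resources through the console or the AWS CLI works just as well; what matters is that they exist before the first terraform init against the S3 backend.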


Module Design Rules

A good Terraform module has:

  1. One clear purpose: rds-postgres creates exactly one RDS instance, not "the entire database layer."
  2. No hard-coded values: everything is a variable with a sensible default
  3. Outputs for everything downstream consumers might need

# modules/rds-postgres/variables.tf
variable "identifier" {
  description = "Unique identifier for this RDS instance"
  type        = string
}

variable "instance_class" {
  description = "RDS instance type"
  type        = string
  default     = "db.t3.medium"
}

variable "allocated_storage_gb" {
  description = "Allocated storage in GiB"
  type        = number
  default     = 20
}

variable "deletion_protection" {
  description = "Prevent accidental deletion. Set to true in production."
  type        = bool
  default     = true
}
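
To illustrate the third rule, here is a hedged sketch of the module's outputs and of a caller that relies on the defaults. It assumes the module's main.tf declares the instance as aws_db_instance.this; that resource label, the module call name orders_db, and the values passed in are hypothetical.

```hcl
# modules/rds-postgres/outputs.tf
# Assumes main.tf declares the instance as aws_db_instance.this (hypothetical label).

output "endpoint" {
  description = "Connection endpoint in host:port form"
  value       = aws_db_instance.this.endpoint
}

output "port" {
  description = "Port the database listens on"
  value       = aws_db_instance.this.port
}

output "arn" {
  description = "ARN of the instance, useful for IAM policies and alarms"
  value       = aws_db_instance.this.arn
}

# environments/prod/databases/main.tf
# Hypothetical caller: only the values that differ from the defaults are set.
module "orders_db" {
  source = "../../../modules/rds-postgres"

  identifier           = "prod-orders"
  instance_class       = "db.r6g.large"
  allocated_storage_gb = 100
  # deletion_protection is left at its default of true
}
```

Because deletion_protection defaults to true, a caller has to opt out explicitly, which keeps the safe behavior as the path of least resistance.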

CI/CD Integration with GitHub Actions

Never let engineers run terraform apply from their laptops against production.

# .github/workflows/terraform.yml
name: Terraform

on:
  pull_request:
    paths: ["infrastructure/**"]
  push:
    branches: [main]
    paths: ["infrastructure/**"]

jobs:
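  plan:
    runs-on: ubuntu-latest
    if: github.event_name == 'pull_request'
    permissions:
      id-token: write        # required for OIDC role assumption
      contents: read
      pull-requests: write   # required to comment the plan on the PR
    steps:
      - uses: actions/checkout@v4

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: "1.7.0"

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          # Placeholder account ID (AWS account IDs are 12 digits)
          role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsReadOnly
          aws-region: us-east-1

      - name: Terraform Init
        working-directory: infrastructure/environments/prod/services
        run: terraform init -input=false

      - name: Terraform Plan
        id: plan   # lets the next step read steps.plan.outputs.stdout
        working-directory: infrastructure/environments/prod/services
        run: terraform plan -no-color -input=false -out=tfplan

      - name: Comment Plan on PR
        uses: actions/github-script@v7
        env:
          PLAN: ${{ steps.plan.outputs.stdout }}
        with:
          script: |
            // Read the plan from the environment so that backticks or ${}
            // in the plan output cannot break this script.
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              body: "### Terraform Plan\n```\n" + process.env.PLAN + "\n```"
            });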

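  apply:
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main' && github.event_name == 'push'
    environment: production   # Requires manual approval in GitHub
    permissions:
      id-token: write   # required for OIDC role assumption
      contents: read
    steps:
      - uses: actions/checkout@v4

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: "1.7.0"

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          # Hypothetical role name: apply needs a write-capable role,
          # unlike the read-only role used for plan.
          role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsApply
          aws-region: us-east-1

      - name: Terraform Apply
        working-directory: infrastructure/environments/prod/services
        run: |
          terraform init -input=false
          terraform apply -auto-approve -input=false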

The Rules We Enforce

  1. Plan on PR, apply on merge — no exceptions for production
  2. deletion_protection = true on all stateful resources in prod
  3. Never use count for resources with identity (databases, IAM roles) — use for_each with a map
  4. terraform fmt enforced in CI — inconsistent formatting is a merge blocker
  5. terraform validate and tfsec run on every PR
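
Rule 3 is easiest to see with a concrete case. The sketch below is illustrative only: the variable name, the role names, and the ECS trust policy are assumptions chosen for the example, not part of the setup described above.

```hcl
# Hypothetical example: one IAM role per service, keyed by service name.

variable "service_roles" {
  description = "Services that need a task role, keyed by a stable name"
  type        = map(object({ description = string }))
  default = {
    api    = { description = "Task role for the API service" }
    worker = { description = "Task role for the background worker" }
  }
}

# Trust policy that lets ECS tasks assume the role.
data "aws_iam_policy_document" "ecs_assume" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ecs-tasks.amazonaws.com"]
    }
  }
}

# With for_each, each role is addressed by its map key, for example
# aws_iam_role.service["api"], rather than by a list position.
resource "aws_iam_role" "service" {
  for_each = var.service_roles

  name               = "${each.key}-task-role"
  description        = each.value.description
  assume_role_policy = data.aws_iam_policy_document.ecs_assume.json
}
```

Removing the api entry from the map plans the destruction of aws_iam_role.service["api"] and nothing else. With count over a list, removing the first element shifts every later index, so Terraform would plan to modify or recreate roles that did not actually change.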

These constraints feel slow until the day they prevent a 2-hour production outage.