
Terraform Complete Master Guide

Terraform is an Infrastructure as Code (IaC) tool that lets you define and provision infrastructure using configuration files.


Installation Guide

Install Terraform on Linux (Ubuntu/Debian)

# Install required packages
sudo apt update && sudo apt install -y wget unzip

# Download Terraform (check latest version at https://www.terraform.io/downloads)
wget https://releases.hashicorp.com/terraform/1.6.6/terraform_1.6.6_linux_amd64.zip

# Extract and install
unzip terraform_1.6.6_linux_amd64.zip
sudo mv terraform /usr/local/bin/

# Verify installation
terraform version
# Expected output: Terraform v1.6.6

Install Terraform on macOS

# Using Homebrew (recommended)
brew tap hashicorp/tap
brew install hashicorp/tap/terraform

# Or manually with curl
curl -o terraform.zip https://releases.hashicorp.com/terraform/1.6.6/terraform_1.6.6_darwin_amd64.zip
unzip terraform.zip
sudo mv terraform /usr/local/bin/

# Verify installation
terraform version

Install Terraform on Windows

# Using Chocolatey
choco install terraform

# Or manually download from https://www.terraform.io/downloads
# Extract terraform.exe to a dedicated folder (e.g. C:\terraform)
# and add that folder to your PATH - avoid dropping it into System32

# Verify in PowerShell
terraform version

Verify Installation & Enable Auto-completion

# Check version
terraform version

# Enable bash completion
terraform -install-autocomplete

# Initialize a working directory containing Terraform configuration
# (creates the .terraform folder; run inside each project directory)
terraform init
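Once the CLI is installed, each project can pin the Terraform version it expects so a teammate with an incompatible CLI fails fast; a minimal sketch (the exact constraint is up to you):

```hcl
# versions.tf - refuse to run with an incompatible CLI version
terraform {
  required_version = ">= 1.6.0"
}
```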

BEGINNER LEVEL: Your First Infrastructure

Scenario 1: Creating Your First Terraform File

Understanding the basic workflow: init → plan → apply

sequenceDiagram
    participant User as Developer
    participant Terminal as Command Line
    participant TF as Terraform CLI
    participant Config as main.tf File
    participant State as Terraform State
    participant File as Local File System

    User->>Config: Create main.tf
    Config->>Config: Define local_file resource

    User->>Terminal: terraform init
    Terminal->>TF: Execute init command
    TF->>TF: Download local provider
    TF->>State: Create .terraform directory
    TF->>Terminal: Show "Terraform initialized"

    User->>Terminal: terraform plan
    Terminal->>TF: Execute plan command
    TF->>Config: Read resource definition
    TF->>State: Check current state
    TF->>Terminal: Show "Will create 1 resource"

    User->>Terminal: terraform apply
    Terminal->>TF: Execute apply command
    TF->>User: Prompt for approval
    User->>TF: Type "yes"
    TF->>File: Create hello.txt
    TF->>State: Save resource state
    TF->>Terminal: Show "Apply complete!"

    Note over State: State tracks what Terraform created

Code:

# Create a file named main.tf
# This is your Terraform configuration file

# Provider tells Terraform what platform to use; pin its version in a
# required_providers block (a "version" argument inside the provider
# block itself is deprecated)
terraform {
  required_providers {
    local = {
      source  = "hashicorp/local"
      version = "~> 2.4"
    }
  }
}

provider "local" {}

# Resource defines infrastructure to create
resource "local_file" "welcome" {
  # Content to write to file
  content  = "Hello, this is my first Terraform resource!"

  # File path (creates in current directory)
  filename = "${path.module}/hello.txt"
}

# Output shows information after creation
output "file_location" {
  value = local_file.welcome.filename
}

output "file_size" {
  value = filesize(local_file.welcome.filename)
}

Step-by-step execution:

# 1. Create a project directory
mkdir terraform-beginner
cd terraform-beginner

# 2. Create the main.tf file (paste the code above)
nano main.tf

# 3. Initialize Terraform (downloads provider plugins)
terraform init
# Expected output:
# Initializing the backend...
# Initializing provider plugins...
# - Finding hashicorp/local versions matching "~> 2.4"...
# - Installing hashicorp/local v2.4.0...
# Terraform has been successfully initialized!

# 4. See what Terraform will do (dry run)
terraform plan
# Expected output:
# Plan: 1 to add, 0 to change, 0 to destroy.

# 5. Create the resource (type "yes" when prompted)
terraform apply
# Expected output:
# Do you want to perform these actions?
#   Terraform will perform the actions described above.
#   Only 'yes' will be accepted to approve.
# 
#   Enter a value: yes
#
# local_file.welcome: Creating...
# local_file.welcome: Creation complete after 0s
# Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
#
# Outputs:
# file_location = "/home/user/terraform-beginner/hello.txt"

# 6. Verify the file was created
ls -la hello.txt
cat hello.txt
# Should show: "Hello, this is my first Terraform resource!"

# 7. See current state
terraform show
# Shows the state file contents

# 8. Destroy the resource (cleanup)
terraform destroy
# Expected output:
# Do you really want to destroy all resources? yes
# local_file.welcome: Destroying... [id=6c63204f5a9cfd5f0d8ec0d6bd5d6c82bc173290]
# local_file.welcome: Destruction complete after 0s
# Destroy complete! Resources: 1 destroyed.


Scenario 2: Understanding Variables & Outputs

Making configurations dynamic and reusable

sequenceDiagram
    participant User as Developer
    participant Var as variables.tf
    participant Main as main.tf
    participant TF as Terraform
    participant State as State File

    User->>Var: Define variable inputs
    User->>Main: Reference variables
    Main->>TF: Process variable values

    User->>TF: terraform apply -var="name=prod"
    TF->>State: Store variable values
    TF->>User: Display output values

    Note over Var: Variables make code reusable

Code:

# variables.tf - Define inputs
variable "filename" {
  description = "Name of the file to create"
  type        = string
  default     = "message.txt"
}

variable "content" {
  description = "Content to write in file"
  type        = string
  default     = "Default message"
}

variable "file_permissions" {
  description = "Unix file permissions"
  type        = string
  default     = "0644"
}

# main.tf - Use variables
resource "local_file" "message" {
  content  = var.content
  filename = "${path.module}/${var.filename}"
  file_permission = var.file_permissions
}

# outputs.tf - Show results
output "file_details" {
  description = "Details about created file"
  value = {
    name        = var.filename
    size_bytes  = filesize(local_file.message.filename)
    permissions = var.file_permissions
  }
}

output "created_at" {
  value = timestamp()
}
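Variables can also police their own values. A hedged sketch of a validation block, shown on a hypothetical variant of the file_permissions variable above:

```hcl
variable "file_permissions_checked" {
  description = "Unix file permissions, validated at plan time"
  type        = string
  default     = "0644"

  validation {
    condition     = can(regex("^0[0-7]{3}$", var.file_permissions_checked))
    error_message = "Permissions must be a 4-digit octal string such as 0644."
  }
}
```

An invalid value now fails terraform plan with the error message instead of producing a broken file.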

Execution with variables:

# Method 1: Use defaults
terraform apply -auto-approve
# Creates message.txt with "Default message"

# Method 2: Override with -var flags
terraform apply -var="filename=custom.txt" -var="content=Custom content!" -auto-approve

# Method 3: Use variables file
echo 'filename = "vars.txt"' > prod.tfvars
echo 'content = "Production values"' >> prod.tfvars
terraform apply -var-file="prod.tfvars" -auto-approve

# Method 4: Environment variables (TF_VAR_name)
export TF_VAR_filename="env.txt"
export TF_VAR_content="From environment"
terraform apply -auto-approve

# View outputs
terraform output
# Shows (outputs are listed alphabetically):
# created_at = "2024-11-30T10:00:00Z"
# file_details = {
#   "name" = "env.txt"
#   "permissions" = "0644"
#   "size_bytes" = 16
# }


Scenario 3: Working with Lists and Maps

Managing multiple resources efficiently

sequenceDiagram
    participant User as Developer
    participant Var as Variables (list/map)
    participant TF as Terraform Core
    participant Resources as Multiple Resources
    participant State as State Management

    User->>Var: Define list of filenames
    User->>Var: Define map of file contents
    TF->>Resources: Create resources in loop
    Resources->>State: Track each resource
    TF->>User: Show count of resources created

    Note over Resources: for_each creates many from one definition

Code:

# Create multiple files dynamically

variable "files_map" {
  description = "Map of filenames to their contents"
  type        = map(string)
  default = {
    "readme.txt"    = "# Project README\nThis is auto-generated"
    "config.yaml"   = "app:\n  environment: production\n  version: 1.0"
    "license.txt"   = "MIT License\nCopyright 2024"
    "authors.md"    = "Authors:\n- DevOps Team"
  }
}

variable "permissions_map" {
  description = "Map of file extensions to permissions"
  type        = map(string)
  default = {
    "txt" = "0644"
    "md"  = "0644"
    "yaml" = "0600"
    "sh"   = "0755"
  }
}

# Create multiple resources using for_each
resource "local_file" "project_files" {
  # Loop through each entry in the map
  for_each = var.files_map

  # each.key is the filename, each.value is the content
  filename = "${path.module}/${each.key}"
  content  = each.value

  # Set permissions based on file extension
  file_permission = lookup(
    var.permissions_map, 
    split(".", each.key)[1], 
    "0644"
  )
}

# Create a single script file
resource "local_file" "setup_script" {
  filename = "${path.module}/setup.sh"
  content  = <<EOF
#!/bin/bash
# Auto-generated setup script

echo "Creating project structure..."
mkdir -p logs configs
touch logs/app.log
echo "Setup complete!"
EOF
  file_permission = "0755"
}

# Count how many files we created
output "total_files" {
  value = length(local_file.project_files)
}

# Show all created file paths
output "all_files" {
  value = [for f in local_file.project_files : f.filename]
}

# Example of conditional resource
resource "local_file" "optional_file" {
  count = var.create_debug_file ? 1 : 0

  filename = "${path.module}/debug.log"
  content  = "Debug mode enabled at ${timestamp()}"
}

variable "create_debug_file" {
  type    = bool
  default = false
}
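Plain lists work with for_each as well once converted to a set; a minimal sketch (the directory names are illustrative):

```hcl
variable "log_dirs" {
  type    = list(string)
  default = ["logs", "archive", "tmp"]
}

resource "local_file" "keepfiles" {
  # for_each needs a set or map, so convert the list with toset()
  for_each = toset(var.log_dirs)

  # each.value (== each.key for a set) is one directory name;
  # local_file creates missing parent directories
  filename = "${path.module}/${each.value}/.keep"
  content  = ""
}
```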

Execution:

# Create all files
terraform apply -auto-approve

# Check created files
ls -la *.txt *.md *.yaml *.sh

# Should show:
# -rw-r--r-- readme.txt
# -rw------- config.yaml
# -rwxr-xr-x setup.sh
# etc.

# Try with debug file
terraform apply -var="create_debug_file=true" -auto-approve
ls debug.log

# Inspect state
terraform state list
# Shows:
# local_file.optional_file[0]
# local_file.project_files["authors.md"]
# local_file.project_files["config.yaml"]
# etc.


Scenario 4: Understanding State Management

How Terraform tracks your infrastructure

sequenceDiagram
    participant Dev as Developer
    participant TF as Terraform
    participant State as terraform.tfstate
    participant Lock as State Lock (.lock)
    participant Backup as State Backup

    Dev->>TF: terraform apply
    TF->>Lock: Create lock file
    Lock->>State: Prevent concurrent writes

    TF->>State: Read current state
    TF->>State: Compare with desired state
    TF->>State: Update with new resources

    State->>Backup: Create backup file
    Lock->>Lock: Remove lock file

    Note over State: State file is single source of truth

Code:

# main.tf
resource "local_file" "state_demo" {
  content  = "State management example"
  filename = "${path.module}/demo.txt"
}

# Backend configuration for remote state (AWS S3)
terraform {
  # This block configures where state is stored
  backend "s3" {
    bucket         = "my-terraform-state-bucket"
    key            = "beginner/state-demo.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"
  }
}

# Data source: Read existing file
data "local_file" "existing" {
  filename = "${path.module}/demo.txt"

  # A data source doesn't create anything; it only reads the existing file
  depends_on = [local_file.state_demo]
}

# Output from data source
output "file_content" {
  value     = data.local_file.existing.content
  sensitive = false
}

# Manage state with commands
# terraform state show local_file.state_demo
# terraform state list
# terraform state mv local_file.old local_file.new
# terraform state rm local_file.unwanted

State management commands:

# Initialize with backend
terraform init

# Show current state
terraform show
# Shows JSON representation of all resources

# List all resources in state
terraform state list

# Show details of specific resource
terraform state show local_file.state_demo

# Simulate a problem: Manually delete the file
rm demo.txt

# Terraform detects drift (difference between state and reality)
terraform plan
# Shows: Plan: 1 to add, 0 to change, 0 to destroy.
# It wants to recreate the deleted file

# Refresh state without changing anything (the standalone
# "terraform refresh" command still works but is deprecated in favor of this)
terraform apply -refresh-only

# Remove resource from state (but don't destroy it)
terraform state rm local_file.state_demo

# Import an existing resource into state
# Note: hashicorp/local's local_file does not support import, so this
# workflow applies to providers that implement it (most AWS resources do):
# terraform import aws_instance.example i-0abc123def456

# Move resource to new address
terraform state mv local_file.state_demo local_file.renamed_demo

# Backup state manually
cp terraform.tfstate terraform.tfstate.backup

# Restore from backup if corrupted
cp terraform.tfstate.backup terraform.tfstate
terraform refresh
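Since Terraform 1.1, a rename can also be recorded declaratively instead of running terraform state mv by hand; a sketch matching the rename above:

```hcl
# After renaming the resource block in code, record the old and new
# addresses so the next plan moves state instead of destroy-and-recreate
moved {
  from = local_file.state_demo
  to   = local_file.renamed_demo
}
```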


INTERMEDIATE LEVEL: Real Cloud Infrastructure

Scenario 5: Deploying an AWS EC2 Web Server

Complete web server with security group and networking

sequenceDiagram
    participant TF as Terraform
    participant AWS as AWS API
    participant SG as Security Group
    participant Key as SSH Key Pair
    participant EC2 as EC2 Instance
    participant User as Developer

    User->>TF: terraform apply
    TF->>AWS: Create SSH key pair
    AWS->>Key: Generate key material

    TF->>AWS: Create security group
    AWS->>SG: Allow HTTP (80) and SSH (22)

    TF->>AWS: Launch EC2 instance
    AWS->>EC2: Use Amazon Linux 2 AMI
    EC2->>SG: Attach security group
    EC2->>Key: Use SSH key

    AWS->>TF: Return instance details
    TF->>User: Show public IP & connection info

    Note over EC2: Web server is live at http://<public-ip>

Code:

# provider.tf - Configure AWS provider
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    # tls and local providers are used by the key pair resources below
    tls = {
      source  = "hashicorp/tls"
      version = "~> 4.0"
    }
    local = {
      source  = "hashicorp/local"
      version = "~> 2.4"
    }
  }

  # Backend for state storage
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "intermediate/web-server/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}

provider "aws" {
  region = var.aws_region

  # Use shared credentials file (~/.aws/credentials)
  # Or specify access keys (not recommended for production)
  # access_key = var.aws_access_key
  # secret_key = var.aws_secret_key
}

# variables.tf
variable "aws_region" {
  description = "AWS region to deploy resources"
  type        = string
  default     = "us-east-1"
}

variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t3.micro"  # Free tier eligible
}

variable "key_name" {
  description = "Name of SSH key pair"
  type        = string
  default     = "web-server-key"
}

# data.tf - Get latest Amazon Linux 2 AMI
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

# security.tf - Create security group
resource "aws_security_group" "web_server" {
  name_prefix = "web-server-"
  description = "Security group for web server"

  # Allow HTTP from anywhere
  ingress {
    description = "HTTP from internet"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Allow HTTPS from anywhere
  ingress {
    description = "HTTPS from internet"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Allow SSH from your IP (restrict this!)
  ingress {
    description = "SSH from office"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["203.0.113.0/24"]  # Replace with your IP
  }

  # Allow all outbound traffic
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "web-server-sg"
  }
}

# key-pair.tf - Generate SSH key
resource "aws_key_pair" "web_server" {
  key_name   = var.key_name
  public_key = tls_private_key.web_server.public_key_openssh
}

resource "tls_private_key" "web_server" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

# instance.tf - Create EC2 instance
resource "aws_instance" "web_server" {
  ami                         = data.aws_ami.amazon_linux.id
  instance_type               = var.instance_type
  key_name                    = aws_key_pair.web_server.key_name
  vpc_security_group_ids      = [aws_security_group.web_server.id]

  # User data to install web server on first boot
  user_data = <<-EOF
              #!/bin/bash
              yum update -y
              amazon-linux-extras install -y nginx1
              systemctl start nginx
              systemctl enable nginx
              echo "<h1>Hello from Terraform!</h1>" > /usr/share/nginx/html/index.html
              EOF

  # API-level guard against accidental termination (set to true to enable)
  disable_api_termination = false

  # Add tags for organization
  tags = {
    Name        = "terraform-web-server"
    Environment = "development"
    ManagedBy   = "terraform"
  }
}

# outputs.tf - Display important information
output "instance_id" {
  description = "ID of the EC2 instance"
  value       = aws_instance.web_server.id
}

output "public_ip" {
  description = "Public IP address of the instance"
  value       = aws_instance.web_server.public_ip
}

output "public_dns" {
  description = "Public DNS of the instance"
  value       = aws_instance.web_server.public_dns
}

output "ssh_command" {
  description = "Command to SSH into the instance"
  value       = "ssh -i ${aws_key_pair.web_server.key_name}.pem ec2-user@${aws_instance.web_server.public_ip}"
}

# Save private key (sensitive!) - local_file has no "sensitive" argument,
# so use local_sensitive_file, which also redacts the content in plan output
resource "local_sensitive_file" "private_key" {
  content         = tls_private_key.web_server.private_key_pem
  filename        = "${path.module}/${aws_key_pair.web_server.key_name}.pem"
  file_permission = "0600"  # Only owner can read/write
}
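Beyond the EC2 API flag set on the instance, Terraform has its own guard rail; a hedged sketch of a lifecycle block (it goes inside the aws_instance resource body):

```hcl
  # Inside resource "aws_instance" "web_server": any plan that would
  # destroy or replace the instance now fails with an error
  lifecycle {
    prevent_destroy = true
  }
```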

Deployment workflow:

# 1. Configure AWS credentials
aws configure
# Enter AWS Access Key ID
# Enter AWS Secret Access Key
# Enter region: us-east-1
# Enter output format: json

# 2. Create S3 bucket for state storage (do this once)
aws s3 mb s3://my-terraform-state --region us-east-1
aws s3api put-bucket-encryption \
  --bucket my-terraform-state \
  --server-side-encryption-configuration '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"}}]}'

# Create DynamoDB table for locking
aws dynamodb create-table \
  --table-name terraform-locks \
  --attribute-definitions AttributeName=LockID,AttributeType=S \
  --key-schema AttributeName=LockID,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST \
  --region us-east-1
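Versioning on the state bucket is also worth enabling so earlier state files can be recovered after a bad write. If the bucket is later managed by Terraform itself, a sketch (bucket name matches the commands above):

```hcl
resource "aws_s3_bucket_versioning" "state" {
  bucket = "my-terraform-state"

  versioning_configuration {
    status = "Enabled"
  }
}
```

The equivalent one-off CLI call is aws s3api put-bucket-versioning --bucket my-terraform-state --versioning-configuration Status=Enabled.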

# 3. Initialize Terraform
terraform init

# 4. Plan the deployment
terraform plan

# 5. Apply (creates real AWS resources costing money)
terraform apply

# 6. Access the web server
# Outputs will show:
# public_ip = 203.0.113.45
# ssh_command = ssh -i web-server-key.pem ec2-user@203.0.113.45

# Open in browser: http://203.0.113.45

# 7. SSH into instance
chmod 600 web-server-key.pem
ssh -i web-server-key.pem ec2-user@203.0.113.45
# Verify Nginx is running: systemctl status nginx

# 8. Check AWS Console to see created resources
# - EC2 instance running
# - Security group with rules
# - Key pair registered

# 9. Destroy everything when done
terraform destroy
# Confirms deletion of all resources


Scenario 6: Creating and Using Terraform Modules

Organizing code into reusable components

sequenceDiagram
    participant Root as Root Module
    participant Module as Module (./modules/web-app)
    participant Resources as Module Resources
    participant State as Terraform State
    participant Registry as Terraform Registry

    Root->>Module: Call with variables
    Module->>Resources: Create SG, EC2, EBS
    Resources->>State: Store all resources

    Module->>Root: Return outputs (IP, DNS)

    Root->>Registry: Can publish module
    Registry->>Other: Reuse in other projects

    Note over Module: Self-contained, reusable infrastructure

Code - Module Structure:

project/
├── main.tf
├── variables.tf
├── outputs.tf
└── modules/
    └── web-app/
        ├── main.tf
        ├── variables.tf
        ├── outputs.tf
        └── README.md

modules/web-app/main.tf

# This is a reusable module for deploying a web application

resource "aws_security_group" "app" {
  name        = "${var.app_name}-sg"
  description = "Security group for ${var.app_name}"
  vpc_id      = var.vpc_id

  # Dynamic ingress rules based on input
  dynamic "ingress" {
    for_each = var.allowed_ports
    content {
      from_port   = ingress.value
      to_port     = ingress.value
      protocol    = "tcp"
      cidr_blocks = var.allowed_cidrs
    }
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "${var.app_name}-sg"
  }
}

resource "aws_instance" "app" {
  count = var.instance_count

  ami                    = var.ami_id
  instance_type          = var.instance_type
  key_name               = var.key_name
  vpc_security_group_ids = [aws_security_group.app.id]
  subnet_id              = var.subnet_ids[count.index % length(var.subnet_ids)]

  root_block_device {
    volume_type = "gp3"
    volume_size = var.root_volume_size
    encrypted   = true
  }

  ebs_block_device {
    device_name = "/dev/sdf"
    volume_type = "gp3"
    volume_size = var.data_volume_size
    encrypted   = true
  }

  # templatefile() renders the script with the given variables; aws_instance's
  # user_data takes the plain rendered text (use user_data_base64 only for
  # pre-encoded payloads), so base64encode() here would be wrong
  user_data = templatefile("${path.module}/user_data.sh", {
    app_name    = var.app_name
    environment = var.environment
    app_version = var.app_version
  })

  tags = merge(
    var.tags,
    {
      Name = "${var.app_name}-${count.index + 1}"
    }
  )
}

resource "aws_eip" "app" {
  count = var.assign_eip ? var.instance_count : 0

  instance = aws_instance.app[count.index].id
  domain   = "vpc"

  tags = {
    Name = "${var.app_name}-eip-${count.index + 1}"
  }
}

data "template_file" "user_data" {
  template = file("${path.module}/user_data.sh")
}

modules/web-app/variables.tf

variable "app_name" {
  description = "Name of the application"
  type        = string
}

variable "environment" {
  description = "Environment name"
  type        = string
  default     = "dev"
}

variable "vpc_id" {
  description = "VPC ID"
  type        = string
}

variable "subnet_ids" {
  description = "List of subnet IDs"
  type        = list(string)
}

variable "ami_id" {
  description = "AMI ID for instances"
  type        = string
}

variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t3.micro"
}

variable "instance_count" {
  description = "Number of instances"
  type        = number
  default     = 2
}

variable "key_name" {
  description = "SSH key pair name"
  type        = string
}

variable "allowed_ports" {
  description = "List of allowed ports"
  type        = list(number)
  default     = [22, 80, 443]
}

variable "allowed_cidrs" {
  description = "List of allowed CIDR blocks"
  type        = list(string)
  default     = ["0.0.0.0/0"]
}

variable "root_volume_size" {
  description = "Root volume size in GB"
  type        = number
  default     = 20
}

variable "data_volume_size" {
  description = "Data volume size in GB"
  type        = number
  default     = 50
}

variable "app_version" {
  description = "Application version"
  type        = string
  default     = "latest"
}

variable "assign_eip" {
  description = "Assign Elastic IPs"
  type        = bool
  default     = false
}

variable "tags" {
  description = "Additional tags"
  type        = map(string)
  default     = {}
}

modules/web-app/outputs.tf

output "instance_ids" {
  description = "IDs of EC2 instances"
  value       = aws_instance.app[*].id
}

output "public_ips" {
  description = "Public IPs of instances"
  value       = aws_eip.app[*].public_ip
}

output "private_ips" {
  description = "Private IPs of instances"
  value       = aws_instance.app[*].private_ip
}

output "security_group_id" {
  description = "Security group ID"
  value       = aws_security_group.app.id
}

modules/web-app/user_data.sh

#!/bin/bash -xe

# User data script passed to EC2 instances

# Update system
yum update -y

# Install Docker
amazon-linux-extras install -y docker
systemctl start docker
systemctl enable docker

# Install app
mkdir -p /opt/${app_name}
cd /opt/${app_name}

# Pull application
docker pull myorg/${app_name}:${app_version}

# Run container
docker run -d \
  --name ${app_name} \
  -p 80:8080 \
  -e ENVIRONMENT=${environment} \
  -e VERSION=${app_version} \
  myorg/${app_name}:${app_version}

# Setup CloudWatch logging
yum install -y awslogs
systemctl start awslogsd
systemctl enable awslogsd
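One caveat with templates like this: templatefile() treats every ${...} as a Terraform interpolation, so shell expansions that should survive to runtime must be escaped with $$. A minimal illustration (the variable name is from the module above):

```hcl
variable "app_name" {
  type    = string
  default = "my-web-app"
}

locals {
  # ${var.app_name} is filled in by Terraform at plan time;
  # $${USER} renders as a literal ${USER} for bash to expand at runtime
  demo_line = "echo deploying ${var.app_name} as $${USER}"
}
```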

Root main.tf (using the module)

# main.tf - Root module that calls our reusable web-app module

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }

  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "intermediate/web-app-module/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}

provider "aws" {
  region = var.region
}

# Get VPC and subnets data
data "aws_vpc" "default" {
  default = true
}

data "aws_subnets" "default" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.default.id]
  }
}

# Generate SSH key for this deployment
resource "tls_private_key" "app" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

resource "aws_key_pair" "app" {
  key_name   = "${var.project}-${var.environment}-key"
  public_key = tls_private_key.app.public_key_openssh
}

resource "local_file" "private_key" {
  content         = tls_private_key.app.private_key_pem
  filename        = "${path.module}/${aws_key_pair.app.key_name}.pem"
  file_permission = "0600"
  sensitive       = true
}

# Deploy web app module for production
module "production_web_app" {
  source = "./modules/web-app"

  app_name     = "my-web-app"
  environment  = "production"
  vpc_id       = data.aws_vpc.default.id
  subnet_ids   = data.aws_subnets.default.ids
  ami_id       = data.aws_ami.amazon_linux.id
  instance_type = "t3.medium"
  instance_count = 3
  key_name     = aws_key_pair.app.key_name

  allowed_ports = [22, 80, 443, 8080]
  allowed_cidrs = ["203.0.113.0/24"]  # Your office IP

  root_volume_size = 30
  data_volume_size = 100
  app_version      = "2.1.0"
  assign_eip       = true

  tags = {
    Project     = var.project
    CostCenter  = var.cost_center
    ManagedBy   = "terraform"
  }
}

# Deploy dev version (smaller, cheaper)
module "development_web_app" {
  source = "./modules/web-app"

  app_name     = "my-web-app"
  environment  = "development"
  vpc_id       = data.aws_vpc.default.id
  subnet_ids   = data.aws_subnets.default.ids
  ami_id       = data.aws_ami.amazon_linux.id
  instance_count = 1
  key_name     = aws_key_pair.app.key_name

  allowed_cidrs = ["0.0.0.0/0"]  # Open for dev

  app_version  = var.app_version
  assign_eip   = false

  tags = {
    Environment = "dev"
    CostCenter  = var.cost_center
  }
}

# Data source for AMI
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
}

Root variables.tf

variable "region" {
  description = "AWS region"
  type        = string
  default     = "us-east-1"
}

variable "project" {
  description = "Project name"
  type        = string
  default     = "web-app-project"
}

variable "environment" {
  description = "Environment name"
  type        = string
  default     = "production"
}

variable "cost_center" {
  description = "Cost center for tracking"
  type        = string
  default     = "engineering"
}

variable "app_version" {
  description = "Application version"
  type        = string
  default     = "latest"
}

Execution:

# 1. Initialize (downloads module dependencies)
terraform init

# 2. Plan both environments
terraform plan

# 3. Apply (creates 4 instances: 3 prod, 1 dev)
terraform apply

# 4. Access outputs - module outputs are visible only if the root module
#    re-exports them, e.g.:
#    output "production_web_app_public_ips" { value = module.production_web_app.public_ips }
terraform output production_web_app_public_ips
# ["203.0.113.45", "203.0.113.46", "203.0.113.47"]

terraform output development_web_app_private_ips

# 5. Test production load balancer
# Install a load balancer in front:
# - Use AWS ALB target group with these instances
# - Or use module output to configure external LB

# 6. Update module (change instance type)
# Edit module call, then:
terraform plan -target="module.production_web_app"

# 7. Destroy dev environment only
terraform destroy -target="module.development_web_app"

# 8. Publish module to registry
# Tag version:
cd modules/web-app
git tag v1.0.0
git push origin v1.0.0

# Use from registry:
module "web_app" {
  source  = "app.terraform.io/myorg/web-app/aws"
  version = "1.0.0"

  # ... configuration
}


Scenario 7: Remote State & Data Sources

Sharing state between Terraform configurations

sequenceDiagram
    participant Network as Network Team
    participant App as App Team
    participant State as Remote State (S3)
    participant Data as Data Sources
    participant AWS as AWS Resources

    Network->>AWS: Create VPC, subnets
    Network->>State: Store network state

    App->>Data: Read network state
    Data->>State: Fetch VPC/subnet IDs
    App->>AWS: Deploy app in existing network

    Note over State: Single source of truth across teams

Code - Network Team (creates shared infrastructure):

# network-team/main.tf
terraform {
  backend "s3" {
    bucket         = "shared-terraform-state"
    key            = "network/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
  }
}

resource "aws_vpc" "shared" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name        = "shared-vpc"
    ManagedBy   = "network-team"
  }
}

resource "aws_subnet" "public" {
  count = 3

  vpc_id            = aws_vpc.shared.id
  cidr_block        = "10.0.${count.index + 1}.0/24"
  availability_zone = data.aws_availability_zones.available.names[count.index]

  tags = {
    Name = "public-subnet-${count.index + 1}"
    Tier = "public"
  }
}

resource "aws_subnet" "private" {
  count = 3

  vpc_id            = aws_vpc.shared.id
  cidr_block        = "10.0.${count.index + 11}.0/24"
  availability_zone = data.aws_availability_zones.available.names[count.index]

  tags = {
    Name = "private-subnet-${count.index + 1}"
    Tier = "private"
  }
}

resource "aws_internet_gateway" "shared" {
  vpc_id = aws_vpc.shared.id

  tags = {
    Name = "shared-igw"
  }
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.shared.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.shared.id
  }

  tags = {
    Name = "public-route-table"
  }
}

resource "aws_route_table_association" "public" {
  count = 3
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

# Outputs that other teams will consume
output "vpc_id" {
  description = "Shared VPC ID"
  value       = aws_vpc.shared.id
}

output "public_subnet_ids" {
  description = "List of public subnet IDs"
  value       = aws_subnet.public[*].id
}

output "private_subnet_ids" {
  description = "List of private subnet IDs"
  value       = aws_subnet.private[*].id
}

output "vpc_cidr" {
  description = "VPC CIDR block"
  value       = aws_vpc.shared.cidr_block
}

Code - Application Team (consumes shared network):

# app-team/main.tf
terraform {
  backend "s3" {
    bucket         = "shared-terraform-state"
    key            = "apps/webapp/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
  }
}

# Data source to read network team's state
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "shared-terraform-state"
    key    = "network/terraform.tfstate"
    region = "us-east-1"
  }
}
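To avoid repeating the long data-source path everywhere, the consumed outputs can be aliased once; a minimal sketch:

```hcl
locals {
  network = data.terraform_remote_state.network.outputs
}

# Then reference local.network.vpc_id, local.network.public_subnet_ids, etc.
```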

# Use data sources to query AWS
data "aws_security_groups" "default" {
  filter {
    name   = "group-name"
    values = ["default"]
  }

  filter {
    name   = "vpc-id"
    values = [data.terraform_remote_state.network.outputs.vpc_id]
  }
}

# Deploy application in existing network
module "web_app" {
  source = "./modules/web-app"

  app_name     = "customer-portal"
  environment  = "production"

  # Use shared network resources
  vpc_id      = data.terraform_remote_state.network.outputs.vpc_id
  subnet_ids  = data.terraform_remote_state.network.outputs.public_subnet_ids

  # Other configuration...
  instance_count = 3
  key_name       = aws_key_pair.app.key_name
}

# Create security group referencing shared VPC
resource "aws_security_group" "app" {
  name_prefix = "customer-portal-"
  vpc_id      = data.terraform_remote_state.network.outputs.vpc_id

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "customer-portal-sg"
  }
}

# Create database in private subnets
resource "aws_db_subnet_group" "app" {
  name       = "customer-portal-db-subnet-group"
  subnet_ids = data.terraform_remote_state.network.outputs.private_subnet_ids

  tags = {
    Name = "Customer Portal DB Subnet Group"
  }
}

resource "aws_db_instance" "app" {
  identifier          = "customer-portal-prod"
  engine              = "postgres"
  engine_version      = "15.3"
  instance_class      = "db.t3.medium"
  allocated_storage   = 100
  db_subnet_group_name = aws_db_subnet_group.app.name
  vpc_security_group_ids = [aws_security_group.app.id]

  # Database credentials (use secrets in production!)
  db_name  = "customerportal"
  username = "dbadmin"
  password = var.db_password

  backup_retention_period = 7
  backup_window          = "03:00-04:00"

  tags = {
    Name        = "customer-portal-db"
    Environment = "production"
  }
}


Scenario 8: Terraform Workspaces for Environments

Managing dev, staging, production with same configuration

sequenceDiagram
    participant Dev as Developer
    participant TF as Terraform Workspaces
    participant StateDev as State: dev
    participant StateStage as State: staging
    participant StateProd as State: prod
    participant AWS as AWS Resources

    Dev->>TF: workspace new dev
    TF->>StateDev: Create dev.tfstate

    Dev->>TF: workspace new staging
    TF->>StateStage: Create staging.tfstate

    Dev->>TF: workspace new prod
    TF->>StateProd: Create prod.tfstate

    Dev->>TF: workspace select dev
    TF->>StateDev: Switch context
    Dev->>AWS: Deploy dev resources

    Dev->>TF: workspace select prod
    TF->>StateProd: Switch context
    Dev->>AWS: Deploy prod resources

    Note over TF: Same code, isolated states

Code:

# main.tf
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "app/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
    # Non-default workspaces are stored under <workspace_key_prefix>/<workspace>/<key>
    workspace_key_prefix = "workspaces"
  }
}

provider "aws" {
  region = var.aws_region
}

# Tagging strategy based on workspace
locals {
  environment = terraform.workspace
  common_tags = {
    Environment = local.environment
    ManagedBy   = "terraform"
    Project     = var.project_name
  }
}

# Choose instance size based on environment
locals {
  instance_config = {
    dev = {
      type  = "t3.micro"
      count = 1
    }
    staging = {
      type  = "t3.small"
      count = 2
    }
    prod = {
      type  = "t3.medium"
      count = 3
    }
  }

  current_config = lookup(local.instance_config, local.environment, local.instance_config["dev"])
}

resource "aws_instance" "web" {
  count = local.current_config.count

  ami           = data.aws_ami.amazon_linux.id
  instance_type = local.current_config.type

  tags = merge(local.common_tags, {
    Name = "${var.project_name}-${local.environment}-${count.index + 1}"
  })
}
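The `lookup()` fallback and `merge()` override semantics do the heavy lifting here: unknown workspaces get the dev sizing instead of failing the plan, and per-resource tags win over `common_tags`. A rough Python analogue of both functions (illustrative, not Terraform itself):

```python
instance_config = {
    "dev":     {"type": "t3.micro",  "count": 1},
    "staging": {"type": "t3.small",  "count": 2},
    "prod":    {"type": "t3.medium", "count": 3},
}

def lookup(mapping, key, default):
    """HCL's lookup(map, key, default) behaves like dict.get."""
    return mapping.get(key, default)

def merge(*maps):
    """HCL's merge(): later maps win on key collisions."""
    result = {}
    for m in maps:
        result.update(m)
    return result

# An ad-hoc workspace such as "feature-x" falls back to dev sizing
assert lookup(instance_config, "feature-x", instance_config["dev"])["type"] == "t3.micro"

common_tags = {"Environment": "dev", "ManagedBy": "terraform"}
tags = merge(common_tags, {"Name": "myapp-dev-1"})
assert tags == {"Environment": "dev", "ManagedBy": "terraform", "Name": "myapp-dev-1"}
```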

# Different CIDR blocks per environment
variable "vpc_cidrs" {
  type = map(string)
  default = {
    dev     = "10.0.0.0/16"
    staging = "10.1.0.0/16"
    prod    = "10.2.0.0/16"
  }
}

resource "aws_vpc" "main" {
  cidr_block = lookup(var.vpc_cidrs, local.environment, "10.0.0.0/16")

  tags = merge(local.common_tags, {
    Name = "${var.project_name}-vpc"
  })
}

# Environment-specific cost allocations
variable "cost_centers" {
  type = map(string)
  default = {
    dev     = "dev-team"
    staging = "qa-team"
    prod    = "production"
  }
}

resource "aws_ec2_tag" "cost_allocation" {
  resource_id = aws_vpc.main.id
  key         = "CostCenter"
  value       = lookup(var.cost_centers, local.environment, "unknown")
}

# Read another configuration's state for the matching workspace
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "my-terraform-state"
    key    = "network/terraform.tfstate"
    region = "us-east-1"
  }

  workspace = local.environment
}

Workspace commands:

# List available workspaces
terraform workspace list
# * default

# Create new workspace
terraform workspace new dev
# Created and switched to workspace "dev"!

# Create staging workspace
terraform workspace new staging

# Create production workspace
terraform workspace new prod

# List again
terraform workspace list
#   default
# * dev
#   prod
#   staging

# Switch workspace
terraform workspace select prod
# Switched to workspace "prod".

# Show current workspace
terraform workspace show
# prod

# Plan for dev
terraform workspace select dev
terraform plan -var="project_name=myapp"

# Plan for prod (different resources)
terraform workspace select prod
terraform plan -var="project_name=myapp"

# State files in S3 will be:
# s3://my-terraform-state/workspaces/dev/app/terraform.tfstate
# s3://my-terraform-state/workspaces/prod/app/terraform.tfstate

# Delete a workspace (must be empty first)
terraform workspace select default
terraform workspace delete dev

# Use in automation
if [ "${CI_ENVIRONMENT_NAME}" = "production" ]; then
  terraform workspace select prod
  terraform apply -auto-approve
else
  terraform workspace select dev
  terraform apply -auto-approve
fi
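The CI branch above can be factored into a small function so the environment-to-workspace mapping lives in one place. A sketch (the environment names are illustrative):

```shell
#!/bin/sh

# Map a CI environment name to a terraform workspace; default to dev.
pick_workspace() {
  case "$1" in
    production) echo "prod" ;;
    staging)    echo "staging" ;;
    *)          echo "dev" ;;
  esac
}

# Usage in a pipeline:
#   terraform workspace select "$(pick_workspace "$CI_ENVIRONMENT_NAME")"
pick_workspace production   # prints: prod
pick_workspace review-123   # prints: dev
```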


ADVANCED LEVEL: Production-Ready Patterns

Scenario 9: Multi-Tier Architecture with Load Balancer

Complete production stack: ALB, ASG, RDS, ElastiCache

sequenceDiagram
    participant Client as User
    participant ALB as Application LB
    participant ASG as Auto Scaling Group
    participant EC2 as EC2 Instances
    participant RDS as RDS Database
    participant Cache as ElastiCache
    participant S3 as S3 Bucket
    participant TF as Terraform

    Client->>ALB: HTTPS request
    ALB->>ASG: Distribute traffic
    ASG->>EC2: Launch 2-10 instances
    EC2->>RDS: Query database
    EC2->>Cache: Cache session data
    EC2->>S3: Store uploads

    TF->>ALB: Configure listener rules
    TF->>ASG: Set scaling policies
    TF->>RDS: Create Postgres cluster
    TF->>Cache: Create Redis cluster

    Note over ASG: Health checks & auto-healing

Code:

# main.tf - Complete multi-tier architecture

terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }

  backend "s3" {
    bucket         = "prod-terraform-state"
    key            = "advanced/multi-tier/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "prod-terraform-locks"
    encrypt        = true
  }
}

provider "aws" {
  region = var.aws_region

  default_tags {
    tags = {
      Project     = var.project_name
      Environment = var.environment
      ManagedBy   = "terraform"
    }
  }
}

# Variables
variable "project_name" {
  type    = string
  default = "multi-tier-app"
}

variable "environment" {
  type    = string
  default = "production"
}

variable "aws_region" {
  type    = string
  default = "us-east-1"
}

variable "app_version" {
  type    = string
  default = "v2.5.0"
}

# Data sources
data "aws_availability_zones" "available" {
  state = "available"
}

data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]
  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
}

data "aws_elb_service_account" "main" {}

data "aws_caller_identity" "current" {}

# VPC and networking
module "vpc" {
  source = "terraform-aws-modules/vpc/aws"
  version = "5.0.0"

  name = "${var.project_name}-vpc"
  cidr = "10.0.0.0/16"

  azs             = slice(data.aws_availability_zones.available.names, 0, 3)
  public_subnets  = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  private_subnets = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]
  database_subnets = ["10.0.201.0/24", "10.0.202.0/24"]

  enable_nat_gateway = true
  single_nat_gateway = false
  enable_vpn_gateway = true

  tags = {
    "kubernetes.io/cluster/${var.project_name}-eks" = "shared"
  }

  public_subnet_tags = {
    "kubernetes.io/role/elb" = "1"
  }

  private_subnet_tags = {
    "kubernetes.io/role/internal-elb" = "1"
  }
}

# S3 bucket for assets
resource "aws_s3_bucket" "assets" {
  bucket = "${var.project_name}-assets-${data.aws_caller_identity.current.account_id}"
}

resource "aws_s3_bucket_versioning" "assets" {
  bucket = aws_s3_bucket.assets.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_public_access_block" "assets" {
  bucket = aws_s3_bucket.assets.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# S3 bucket for logs
resource "aws_s3_bucket" "logs" {
  bucket = "${var.project_name}-logs-${data.aws_caller_identity.current.account_id}"
}

resource "aws_s3_bucket_policy" "logs" {
  bucket = aws_s3_bucket.logs.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid       = "ELBWriteAccess"
        Effect    = "Allow"
        Principal = {
          AWS = data.aws_elb_service_account.main.arn
        }
        Action    = "s3:PutObject"
        Resource  = "${aws_s3_bucket.logs.arn}/logs/alb/*"
      },
    ]
  })
}

# Application Load Balancer
resource "aws_lb" "main" {
  name               = "${var.project_name}-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = module.vpc.public_subnets

  enable_deletion_protection = false

  access_logs {
    bucket  = aws_s3_bucket.logs.bucket
    prefix  = "logs/alb"
    enabled = true
  }

  tags = {
    Name = "${var.project_name}-alb"
  }
}

resource "aws_lb_target_group" "app" {
  name     = "${var.project_name}-tg"
  port     = 80
  protocol = "HTTP"
  vpc_id   = module.vpc.vpc_id
  target_type = "instance"

  health_check {
    enabled             = true
    healthy_threshold   = 3
    unhealthy_threshold = 3
    timeout             = 5
    interval            = 30
    path                = "/health"
    matcher             = "200-299"
  }

  stickiness {
    type = "lb_cookie"
    cookie_duration = 86400
  }
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.main.arn
  port              = "443"
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-2016-08"
  # Reference the validation resource so the listener waits for DNS validation
  certificate_arn   = aws_acm_certificate_validation.main.certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.app.arn
  }
}

resource "aws_lb_listener" "http_redirect" {
  load_balancer_arn = aws_lb.main.arn
  port              = "80"
  protocol          = "HTTP"

  default_action {
    type = "redirect"

    redirect {
      port        = "443"
      protocol    = "HTTPS"
      status_code = "HTTP_301"
    }
  }
}

# ACM Certificate
resource "aws_acm_certificate" "main" {
  domain_name       = "*.example.com"
  validation_method = "DNS"

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_route53_record" "cert_validation" {
  for_each = {
    for dvo in aws_acm_certificate.main.domain_validation_options : dvo.domain_name => {
      name   = dvo.resource_record_name
      record = dvo.resource_record_value
      type   = dvo.resource_record_type
    }
  }

  allow_overwrite = true
  name            = each.value.name
  records         = [each.value.record]
  ttl             = 60
  type            = each.value.type
  zone_id         = data.aws_route53_zone.main.zone_id
}

resource "aws_acm_certificate_validation" "main" {
  certificate_arn = aws_acm_certificate.main.arn

  validation_record_fqdns = [for record in aws_route53_record.cert_validation : record.fqdn]
}
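The `for_each` expression above turns a list of validation-option objects into a map keyed by domain name, which gives each Route53 record a stable address across plans. The same reshaping in Python (dummy stand-in data, not real ACM output):

```python
# Dummy stand-ins for aws_acm_certificate.main.domain_validation_options
domain_validation_options = [
    {"domain_name": "example.com", "resource_record_name": "_abc.example.com.",
     "resource_record_value": "_abc.acm-validations.aws.", "resource_record_type": "CNAME"},
    {"domain_name": "*.example.com", "resource_record_name": "_def.example.com.",
     "resource_record_value": "_def.acm-validations.aws.", "resource_record_type": "CNAME"},
]

# Mirrors: { for dvo in ... : dvo.domain_name => { name = ..., record = ..., type = ... } }
records = {
    dvo["domain_name"]: {
        "name":   dvo["resource_record_name"],
        "record": dvo["resource_record_value"],
        "type":   dvo["resource_record_type"],
    }
    for dvo in domain_validation_options
}

assert set(records) == {"example.com", "*.example.com"}
assert records["example.com"]["type"] == "CNAME"
```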

# Launch Template for Auto Scaling
resource "aws_launch_template" "app" {
  name_prefix   = "${var.project_name}-"
  image_id      = data.aws_ami.amazon_linux.id
  instance_type = var.instance_type
  key_name      = aws_key_pair.app.key_name

  iam_instance_profile {
    name = aws_iam_instance_profile.app.name
  }

  network_interfaces {
    security_groups = [aws_security_group.app.id]
    # Subnets come from the ASG's vpc_zone_identifier, not the launch template
  }

  block_device_mappings {
    device_name = "/dev/xvda"

    ebs {
      volume_size           = 20
      volume_type           = "gp3"
      encrypted             = true
      delete_on_termination = true
    }
  }

  # Template vars must cover everything user_data.sh references.
  # Passing secrets via user_data is for demo only; prefer Secrets Manager/SSM.
  user_data = base64encode(templatefile("${path.module}/user_data.sh", {
    app_version = var.app_version
    log_group   = aws_cloudwatch_log_group.app.name
    region      = var.aws_region
    db_host     = aws_db_instance.app.address
    db_user     = var.db_username
    db_pass     = var.db_password
    redis_host  = aws_elasticache_cluster.app.cache_nodes[0].address
  }))

  tag_specifications {
    resource_type = "instance"
    tags = {
      Name = var.project_name
    }
  }

  lifecycle {
    create_before_destroy = true
  }
}

# Auto Scaling Group
resource "aws_autoscaling_group" "app" {
  name                = "${var.project_name}-asg"
  vpc_zone_identifier = module.vpc.public_subnets
  target_group_arns   = [aws_lb_target_group.app.arn]
  health_check_type   = "ELB"
  min_size            = 2
  max_size            = 10
  desired_capacity    = 3

  launch_template {
    id      = aws_launch_template.app.id
    version = "$Latest"
  }

  termination_policies = ["OldestLaunchTemplate", "ClosestToNextInstanceHour"]

  tag {
    key                 = "Name"
    value               = var.project_name
    propagate_at_launch = true
  }

  lifecycle {
    create_before_destroy = true
  }
}

# Auto Scaling Policy
# A single target-tracking policy scales in BOTH directions: AWS creates the
# scale-out and scale-in CloudWatch alarms automatically. Two target-tracking
# policies on the same metric with different targets would conflict.
resource "aws_autoscaling_policy" "cpu_target" {
  name                   = "${var.project_name}-cpu-target"
  autoscaling_group_name = aws_autoscaling_group.app.name

  policy_type = "TargetTrackingScaling"
  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }
    target_value = 60.0
  }
}

# RDS PostgreSQL Database
resource "aws_db_subnet_group" "app" {
  name       = "${var.project_name}-db-subnet-group"
  subnet_ids = module.vpc.database_subnets

  tags = {
    Name = "${var.project_name}-db-subnet-group"
  }
}

resource "aws_db_parameter_group" "app" {
  name   = "${var.project_name}-db-params"
  family = "postgres15"

  parameter {
    name  = "log_connections"
    value = "1"
  }

  parameter {
    name  = "log_disconnections"
    value = "1"
  }

  parameter {
    name  = "log_duration"
    value = "1"
  }
}

resource "aws_db_instance" "app" {
  identifier             = "${var.project_name}-db"
  engine                 = "postgres"
  engine_version         = "15.3"
  instance_class         = "db.t3.medium"
  allocated_storage      = 100
  storage_type           = "gp3"
  storage_encrypted      = true

  db_name  = var.db_name
  username = var.db_username
  password = var.db_password

  db_subnet_group_name   = aws_db_subnet_group.app.name
  vpc_security_group_ids = [aws_security_group.rds.id]
  parameter_group_name   = aws_db_parameter_group.app.name

  backup_retention_period = 7
  backup_window          = "03:00-04:00"
  maintenance_window     = "sun:04:00-sun:05:00"

  skip_final_snapshot       = false
  # Avoid timestamp() here: it changes on every plan and forces a perpetual diff
  final_snapshot_identifier = "${var.project_name}-db-final"

  tags = {
    Name = "${var.project_name}-db"
  }
}

# ElastiCache Redis
resource "aws_elasticache_subnet_group" "app" {
  name       = "${var.project_name}-cache-subnet-group"
  subnet_ids = module.vpc.private_subnets
}

resource "aws_elasticache_cluster" "app" {
  cluster_id           = "${var.project_name}-cache"
  engine               = "redis"
  node_type            = "cache.t3.medium"
  num_cache_nodes      = 1
  parameter_group_name = "default.redis7"
  engine_version       = "7.0"
  port                 = 6379

  subnet_group_name = aws_elasticache_subnet_group.app.name
  security_group_ids = [aws_security_group.cache.id]

  tags = {
    Name = "${var.project_name}-cache"
  }
}

# CloudWatch Log Group
resource "aws_cloudwatch_log_group" "app" {
  name              = "/${var.project_name}/app"
  retention_in_days = 30

  tags = {
    Name = "${var.project_name}-logs"
  }
}

# IAM Role for instances
resource "aws_iam_role" "app" {
  name = "${var.project_name}-instance-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "ec2.amazonaws.com"
        }
      }
    ]
  })
}

resource "aws_iam_role_policy" "app" {
  name = "${var.project_name}-instance-policy"
  role = aws_iam_role.app.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = [
          "logs:CreateLogStream",
          "logs:PutLogEvents",
          "logs:CreateLogGroup"
        ]
        Resource = "arn:aws:logs:*:*:*"
      },
      {
        Effect   = "Allow"
        Action   = [
          "s3:GetObject",
          "s3:PutObject"
        ]
        Resource = [
          "${aws_s3_bucket.assets.arn}/*",
          "${aws_s3_bucket.logs.arn}/*"
        ]
      }
    ]
  })
}

resource "aws_iam_instance_profile" "app" {
  name = "${var.project_name}-instance-profile"
  role = aws_iam_role.app.name
}

# Security Groups
resource "aws_security_group" "alb" {
  name        = "${var.project_name}-alb-sg"
  description = "ALB Security Group"
  vpc_id      = module.vpc.vpc_id

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_security_group" "app" {
  name        = "${var.project_name}-app-sg"
  description = "Application Security Group"
  vpc_id      = module.vpc.vpc_id

  ingress {
    from_port       = 80
    to_port         = 80
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_security_group" "rds" {
  name        = "${var.project_name}-rds-sg"
  description = "RDS Security Group"
  vpc_id      = module.vpc.vpc_id

  ingress {
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.app.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_security_group" "cache" {
  name        = "${var.project_name}-cache-sg"
  description = "ElastiCache Security Group"
  vpc_id      = module.vpc.vpc_id

  ingress {
    from_port       = 6379
    to_port         = 6379
    protocol        = "tcp"
    security_groups = [aws_security_group.app.id]
  }
}

# TLS Private Key
resource "tls_private_key" "app" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

resource "aws_key_pair" "app" {
  key_name   = "${var.project_name}-${var.environment}-key"
  public_key = tls_private_key.app.public_key_openssh
}

resource "local_file" "private_key" {
  content         = tls_private_key.app.private_key_pem
  filename        = "${path.module}/${aws_key_pair.app.key_name}.pem"
  file_permission = "0600"
}

# Route53 Zone (assumes zone exists)
data "aws_route53_zone" "main" {
  name         = var.domain_name
  private_zone = false
}

# CloudWatch Alarms
resource "aws_cloudwatch_metric_alarm" "cpu_high" {
  alarm_name          = "${var.project_name}-cpu-high"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = "2"
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = "300"
  statistic           = "Average"
  threshold           = "80"
  alarm_description   = "CPU utilization is high"

  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.app.name
  }

  alarm_actions = [aws_sns_topic.alerts.arn]
}

resource "aws_sns_topic" "alerts" {
  name = "${var.project_name}-alerts"
}

# Variables
variable "domain_name" {
  type    = string
  default = "example.com"
}

variable "instance_type" {
  type    = string
  default = "t3.micro"
}

variable "db_name" {
  type    = string
  default = "myappdb"
}

variable "db_username" {
  type      = string
  sensitive = true
}

variable "db_password" {
  type      = string
  sensitive = true
}

user_data.sh

#!/bin/bash -xe

# Install Docker and the CloudWatch Logs agent
amazon-linux-extras install -y docker
systemctl start docker
systemctl enable docker
yum install -y awslogs

# Configure CloudWatch
cat > /etc/awslogs/awslogs.conf <<EOF
[general]
state_file = /var/lib/awslogs/agent-state

[/var/log/messages]
file = /var/log/messages
log_group_name = ${log_group}
log_stream_name = {instance_id}/var/log/messages

[/var/log/docker]
file = /var/log/docker
log_group_name = ${log_group}
log_stream_name = {instance_id}/var/log/docker
EOF

# Point the agent at the right region (interpolated by templatefile)
sed -i "s/region = .*/region = ${region}/" /etc/awslogs/awscli.conf

systemctl start awslogsd
systemctl enable awslogsd

# Start application
docker run -d \
  --name app \
  -p 80:8080 \
  -e DB_HOST=${db_host} \
  -e DB_USER=${db_user} \
  -e DB_PASS=${db_pass} \
  -e REDIS_HOST=${redis_host} \
  -e VERSION=${app_version} \
  myorg/app:${app_version}

Execution:

# Initialize with module dependencies
terraform init

# Set database credentials
export TF_VAR_db_username="admin"
export TF_VAR_db_password="VeryStrongPassword123!"

# Plan infrastructure (review costs)
terraform plan

# Apply with approval
terraform apply

# After completion, test:
# 1. Access the ALB DNS name: https://multi-tier-app-alb-123456789.us-east-1.elb.amazonaws.com
#    (expect a certificate warning until DNS under example.com points at the ALB)
# 2. Check health status: /health
# 3. SSH to instance: ssh -i multi-tier-app-production-key.pem ec2-user@<instance-ip>
# 4. Check logs: docker logs app
# 5. Database connectivity: psql -h <db-endpoint>

# Simulate high CPU to test autoscaling
# stress-ng --cpu 4 --timeout 600

# Monitor in AWS Console:
# - CloudWatch metrics
# - ALB target health
# - RDS performance
# - ElastiCache metrics

# Costs: ~$200/month (t3.medium DB, t3.micro instances, ALB)

# Destroy when done
terraform destroy


Scenario 10: Custom Provider Development

Extending Terraform with a custom provider

sequenceDiagram
    participant Dev as Provider Developer
    participant SDK as Terraform Plugin SDK
    participant API as Custom API
    participant Schema as Provider Schema
    participant Resource as Resource Implementation
    participant TF as Terraform CLI
    participant User as End User

    Dev->>SDK: Implement Provider interface
    Dev->>SDK: Define resource schema
    SDK->>Schema: Generate CRUD callbacks

    User->>TF: terraform init
    TF->>Dev: Download custom provider
    User->>TF: terraform apply

    TF->>Resource: Call Create()
    Resource->>API: POST /api/resource
    API->>Resource: Return resource data
    Resource->>TF: Set resource ID

    TF->>User: Resource created successfully

    Note over Dev: Go programming required

Code - Custom Provider Skeleton:

// main.go - Entry point for custom provider
package main

import (
    "context"
    "fmt"

    "github.com/hashicorp/terraform-plugin-sdk/v2/diag"
    "github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
    "github.com/hashicorp/terraform-plugin-sdk/v2/plugin"
)

func main() {
    plugin.Serve(&plugin.ServeOpts{
        ProviderFunc: func() *schema.Provider {
            return &schema.Provider{
                Schema: map[string]*schema.Schema{
                    "api_key": {
                        Type:        schema.TypeString,
                        Required:    true,
                        DefaultFunc: schema.EnvDefaultFunc("CUSTOM_API_KEY", nil),
                        Description: "API key for the custom service",
                    },
                    "base_url": {
                        Type:        schema.TypeString,
                        Optional:    true,
                        Default:     "https://api.custom-service.com",
                        Description: "Base URL for the API",
                    },
                },
                ResourcesMap: map[string]*schema.Resource{
                    "custom_server":   resourceCustomServer(),
                    "custom_database": resourceCustomDatabase(),
                },
                DataSourcesMap: map[string]*schema.Resource{
                    "custom_region": dataSourceCustomRegion(),
                },
                ConfigureContextFunc: providerConfigure,
            }
        },
    })
}

func providerConfigure(ctx context.Context, d *schema.ResourceData) (interface{}, diag.Diagnostics) {
    apiKey := d.Get("api_key").(string)
    baseURL := d.Get("base_url").(string)

    // Initialize API client
    client := NewClient(apiKey, baseURL)

    return client, nil
}

// resource_custom_server.go
func resourceCustomServer() *schema.Resource {
    return &schema.Resource{
        CreateContext: resourceCustomServerCreate,
        ReadContext:   resourceCustomServerRead,
        UpdateContext: resourceCustomServerUpdate,
        DeleteContext: resourceCustomServerDelete,
        Schema: map[string]*schema.Schema{
            "name": {
                Type:     schema.TypeString,
                Required: true,
                ForceNew: true,
            },
            "region": {
                Type:     schema.TypeString,
                Required: true,
            },
            "size": {
                Type:     schema.TypeString,
                Optional: true,
                Default:  "small",
            },
            "status": {
                Type:     schema.TypeString,
                Computed: true,
            },
            "ip_address": {
                Type:     schema.TypeString,
                Computed: true,
            },
            "metadata": {
                Type:     schema.TypeMap,
                Optional: true,
                Elem:     &schema.Schema{Type: schema.TypeString},
            },
        },
    }
}

func resourceCustomServerCreate(ctx context.Context, d *schema.ResourceData, meta interface{}) diag.Diagnostics {
    client := meta.(*Client)
    name := d.Get("name").(string)

    // Create API request
    server := &Server{
        Name:   name,
        Region: d.Get("region").(string),
        Size:   d.Get("size").(string),
        Metadata: expandMap(d.Get("metadata")),
    }

    // Call API
    created, err := client.CreateServer(ctx, server)
    if err != nil {
        return diag.FromErr(fmt.Errorf("error creating server %s: %w", name, err))
    }

    // Set resource ID
    d.SetId(created.ID)

    // Wait for server to be ready
    if err := waitForServerReady(ctx, client, created.ID); err != nil {
        return diag.FromErr(err)
    }

    return resourceCustomServerRead(ctx, d, meta)
}

func resourceCustomServerRead(ctx context.Context, d *schema.ResourceData, meta interface{}) diag.Diagnostics {
    client := meta.(*Client)

    // Get server from API
    server, err := client.GetServer(ctx, d.Id())
    if err != nil {
        if isNotFoundError(err) {
            d.SetId("")
            return nil
        }
        return diag.FromErr(err)
    }

    // Update state
    d.Set("name", server.Name)
    d.Set("region", server.Region)
    d.Set("size", server.Size)
    d.Set("status", server.Status)
    d.Set("ip_address", server.IPAddress)
    d.Set("metadata", server.Metadata)

    return nil
}

// client.go - API client implementation
package main

import (
    "bytes"
    "context"
    "encoding/json"
    "fmt"
    "net/http"
)

type Client struct {
    apiKey  string
    baseURL string
    httpClient *http.Client
}

type Server struct {
    ID        string            `json:"id"`
    Name      string            `json:"name"`
    Region    string            `json:"region"`
    Size      string            `json:"size"`
    Status    string            `json:"status"`
    IPAddress string            `json:"ip_address"`
    Metadata  map[string]string `json:"metadata"`
}

func NewClient(apiKey, baseURL string) *Client {
    return &Client{
        apiKey:  apiKey,
        baseURL: baseURL,
        httpClient: &http.Client{},
    }
}

func (c *Client) CreateServer(ctx context.Context, server *Server) (*Server, error) {
    body, err := json.Marshal(server)
    if err != nil {
        return nil, err
    }

    req, err := http.NewRequestWithContext(ctx, "POST", 
        fmt.Sprintf("%s/api/v1/servers", c.baseURL), bytes.NewReader(body))
    if err != nil {
        return nil, err
    }

    req.Header.Set("Authorization", fmt.Sprintf("Bearer %s", c.apiKey))
    req.Header.Set("Content-Type", "application/json")

    resp, err := c.httpClient.Do(req)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()

    if resp.StatusCode != http.StatusCreated {
        return nil, fmt.Errorf("unexpected status code: %d", resp.StatusCode)
    }

    var created Server
    if err := json.NewDecoder(resp.Body).Decode(&created); err != nil {
        return nil, err
    }

    return &created, nil
}

// utils.go - Helper functions (the polling loop below also needs `import "time"`)
func expandMap(v interface{}) map[string]string {
    if v == nil {
        return nil
    }

    result := make(map[string]string)
    for key, value := range v.(map[string]interface{}) {
        result[key] = value.(string)
    }
    return result
}

func waitForServerReady(ctx context.Context, client *Client, serverID string) error {
    for {
        server, err := client.GetServer(ctx, serverID)
        if err != nil {
            return err
        }

        if server.Status == "running" {
            return nil
        }

        select {
        case <-ctx.Done():
            return ctx.Err()
        case <-time.After(10 * time.Second):
            // Continue polling
        }
    }
}

Build and Install Custom Provider:

# 1. Build the provider
cd terraform-provider-custom
go mod init terraform-provider-custom
go mod tidy
go build -o terraform-provider-custom

# 2. Install in Terraform plugins directory
mkdir -p ~/.terraform.d/plugins/registry.terraform.io/myorg/custom/1.0.0/linux_amd64/
mv terraform-provider-custom ~/.terraform.d/plugins/registry.terraform.io/myorg/custom/1.0.0/linux_amd64/

# 3. Use in Terraform configuration
cat > main.tf <<'EOF'
terraform {
  required_providers {
    custom = {
      source  = "myorg/custom"
      version = "1.0.0"
    }
  }
}

provider "custom" {
  api_key = var.custom_api_key
}

resource "custom_server" "web" {
  name   = "web-server-1"
  region = "us-east-1"
  size   = "medium"

  metadata = {
    owner = "devops-team"
    cost-center = "engineering"
  }
}

output "server_ip" {
  value = custom_server.web.ip_address
}
EOF

# 4. Initialize and use
terraform init
terraform apply
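For iterative development, newer Terraform versions offer a lighter workflow than copying the binary on every build: CLI dev overrides in `~/.terraformrc` point a provider source at a local build directory (the path below is illustrative):

```hcl
# ~/.terraformrc
provider_installation {
  dev_overrides {
    "myorg/custom" = "/home/dev/terraform-provider-custom"
  }

  # Fall back to normal installation for every other provider
  direct {}
}
```

With an override active, Terraform prints a warning on every run and skips the dependency lock file for that provider, so rebuilding the binary takes effect immediately without re-running terraform init.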


Scenario 11: Terraform Cloud/Enterprise Workflows

Collaborative infrastructure with VCS integration

sequenceDiagram
    participant Dev as Developer
    participant Git as GitHub
    participant TFC as Terraform Cloud
    participant Agent as TFC Agent
    participant AWS as AWS

    Dev->>Git: Push to feature branch
    Git->>TFC: Webhook triggers run
    TFC->>Agent: Queue plan

    Agent->>Git: Fetch configuration
    Agent->>AWS: terraform plan
    AWS->>Agent: Return plan details
    Agent->>TFC: Post plan results

    TFC->>Dev: Show plan in UI
    Dev->>TFC: Approve plan

    TFC->>Agent: Queue apply
    Agent->>AWS: terraform apply
    AWS->>Agent: Provision resources
    Agent->>TFC: Report completion

    TFC->>Git: Update commit status

    Note over TFC: Remote execution, audit logs

Code - Terraform Cloud Configuration:

# terraform.tf - Configure Terraform Cloud backend

terraform {
  cloud {
    organization = "my-org"

    workspaces {
      name = "app-prod"
      # Or use tags: tags = ["production", "aws"]
    }

    # Optional: Specify agent pool
    # agent_pool_id = "pool-12345"
  }

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

# Configure provider with variables from Terraform Cloud
provider "aws" {
  region     = var.aws_region
  access_key = var.aws_access_key
  secret_key = var.aws_secret_key

  default_tags {
    tags = {
      TerraformCloudWorkspace = terraform.workspace
      CostCenter             = var.cost_center
    }
  }
}

# Sentinel policies (Terraform Cloud)
# policies/sentinel/enforce-mandatory-labels.sentinel
import "tfplan/v2" as tfplan

mandatory_tags = ["Environment", "CostCenter", "ManagedBy"]

# All managed aws_instance resources in this plan
instances = filter tfplan.resource_changes as _, rc {
	rc.mode is "managed" and rc.type is "aws_instance"
}

# A resource passes when every mandatory tag is present
has_mandatory_tags = func(rc) {
	tags = rc.change.after.tags else {}
	return all mandatory_tags as t { t in keys(tags) }
}

main = rule {
	all instances as _, rc { has_mandatory_tags(rc) }
}

# cost-policy.sentinel
# Cost estimates come from the tfrun import, which exposes the
# plan-wide estimate when cost estimation is enabled on the workspace
import "tfrun"
import "decimal"

total_monthly_cost = decimal.new(tfrun.cost_estimate.proposed_monthly_cost)

max_budget = decimal.new(1000)  # $1000/month

main = rule {
	total_monthly_cost.less_than_or_equals(max_budget)
}
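
Individual policies are attached to a policy set through a sentinel.hcl file that names each policy and its enforcement level (a sketch assuming the file layout above):

```hcl
# policies/sentinel/sentinel.hcl
policy "enforce-mandatory-labels" {
  source            = "./enforce-mandatory-labels.sentinel"
  enforcement_level = "hard-mandatory" # failure blocks the run
}

policy "cost-policy" {
  source            = "./cost-policy.sentinel"
  enforcement_level = "soft-mandatory" # overridable by an authorized user
}
```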

Workspace Configuration (.terraformrc):

# ~/.terraformrc
credentials "app.terraform.io" {
  token = "YOUR_TERRAFORM_CLOUD_TOKEN"
}

# Agent configuration (for private infrastructure)
# tfc-agent is configured through environment variables:
export TFC_AGENT_TOKEN="AGENT_TOKEN_FROM_TFC"
export TFC_AGENT_NAME="agent-01"
export TFC_AGENT_LOG_LEVEL="info"
# Auto-update policy: "minor", "patch", or "disabled"
export TFC_AGENT_AUTO_UPDATE="minor"
# Each agent process runs one job at a time; start more
# processes to increase concurrency
tfc-agent

GitHub Actions Integration:

# .github/workflows/terraform.yml
name: Terraform Cloud Run

on:
  push:
    branches:
      - main
  pull_request:

jobs:
  terraform:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v2
        with:
          terraform_version: 1.6.6
          cli_config_credentials_token: ${{ secrets.TF_API_TOKEN }}

      - name: Terraform Format Check
        id: fmt
        run: terraform fmt -check -recursive

      - name: Terraform Init
        id: init
        run: terraform init

      - name: Terraform Validate
        id: validate
        run: terraform validate

      - name: Create Plan Run
        id: plan
        if: github.event_name == 'pull_request'
        run: |
          terraform plan -no-color -out=tfplan \
            -var="db_password=${{ secrets.DB_PASSWORD }}"

      - name: Apply on Merge
        if: github.ref == 'refs/heads/main' && github.event_name == 'push'
        run: |
          terraform apply -auto-approve \
            -var="db_password=${{ secrets.DB_PASSWORD }}"

      - name: Comment PR
        uses: actions/github-script@v6
        if: github.event_name == 'pull_request'
        with:
          script: |
            const output = `#### Terraform Format and Style 🖌\`${{ steps.fmt.outcome }}\`
            #### Terraform Initialization ⚙️\`${{ steps.init.outcome }}\`
            #### Terraform Validation 🤖\`${{ steps.validate.outcome }}\`
            #### Terraform Plan 📖\`${{ steps.plan.outcome }}\`
            <details><summary>Show Plan</summary>
            \`\`\`terraform\n${{ steps.plan.outputs.stdout }}\n\`\`\`
            </details>
            *Pushed by: @${{ github.actor }}, Action: \`${{ github.event_name }}\`*`;

            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: output
            })

Terraform Cloud API Usage:

# Trigger run via API
curl \
  --header "Authorization: Bearer $TF_API_TOKEN" \
  --header "Content-Type: application/vnd.api+json" \
  --request POST \
  --data @- \
  https://app.terraform.io/api/v2/workspaces/ws-123456/runs <<'EOF'
{
  "data": {
    "type": "runs",
    "attributes": {
      "message": "Triggered via API",
      "auto-apply": false
    }
  }
}
EOF

# Get run status
curl \
  --header "Authorization: Bearer $TF_API_TOKEN" \
  https://app.terraform.io/api/v2/runs/run-123456

# Download state
curl \
  --header "Authorization: Bearer $TF_API_TOKEN" \
  --header "Content-Type: application/vnd.api+json" \
  https://app.terraform.io/api/v2/workspaces/ws-123456/current-state-version
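
The same workspace operations can also be managed declaratively with HashiCorp's tfe provider instead of raw API calls (organization and workspace names here are placeholders):

```hcl
terraform {
  required_providers {
    tfe = {
      source  = "hashicorp/tfe"
      version = "~> 0.50"
    }
  }
}

provider "tfe" {
  # Reads TFE_TOKEN from the environment by default
}

resource "tfe_workspace" "app_prod" {
  name         = "app-prod"
  organization = "my-org"
  auto_apply   = false
  tag_names    = ["production", "aws"]
}
```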


Scenario 12: Testing and CI/CD Integration

Automated testing of Terraform configurations

sequenceDiagram
    participant Git as GitHub
    participant CI as CI Pipeline
    participant Lint as Terraform Lint
    participant Sec as Security Scan
    participant Cost as Cost Estimation
    participant Plan as Terraform Plan
    participant TFC as Terraform Cloud
    participant Apply as Terraform Apply

    Git->>CI: Pull request opened
    CI->>Lint: terraform fmt -check
    Lint->>CI: Pass/Fail

    CI->>Sec: tfsec + checkov
    Sec->>CI: Security report (PASS/WARN/FAIL)

    CI->>Cost: infracost breakdown
    Cost->>CI: Cost estimate ($150/month)

    CI->>Plan: terraform plan -out=tfplan
    Plan->>CI: Plan details (24 resources to add)

    CI->>Git: Post PR comment with results

    Git->>CI: PR approved & merged
    CI->>TFC: Trigger remote run

    TFC->>Apply: terraform apply tfplan
    Apply->>TFC: Apply complete

    TFC->>Git: Update commit status (success)

    Note over CI, TFC: Automated quality gates

Code: Testing and CI/CD Integration


# Security scanning with tfsec and Checkov
# .github/workflows/security-scan.yml
name: Security Scan

on:
  pull_request:
    paths:
      - '**/*.tf'

jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3

      - name: Run tfsec
        uses: tfsec/tfsec-sarif-action@master
        with:
          sarif_file: tfsec.sarif

      - name: Upload SARIF file
        uses: github/codeql-action/upload-sarif@v2
        with:
          sarif_file: tfsec.sarif

      - name: Run Checkov
        uses: bridgecrewio/checkov-action@master
        with:
          framework: terraform
          output_format: sarif
          output_file_path: checkov.sarif

      - name: Upload Checkov SARIF
        uses: github/codeql-action/upload-sarif@v2
        with:
          sarif_file: checkov.sarif

Terraform test framework (built in since Terraform 1.6; mock_provider requires 1.7+):

# tests/web_app.tftest.hcl
mock_provider "aws" {}

variables {
  environment = "test"
  region      = "us-east-1"
}

run "create_instance" {
  command = apply

  # Input variables for this test
  variables {
    instance_type = "t3.micro"
    instance_count = 1
  }

  # Assertions to verify behavior
  assert {
    condition     = length(aws_instance.web) == 1
    error_message = "Should create exactly 1 instance"
  }

  assert {
    condition     = aws_instance.web[0].instance_type == "t3.micro"
    error_message = "Instance should be t3.micro"
  }

  assert {
    condition     = can(regex(".+-test-.+", aws_instance.web[0].tags.Name))
    error_message = "Name tag should contain environment"
  }
}

run "enforce_security_group_rules" {
  command = apply

  variables {
    allowed_ports = [22, 443]
  }

  assert {
    condition     = length(aws_security_group.web.ingress) == 2
    error_message = "Security group should have 2 ingress rules"
  }

  assert {
    condition     = alltrue([
      for rule in aws_security_group.web.ingress : 
      contains([22, 443], rule.from_port)
    ])
    error_message = "Only ports 22 and 443 should be allowed"
  }
}

# Note: an explicit destroy run is not needed -- terraform test
# automatically destroys any real infrastructure created by apply
# runs once the test file finishes.
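
The framework can also assert that invalid input is rejected. An `expect_failures` run passes only if the listed checkable object fails; this sketch assumes a hypothetical validation block on `var.instance_type`:

```hcl
# tests/validation.tftest.hcl
run "reject_invalid_instance_type" {
  command = plan

  variables {
    instance_type = "m5.24xlarge"
  }

  # The test passes only if this variable's validation block fails
  expect_failures = [
    var.instance_type,
  ]
}
```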

Terratest example (Go-based testing):

// test/terraform_web_app_test.go
package test

import (
    "fmt"
    "strings"
    "testing"
    "time"

    "github.com/gruntwork-io/terratest/modules/aws"
    http_helper "github.com/gruntwork-io/terratest/modules/http-helper"
    "github.com/gruntwork-io/terratest/modules/terraform"
    "github.com/stretchr/testify/assert"
)

func TestTerraformWebApp(t *testing.T) {
    t.Parallel()

    // Configure Terraform options
    terraformOptions := &terraform.Options{
        TerraformDir: "../examples/web-app",
        Vars: map[string]interface{}{
            "environment":   "test",
            "instance_type": "t3.micro",
            "instance_count": 1,
        },
        EnvVars: map[string]string{
            "AWS_DEFAULT_REGION": "us-east-1",
        },
    }

    // Cleanup resources at end of test
    defer terraform.Destroy(t, terraformOptions)

    // Initialize and apply
    terraform.InitAndApply(t, terraformOptions)

    // Get outputs
    instanceID := terraform.Output(t, terraformOptions, "instance_id")
    publicIP := terraform.Output(t, terraformOptions, "public_ip")

    // Verify EC2 is running
    instance := aws.GetInstance(t, "us-east-1", instanceID)
    assert.Equal(t, "running", *instance.State.Name)

    // Verify HTTP endpoint returns 200
    url := fmt.Sprintf("http://%s", publicIP)
    http_helper.HttpGetWithRetryWithCustomValidation(
        t,
        url,
        nil,
        30,
        10*time.Second,
        func(statusCode int, body string) bool {
            return statusCode == 200 && strings.Contains(body, "Hello")
        },
    )

    // Verify tags
    tags := aws.GetTagsForEc2Instance(t, "us-east-1", instanceID)
    assert.Equal(t, "test", tags["Environment"])
}

// Test performance under load
func TestWebAppScaling(t *testing.T) {
    terraformOptions := &terraform.Options{
        TerraformDir: "../examples/web-app",
        Vars: map[string]interface{}{
            "environment":    "load-test",
            "instance_count": 5,
        },
    }

    defer terraform.Destroy(t, terraformOptions)
    terraform.InitAndApply(t, terraformOptions)

    // Run load test (runLoadTest is a project-specific helper, not shown)
    albDNS := terraform.Output(t, terraformOptions, "alb_dns_name")
    runLoadTest(t, albDNS, 1000, 30*time.Second)

    // Verify the ASG scaled up
    asgName := terraform.Output(t, terraformOptions, "autoscaling_group_name")
    capacityInfo := aws.GetCapacityInfoForAsg(t, asgName, "us-east-1")
    assert.GreaterOrEqual(t, capacityInfo.DesiredCapacity, int64(5))
}


Scenario 13: Secrets Management with Vault

Integrating HashiCorp Vault for dynamic secrets

sequenceDiagram
    participant TF as Terraform
    participant Vault as HashiCorp Vault
    participant AWS as AWS Resources
    participant App as Application
    participant Audit as Audit Log

    TF->>Vault: Request dynamic AWS credentials
    Vault->>AWS: Generate temporary IAM role
    Vault->>TF: Return short-lived credentials

    TF->>Vault: Request database credentials
    Vault->>AWS: Create temporary DB user
    Vault->>TF: Return rotating password

    TF->>AWS: Provision resources using Vault secrets
    AWS->>App: Pass secrets to application

    App->>Vault: Renew lease periodically
    Audit->>Vault: Log all secret access

    Note over Vault: Secrets automatically expire & rotate

Code:

# provider.tf
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    vault = {
      source  = "hashicorp/vault"
      version = "~> 3.20"
    }
  }
}

# Configure Vault provider
provider "vault" {
  address = "https://vault.example.com:8200"

  # Authenticate with a token...
  token = var.vault_token

  # ...or remove the token and use AWS IAM auth instead
  # (configure only one auth method):
  # auth_login {
  #   path = "auth/aws/login"
  #   parameters = {
  #     role = "terraform-deployer"
  #     jwt  = var.vault_jwt
  #   }
  # }
}

# Dynamic AWS credentials from Vault
data "vault_aws_access_credentials" "deploy" {
  backend = "aws"  # Mount path in Vault
  role    = "terraform-deploy-role"

  # Renew credentials before expiration
  renew = true
}

# Use dynamic credentials with AWS provider
provider "aws" {
  access_key = data.vault_aws_access_credentials.deploy.access_key
  secret_key = data.vault_aws_access_credentials.deploy.secret_key

  # STS token if using IAM roles
  token = data.vault_aws_access_credentials.deploy.security_token
  region = var.aws_region
}

# Database credentials
resource "vault_database_secret_backend_connection" "postgres" {
  backend       = "database"
  name          = "postgres-prod"
  allowed_roles = ["app-read", "app-write"]

  postgresql {
    connection_url = "postgres://${var.db_admin_user}:${var.db_admin_pass}@db.example.com:5432/prod"
  }
}

resource "vault_database_secret_backend_role" "app_read" {
  backend = vault_database_secret_backend_connection.postgres.backend
  name    = "app-read"
  db_name = vault_database_secret_backend_connection.postgres.name

  creation_statements = [
    "CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}';",
    "GRANT SELECT ON ALL TABLES IN SCHEMA public TO \"{{name}}\";"
  ]

  default_ttl = "1h"
  max_ttl     = "24h"
}

# Generate dynamic credentials in Terraform
data "vault_generic_secret" "db_creds" {
  path = "database/creds/app-read"
}

# Use credentials to provision database
resource "postgresql_database" "app" {
  name  = "customer_app"
  owner = data.vault_generic_secret.db_creds.data["username"]
}

# PostgreSQL provider configuration
provider "postgresql" {
  host     = "db.example.com"
  port     = 5432
  database = "postgres"
  username = data.vault_generic_secret.db_creds.data["username"]
  password = data.vault_generic_secret.db_creds.data["password"]

  # Superuser for schema changes
  superuser = false
}

# Kubernetes secrets integration
data "vault_kv_secret_v2" "app_config" {
  mount = "kv"
  name  = "apps/webapp/config"
}

resource "kubernetes_secret" "app" {
  metadata {
    name = "app-secret"
  }

  data = {
    db_host     = data.vault_kv_secret_v2.app_config.data["db_host"]
    db_password = data.vault_kv_secret_v2.app_config.data["db_password"]
    api_key     = data.vault_kv_secret_v2.app_config.data["api_key"]
  }

  # This secret will be created with values from Vault
}

# Certificate management
resource "vault_pki_secret_backend_role" "app_cert" {
  backend          = "pki"
  name             = "app.example.com"
  ttl              = "720h"
  max_ttl          = "8760h"
  allow_ip_sans    = true
  allowed_domains  = ["app.example.com"]
  allow_subdomains = true
}

# vault_pki_secret_backend_cert is a resource (not a data source):
# each issuance creates a new certificate tracked in state
resource "vault_pki_secret_backend_cert" "app" {
  backend     = vault_pki_secret_backend_role.app_cert.backend
  name        = vault_pki_secret_backend_role.app_cert.name
  common_name = "app.example.com"
  ttl         = "720h"
}

# Use Vault-managed certificate
resource "aws_acm_certificate" "app" {
  # Import the Vault-issued certificate instead of requesting one from ACM
  private_key       = vault_pki_secret_backend_cert.app.private_key
  certificate_body  = vault_pki_secret_backend_cert.app.certificate
  certificate_chain = vault_pki_secret_backend_cert.app.issuing_ca
}

# Encryption key
data "vault_transit_encrypt" "app" {
  backend = "transit"
  key     = "app-key"
  plaintext = base64encode(var.secret_data)
}

# Store encrypted value
resource "aws_ssm_parameter" "secret" {
  name  = "/prod/app/secret"
  type  = "SecureString"
  value = data.vault_transit_encrypt.app.ciphertext

  tags = {
    Source = "Vault-Transit"
  }
}
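
Consumers reverse the operation with the companion decrypt data source (a sketch; the SSM read-back is illustrative):

```hcl
# Read the ciphertext back and decrypt it via Transit
data "aws_ssm_parameter" "secret" {
  name = "/prod/app/secret"
}

data "vault_transit_decrypt" "app" {
  backend    = "transit"
  key        = "app-key"
  ciphertext = data.aws_ssm_parameter.secret.value
}

# data.vault_transit_decrypt.app.plaintext holds the base64-encoded value
```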

# Approle authentication (for CI/CD)
resource "vault_approle_auth_backend_role" "ci" {
  backend        = "approle"
  role_name      = "ci-terraform"
  token_policies = ["terraform-deployer"]
  token_ttl      = 300
  token_max_ttl  = 600
}

data "vault_approle_auth_backend_role_id" "ci" {
  backend   = vault_approle_auth_backend_role.ci.backend
  role_name = vault_approle_auth_backend_role.ci.role_name
}

resource "vault_approle_auth_backend_secret_id" "ci" {
  backend   = vault_approle_auth_backend_role.ci.backend
  role_name = vault_approle_auth_backend_role.ci.role_name
}

# Output CI credentials (sensitive!)
output "approle_role_id" {
  value     = data.vault_approle_auth_backend_role_id.ci.role_id
  sensitive = true
}

output "approle_secret_id" {
  value     = vault_approle_auth_backend_secret_id.ci.secret_id
  sensitive = true
}

Vault Policies for Terraform:

# vault-policies/terraform-deployer.hcl
path "aws/creds/terraform-deploy-role" {
  capabilities = ["read"]
}

path "database/creds/*" {
  capabilities = ["read"]
}

path "kv/data/apps/*" {
  capabilities = ["read"]
}

path "pki/issue/*" {
  capabilities = ["update"]
}

path "transit/encrypt/app-key" {
  capabilities = ["update"]
}

# For managing secrets
path "kv/data/apps/webapp/*" {
  capabilities = ["create", "read", "update", "delete"]
}

# For workspace-specific access. Vault policy templates can only
# reference identity parameters (not Terraform expressions), so bind
# the workspace name to entity metadata:
path "aws/creds/{{identity.entity.metadata.workspace}}-deploy" {
  capabilities = ["read"]
}

Set up AppRole for CI/CD:

# Enable AppRole auth
vault auth enable approle

# Create policy
vault policy write terraform-deployer vault-policies/terraform-deployer.hcl

# Create role
vault write auth/approle/role/ci-terraform \
  secret_id_ttl=600 \
  token_ttl=300 \
  token_max_ttl=600 \
  token_policies=terraform-deployer

# Get credentials
vault read auth/approle/role/ci-terraform/role-id
vault write -f auth/approle/role/ci-terraform/secret-id

# Use in GitHub Actions
# Add these as secrets
export TF_VAR_vault_role_id=${{ secrets.VAULT_ROLE_ID }}
export TF_VAR_vault_secret_id=${{ secrets.VAULT_SECRET_ID }}

# Configure provider
provider "vault" {
  auth_login {
    path = "auth/approle/login"
    parameters = {
      role_id   = var.vault_role_id
      secret_id = var.vault_secret_id
    }
  }
}


Scenario 14: Policy as Code with OPA

Open Policy Agent for advanced policy enforcement

sequenceDiagram
    participant Dev as Developer
    participant Git as VCS
    participant OPA as Open Policy Agent
    participant TF as Terraform Plan (JSON)
    participant Policy as Rego Policies
    participant Result as Policy Decision

    Dev->>Git: Push Terraform code
    Git->>OPA: Webhook triggers policy check
    OPA->>TF: Parse plan file to JSON
    TF->>Policy: Evaluate against policies

    Policy->>Result: Check: No public S3 buckets
    Policy->>Result: Check: Cost under $1000
    Policy->>Result: Check: Required tags present

    Result->>OPA: Pass/Fail decision
    OPA->>Git: Block PR if failed

    Note over Policy: Declarative policy language

Code:

# Policy evaluation setup
# policies/enforce.rego
package terraform.policy

import future.keywords.in

# Deny public S3 buckets
deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_s3_bucket_public_access_block"
    resource.change.after.block_public_acls == false
    msg := sprintf("S3 bucket %s must block public ACLs", [resource.address])
}

deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_s3_bucket_public_access_block"
    resource.change.after.block_public_policy == false
    msg := sprintf("S3 bucket %s must block public policies", [resource.address])
}

# Require encryption on all storage
deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_s3_bucket"
    not resource.change.after.server_side_encryption_configuration
    msg := sprintf("S3 bucket %s must have encryption enabled", [resource.address])
}

# Enforce tagging
deny[msg] {
    resource := input.resource_changes[_]
    not startswith(resource.address, "data.")
    tags := resource.change.after.tags
    not tags.Environment
    msg := sprintf("Resource %s must have Environment tag", [resource.address])
}

# Cost control
deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_instance"
    instance_type := resource.change.after.instance_type

    # No instances larger than t3.xlarge in dev/staging
    input.workspace.name != "prod"
    not startswith(instance_type, "t3.")

    msg := sprintf("Instance %s type %s too large for non-prod", [
        resource.address, instance_type
    ])
}

# Allowed regions only
deny[msg] {
    resource := input.resource_changes[_]
    resource.provider_name == "registry.terraform.io/hashicorp/aws"

    region := input.configuration.provider_config.aws.expressions.region.constant_value

    not region in allowed_regions

    msg := sprintf("Region %s not allowed. Use one of: %s", [
        region, concat(", ", allowed_regions)
    ])
}

# No hardcoded credentials
deny[msg] {
    resource := input.resource_changes[_]
    credential_fields := ["password", "access_key", "secret_key", "token"]

    some field in credential_fields
    resource.change.after[field]

    msg := sprintf("Resource %s has hardcoded credential field: %s", [
        resource.address, field
    ])
}

# Network security: No 0.0.0.0/0 in SG
deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_security_group_rule"

    resource.change.after.cidr_blocks[_] == "0.0.0.0/0"
    resource.change.after.type == "ingress"

    msg := sprintf("Security group %s cannot allow 0.0.0.0/0", [resource.address])
}

# Allowed regions (referenced by the region rule above)
allowed_regions = ["us-east-1", "us-west-2", "eu-west-1"]

# Test file (policies/test.rego)
package terraform.policy

test_deny_public_s3_bucket {
    input := {
        "resource_changes": [{
            "address": "aws_s3_bucket_public_access_block.example",
            "type": "aws_s3_bucket_public_access_block",
            "change": {
                "after": {"block_public_acls": false}
            }
        }]
    }

    count(deny) == 1
}

test_allow_encrypted_s3_bucket {
    input := {
        "resource_changes": [{
            "address": "aws_s3_bucket.example",
            "type": "aws_s3_bucket",
            "change": {
                "after": {
                    "server_side_encryption_configuration": {
                        "rule": {
                            "apply_server_side_encryption_by_default": {
                                "sse_algorithm": "AES256"
                            }
                        }
                    }
                }
            }
        }]
    }

    count(deny) == 0
}

CI/CD Integration:

#!/bin/bash
# .github/scripts/opa-evaluate.sh

# Generate Terraform plan JSON
terraform show -json tfplan.binary > tfplan.json

# Download OPA
curl -L -o opa https://github.com/open-policy-agent/opa/releases/latest/download/opa_linux_amd64_static
chmod +x opa

# Run the policy unit tests first
opa test policies/

# Evaluate against the Terraform plan; --fail-defined makes opa exit
# non-zero when the deny set is non-empty
opa eval --format pretty \
  --fail-defined \
  --data policies/enforce.rego \
  --input tfplan.json \
  "data.terraform.policy.deny"

if [ $? -ne 0 ]; then
  echo "OPA policy violations detected!"
  exit 1
fi

# For PR comments
opa eval --format json \
  --data policies/enforce.rego \
  --input tfplan.json \
  "data.terraform.policy.deny" > opa-results.json

# Parse and comment on PR
python3 .github/scripts/comment-pr.py opa-results.json

Conftest integration:

# Alternative to OPA CLI
conftest test tfplan.json --policy policies/

# With specific namespace
conftest test tfplan.json --policy policies/ --namespace terraform.policy

# Output in TAP format
conftest test tfplan.json --output tap


Scenario 15: Dynamic Provider Configuration

Multi-account, multi-region deployments

sequenceDiagram
    participant Config as Configuration
    participant TF as Terraform
    participant Alias as Provider Aliases
    participant AWS1 as AWS Account 1
    participant AWS2 as AWS Account 2

    Config->>TF: Define provider configs
    TF->>Alias: Create 3 AWS provider aliases
    Alias->>AWS1: Provider "aws.dev"
    Alias->>AWS1: Provider "aws.staging"
    Alias->>AWS2: Provider "aws.prod"

    TF->>TF: One module call per environment
    TF->>Alias: Pass a different provider to each

    TF->>AWS1: Deploy to dev & staging
    TF->>AWS2: Deploy to prod

    Note over Alias: Dynamic provider selection

Code:

# Configure multiple AWS providers
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

# Default provider (for resources without alias)
provider "aws" {
  region = "us-east-1"

  assume_role {
    role_arn = "arn:aws:iam::${var.accounts.dev}:role/TerraformRole"
  }
}

# Provider for staging
provider "aws" {
  alias  = "staging"
  region = "us-west-2"

  assume_role {
    role_arn = "arn:aws:iam::${var.accounts.staging}:role/TerraformRole"
  }
}

# Provider for production (different account)
provider "aws" {
  alias  = "prod"
  region = "us-east-1"

  assume_role {
    role_arn     = "arn:aws:iam::${var.accounts.prod}:role/TerraformRole"
    session_name = "terraform-prod"
    external_id  = var.prod_external_id
  }

  # Different credentials profile
  profile = "prod-admin"
}

# Variable for account IDs (security is referenced by the shared KMS
# provider below)
variable "accounts" {
  type = object({
    dev      = string
    staging  = string
    prod     = string
    security = string
  })

  default = {
    dev      = "123456789012"
    staging  = "123456789013"
    prod     = "123456789014"
    security = "123456789015"
  }
}

# Deploy to all environments. Note: the provider meta-argument must be
# a static reference -- Terraform cannot select a provider with an
# expression, a locals map, or for_each. The idiomatic pattern is to
# wrap the resources in a module and pass a different provider
# configuration to each module call.

# modules/web/main.tf
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.amazon_linux.id
  instance_type = var.instance_type

  tags = {
    Environment = var.environment
    Name        = "web-server-${var.environment}"
  }
}

# Root module: one call per environment, each with its own provider
module "web_dev" {
  source        = "./modules/web"
  environment   = "dev"
  instance_type = var.instance_types["dev"]
  providers     = { aws = aws }
}

module "web_staging" {
  source        = "./modules/web"
  environment   = "staging"
  instance_type = var.instance_types["staging"]
  providers     = { aws = aws.staging }
}

module "web_prod" {
  source        = "./modules/web"
  environment   = "prod"
  instance_type = var.instance_types["prod"]
  providers     = { aws = aws.prod }
}

# Conditional provider usage
resource "aws_s3_bucket" "logs" {
  # Only in prod account
  provider = aws.prod
  count    = var.environment == "prod" ? 1 : 0

  bucket = "prod-logs-${data.aws_caller_identity.prod.account_id}"
}

# Cross-account resource reference
data "aws_caller_identity" "prod" {
  provider = aws.prod
}

# Shared KMS key (in security account)
provider "aws" {
  alias  = "security"
  region = "us-east-1"

  assume_role {
    role_arn = "arn:aws:iam::${var.accounts.security}:role/SecurityAdmin"
  }
}

resource "aws_kms_key" "shared" {
  provider                = aws.security
  description             = "Shared encryption key"
  deletion_window_in_days = 7
  enable_key_rotation     = true

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid    = "EnableIAMUserPermissions"
        Effect = "Allow"
        Principal = {
          AWS = "arn:aws:iam::${var.accounts.security}:root"
        }
        Action   = "kms:*"
        Resource = "*"
      },
      {
        Sid    = "AllowProdAccountUse"
        Effect = "Allow"
        Principal = {
          AWS = "arn:aws:iam::${var.accounts.prod}:root"
        }
        Action = [
          "kms:Encrypt",
          "kms:Decrypt",
          "kms:ReEncrypt*",
          "kms:GenerateDataKey*",
          "kms:DescribeKey"
        ]
        Resource = "*"
      }
    ]
  })
}

# Use shared KMS key from production
resource "aws_s3_bucket_server_side_encryption_configuration" "prod_data" {
  provider = aws.prod
  bucket   = aws_s3_bucket.prod_data.id

  rule {
    apply_server_side_encryption_by_default {
      # Cross-account keys must be referenced by full ARN, not key ID
      kms_master_key_id = aws_kms_key.shared.arn
      sse_algorithm     = "aws:kms"
    }
  }
}

# Multi-region deployment: configure one provider alias per region
provider "aws" {
  alias  = "eu"
  region = "eu-west-1"
}

provider "aws" {
  alias  = "apac"
  region = "ap-south-1"
}

data "aws_caller_identity" "current" {}

# The provider argument cannot be computed, so there is no for_each
# over providers -- deploy one resource block (or module call) per region
resource "aws_s3_bucket" "global_assets_us" {
  bucket = "global-assets-us-east-1-${data.aws_caller_identity.current.account_id}"

  tags = {
    Region = "us-east-1"
  }
}

resource "aws_s3_bucket" "global_assets_eu" {
  provider = aws.eu
  bucket   = "global-assets-eu-west-1-${data.aws_caller_identity.current.account_id}"

  tags = {
    Region = "eu-west-1"
  }
}

resource "aws_s3_bucket" "global_assets_apac" {
  provider = aws.apac
  bucket   = "global-assets-ap-south-1-${data.aws_caller_identity.current.account_id}"

  tags = {
    Region = "ap-south-1"
  }
}

# Provider meta-argument on an individual resource
resource "aws_instance" "staging_web" {
  provider      = aws.staging
  ami           = var.staging_ami_id
  instance_type = "t3.micro"

  lifecycle {
    # Moving a resource to a different provider forces replacement
    create_before_destroy = true
  }
}




Scenario 16: Performance Optimization at Scale

Speeding up plans and applies for large infrastructures

sequenceDiagram
    participant Dev as Developer
    participant TF as Terraform Core
    participant Graph as Dependency Graph
    participant Cache as Build Cache
    participant State as State File
    participant AWS as AWS API

    Dev->>TF: terraform apply -parallelism=20
    TF->>Graph: Build resource dependency graph
    Graph->>TF: Identify parallelizable resources

    TF->>Cache: Check for cached providers/modules
    Cache->>TF: Return cached data

    TF->>State: Read current state
    State->>TF: Return state (partial refresh)

    TF->>AWS: Parallel API calls (20 concurrent)
    AWS->>TF: Return resource statuses

    TF->>AWS: Create/Update resources in parallel batches
    AWS->>TF: Confirm operations

    TF->>Dev: Show progress with reduced time

    Note over TF: Optimized execution with -target & -refresh=false

Code:

# Parallel resource creation
resource "aws_instance" "worker" {
  for_each = { for i in range(var.worker_count) : i => i }

  # Parallel provisioning
  provisioner "remote-exec" {
    inline = [
      "sudo yum update -y",
      "sudo systemctl start worker"
    ]

    connection {
      type        = "ssh"
      user        = "ec2-user"
      private_key = file(var.private_key_path)
      host        = self.public_ip
      timeout     = "2m"
    }
  }
}

# Graph dependencies optimization
# Explicit depends_on for complex dependencies
resource "aws_db_instance" "app" {
  # A resource accepts only one depends_on argument; list every
  # explicit dependency in it
  depends_on = [
    aws_db_subnet_group.app, # wait for subnet group
    aws_security_group.rds,  # create after security group
    module.vpc,              # don't start until network is ready
  ]
}

# Use -parallelism flag (default 10)
# terraform apply -parallelism=20

# Refresh only specific resources
terraform refresh -target=aws_instance.web

# State manipulation for speed
# Move resources instead of recreating
terraform state mv aws_instance.old[0] aws_instance.new[0]

# Remove unused resources from state (faster than destroy)
terraform state rm aws_instance.unused
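
Since Terraform 1.1, renames can also be recorded declaratively with a `moved` block instead of imperative `terraform state mv`, so every collaborator gets the same state change on their next plan:

```hcl
# Declarative equivalent of:
#   terraform state mv aws_instance.old[0] aws_instance.new[0]
moved {
  from = aws_instance.old[0]
  to   = aws_instance.new[0]
}
```

Terraform 1.7 adds a `removed` block as the declarative counterpart of `terraform state rm`.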

# Data source caching
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  # Fail fast if the lookup returns nothing
  lifecycle {
    postcondition {
      condition     = self.id != ""
      error_message = "Failed to fetch AMI"
    }
  }
}

# Limit provider calls with lifecycle
resource "aws_s3_bucket" "logs" {
  # Prevent recreation on tag changes
  lifecycle {
    ignore_changes = [
      tags["LastUpdated"],
      server_side_encryption_configuration[0].rule[0].apply_server_side_encryption_by_default[0].kms_master_key_id
    ]
  }
}

# Use -refresh=false for faster plans
terraform plan -refresh=false

# Plan only targeted resources
terraform plan -target=aws_instance.web

# Split monolithic state
# backend.tf (backend blocks cannot use variables or expressions such as
# terraform.workspace; use workspace_key_prefix or -backend-config instead)
terraform {
  backend "s3" {
    bucket               = "app-state"
    key                  = "infrastructure.tfstate"
    region               = "us-east-1"
    workspace_key_prefix = "env"
  }
}

# Use data sources to reference other states
data "terraform_remote_state" "network" {
  backend = "s3"
  config = {
    bucket = "network-state-${terraform.workspace}"
    key    = "network.tfstate"
  }
}
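Values cross the state boundary only through outputs of the other configuration. For example (the output name private_subnet_id is an assumption):

```hcl
resource "aws_instance" "app" {
  ami           = data.aws_ami.amazon_linux.id # AMI lookup defined elsewhere
  instance_type = "t3.micro"
  subnet_id     = data.terraform_remote_state.network.outputs.private_subnet_id
}
```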

# Disable unnecessary providers
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    # configuration_aliases is for child modules: it declares provider
    # configurations the calling module must pass in
    vault = {
      source                = "hashicorp/vault"
      version               = "~> 3.0"
      configuration_aliases = [vault.prod]
    }
  }
}

provider "vault" {
  alias = "prod"
  # Only configured for prod
}

# Conditional provider usage
resource "vault_generic_secret" "prod_secret" {
  count    = terraform.workspace == "prod" ? 1 : 0
  provider = vault.prod

  path = "secret/prod/app"
}


Scenario 17: Cross-Region Replication

Global infrastructure patterns

sequenceDiagram
    participant TF as Terraform
    participant Primary as Primary Region (us-east-1)
    participant Dr as DR Region (us-west-2)
    participant Replicate as Replication Service
    participant App as Application
    participant DNS as Global DNS

    TF->>Primary: Create DynamoDB table
    Primary->>Replicate: Enable global tables

    TF->>Dr: Create replica table
    Replicate->>Dr: Sync data continuously

    TF->>Primary: Launch RDS cluster
    Primary->>Dr: Create cross-region replica

    TF->>DNS: Configure Route53 failover
    App->>DNS: Query app.example.com

    alt Primary Healthy
        DNS->>Primary: Route traffic to ALB
    else Primary Down
        DNS->>Dr: Failover to DR region
    end

    Note over Replicate: Async replication, typically under 1s lag

Code:

# Global table with DynamoDB
resource "aws_dynamodb_table" "global" {
  name           = "global-data"
  billing_mode   = "PAY_PER_REQUEST"
  hash_key       = "id"

  attribute {
    name = "id"
    type = "S"
  }

  # Enable global tables (replica blocks require streams enabled)
  stream_enabled   = true
  stream_view_type = "NEW_AND_OLD_IMAGES"

  replica {
    region_name = "us-west-2"
  }

  replica {
    region_name = "eu-west-1"
  }
}

# Cross-region VPC peering
resource "aws_vpc_peering_connection" "east_to_west" {
  vpc_id      = aws_vpc.east.id
  peer_vpc_id = aws_vpc.west.id
  peer_region = "us-west-2"

  auto_accept = false

  tags = {
    Name = "east-west-peering"
  }
}

# Accept peering in west region
resource "aws_vpc_peering_connection_accepter" "west_accepter" {
  provider = aws.west

  vpc_peering_connection_id = aws_vpc_peering_connection.east_to_west.id
  auto_accept               = true

  tags = {
    Name = "west-accepter"
  }
}
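The aws.west provider referenced above must be declared as an aliased configuration alongside the default one (regions assumed from the scenario):

```hcl
provider "aws" {
  region = "us-east-1" # default: owns the requester side of the peering
}

provider "aws" {
  alias  = "west"
  region = "us-west-2" # accepter side
}
```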

# Route tables for peering (routes belong to route tables, not subnets;
# aws_route_table.east_private is assumed to be defined alongside the subnets)
resource "aws_route" "east_to_west" {
  count                     = length(aws_route_table.east_private)
  route_table_id            = aws_route_table.east_private[count.index].id
  destination_cidr_block    = aws_vpc.west.cidr_block
  vpc_peering_connection_id = aws_vpc_peering_connection.east_to_west.id
}

# Aurora Global Database
resource "aws_rds_global_cluster" "global" {
  global_cluster_identifier = "prod-global-db"
  engine                    = "aurora-postgresql"
  engine_version            = "15.3"
  database_name             = "globaldb"
}

resource "aws_rds_cluster" "primary" {
  provider                  = aws.primary
  engine                    = "aurora-postgresql"
  engine_version            = "15.3"
  cluster_identifier        = "prod-primary"
  master_username           = var.db_username
  master_password           = var.db_password
  global_cluster_identifier = aws_rds_global_cluster.global.id

  db_subnet_group_name = aws_db_subnet_group.primary.name
}

resource "aws_rds_cluster_instance" "primary" {
  provider           = aws.primary
  cluster_identifier = aws_rds_cluster.primary.id
  instance_class     = "db.r5.large"
}

# Secondary region
resource "aws_rds_cluster" "secondary" {
  provider                  = aws.secondary
  engine                    = "aurora-postgresql"
  engine_version            = "15.3"
  cluster_identifier        = "prod-secondary"
  global_cluster_identifier = aws_rds_global_cluster.global.id

  db_subnet_group_name = aws_db_subnet_group.secondary.name

  # Copy from primary
  source_region = "us-east-1"
}

# CloudFront global distribution
resource "aws_cloudfront_distribution" "global" {
  origin {
    domain_name = aws_lb.primary.dns_name
    origin_id   = "primary"

    custom_origin_config {
      http_port              = 80
      https_port             = 443
      origin_protocol_policy = "https-only"
      origin_ssl_protocols   = ["TLSv1.2"]
    }
  }

  origin {
    domain_name = aws_lb.dr.dns_name
    origin_id   = "dr"

    custom_origin_config {
      http_port              = 80
      https_port             = 443
      origin_protocol_policy = "https-only"
      origin_ssl_protocols   = ["TLSv1.2"]
    }
  }

  enabled             = true
  is_ipv6_enabled     = true
  default_root_object = "index.html"

  # Primary origin with DR backup
  origin_group {
    origin_id = "group"

    failover_criteria {
      status_codes = [403, 404, 500, 502, 503, 504]
    }

    member {
      origin_id = "primary"
    }

    member {
      origin_id = "dr"
    }
  }

  default_cache_behavior {
    target_origin_id = "group"

    viewer_protocol_policy = "redirect-to-https"
    allowed_methods        = ["GET", "HEAD", "OPTIONS", "PUT", "POST", "PATCH", "DELETE"]
    cached_methods         = ["GET", "HEAD"]

    forwarded_values {
      query_string = true
      cookies {
        forward = "all"
      }
    }
  }

  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  viewer_certificate {
    # CloudFront requires the ACM certificate to live in us-east-1
    acm_certificate_arn = aws_acm_certificate.global.arn
    ssl_support_method  = "sni-only"
  }
}


Scenario 18: Edge Cases & Troubleshooting

Common issues and solutions

sequenceDiagram
    participant Dev as Developer
    participant TF as Terraform CLI
    participant State as State Lock
    participant API as AWS API
    participant Log as Debug Log
    participant Fix as Resolution

    Dev->>TF: terraform apply
    TF->>State: Request state lock

    alt State Locked
        State->>TF: Error: State locked by another process
        TF->>Dev: Show lock ID
        Dev->>State: Investigate lock holder
        Dev->>TF: terraform force-unlock <ID>
    else API Timeout
        API->>TF: Request timeout (30s)
        TF->>Log: Log timeout error
        Log->>Dev: Show error details
        Dev->>TF: Increase timeout/retry
        TF->>API: Retry with backoff
    else Resource Exists
        API->>TF: Error: Resource already exists
        TF->>Dev: Suggest import command
        Dev->>TF: terraform import <resource> <ID>
        TF->>State: Import successful
    end

    TF->>Dev: Apply successful

    Note over Log: Enable TF_LOG=DEBUG for details

Code:

# Handle "resource already exists" errors
resource "aws_s3_bucket" "example" {
  bucket = var.bucket_name
}

# Adopt the existing bucket into state instead of recreating it:
# terraform import aws_s3_bucket.example my-existing-bucket
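On Terraform 1.5+ the import can also be expressed declaratively, so it is previewed by terraform plan instead of mutating state directly (same placeholder bucket name):

```hcl
import {
  to = aws_s3_bucket.example
  id = "my-existing-bucket"
}
```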

# Handle circular dependencies
# A cycle only appears when two resources each depend on the other
resource "aws_instance" "a" {
  # depends_on = [aws_instance.b]  # redundant here, and a cycle if b also depends on a

  # Reference b's attributes directly instead; Terraform infers the
  # one-way dependency from the expression
  user_data = <<-EOF
    #!/bin/bash
    echo "Instance B IP: ${aws_instance.b.private_ip}"
  EOF
}

# Handle provider version conflicts
# .terraform.lock.hcl
# Commit this file to lock provider versions

# Handle large state files
terraform state pull > state.json
# Edit state.json
terraform state push state.json

# Handle timeout issues
resource "aws_db_instance" "large" {
  # Increase timeout for large DB
  timeouts {
    create = "2h"
    delete = "2h"
    update = "2h"
  }
}

# Handle count/for_each errors
# Use locals to preprocess data
locals {
  healthy_instances = {
    for k, v in var.instances : k => v if v.status == "healthy"
  }
}

resource "aws_instance" "web" {
  for_each = local.healthy_instances

  # This avoids errors from invalid instances
}

# Handle provider authentication issues
provider "aws" {
  region = var.aws_region

  # Option 1: explicit static credentials (avoid committing these)
  # access_key = var.aws_access_key
  # secret_key = var.aws_secret_key

  # Option 2: named profile (works with SSO)
  profile                  = "prod-admin"
  shared_config_files      = ["~/.aws/config"]
  shared_credentials_files = ["~/.aws/credentials"]
}

# Handle resource drift
resource "aws_security_group_rule" "example" {
  # Add lifecycle ignore for externally managed rules
  lifecycle {
    ignore_changes = all
  }
}

# Or detect drift
terraform plan -detailed-exitcode
# Exit code 0 = no changes
# Exit code 1 = error
# Exit code 2 = changes detected
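In CI, the three exit codes map naturally onto a small gate. plan_gate below is a hypothetical helper, demonstrated with stub commands in place of a real terraform run:

```shell
# Hypothetical wrapper around `terraform plan -detailed-exitcode`.
plan_gate() {
  "$@" && rc=0 || rc=$?   # capture the exit code even under `set -e`
  case "$rc" in
    0) echo "no changes" ;;
    2) echo "changes detected" ;;
    *) echo "plan failed (exit $rc)"; return 1 ;;
  esac
}

# Real usage: plan_gate terraform plan -detailed-exitcode
# Stubs demonstrating the exit paths:
plan_gate true              # behaves like exit code 0
plan_gate sh -c 'exit 2'    # behaves like exit code 2
```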

# Handle large resources with pagination
data "aws_instances" "all" {
  # This may timeout for large accounts
  # Instead use filters
  filter {
    name   = "instance-state-name"
    values = ["running"]
  }

  filter {
    name   = "tag:Environment"
    values = ["production"]
  }
}

# Handle eventual consistency
resource "aws_iam_role_policy" "example" {
  role   = aws_iam_role.example.name
  policy = data.aws_iam_policy_document.example.json # policy document defined elsewhere

  # Wait for role to propagate
  depends_on = [time_sleep.iam_propagation]
}

resource "time_sleep" "iam_propagation" {
  depends_on = [aws_iam_role.example]

  create_duration = "30s"
}
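time_sleep comes from the hashicorp/time provider, which must be declared or terraform init will fail:

```hcl
terraform {
  required_providers {
    time = {
      source  = "hashicorp/time"
      version = "~> 0.9"
    }
  }
}
```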

# Handle API rate limiting
provider "aws" {
  # Add delays between requests
  max_retries = 10
  retry_mode  = "adaptive"

  # Custom endpoints for debugging
  endpoints {
    ec2 = "http://localhost:4566"  # LocalStack
  }
}

# Handle state locks
# Force unlock (use with caution!)
terraform force-unlock LOCK_ID

# Prevent lock issues
terraform {
  backend "s3" {
    bucket         = "app-state"
    key            = "infrastructure.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks" # DynamoDB table used for state locking
  }
}
# Wait for a busy lock instead of failing immediately:
# terraform apply -lock-timeout=5m

# Handle sensitive values
variable "db_password" {
  type      = string
  sensitive = true

  validation {
    condition     = length(var.db_password) >= 16
    error_message = "Password must be at least 16 characters."
  }
}
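One safe way to supply such a variable is through the environment: Terraform maps any TF_VAR_&lt;name&gt; environment variable onto var.&lt;name&gt; automatically (the value below is a placeholder):

```shell
# Placeholder secret; in practice generate one or pull it from a secrets manager.
export TF_VAR_db_password='example-generated-secret-123'
# terraform plan   # would read var.db_password from the environment
# Sanity-check the length against the validation rule above:
[ "${#TF_VAR_db_password}" -ge 16 ] && echo "length ok"
```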

# Output with sensitive flag
output "db_endpoint" {
  value       = aws_db_instance.app.endpoint
  description = "Database endpoint"
  sensitive   = true
}

# Read a sensitive output (redacted in plain `terraform output`,
# but present in the JSON form)
terraform output -json | jq -r '.db_endpoint.value'

# Handle destroy errors
# Prevent destroy for critical resources
resource "aws_dynamodb_table" "critical" {
  lifecycle {
    prevent_destroy = true
  }
}

# Or use targeted destroy
terraform destroy -target=aws_instance.web

# Handle module version conflicts
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.0.0"  # Pin exact version

  # Use version constraints
  # '~> 5.0' means >= 5.0.0, < 6.0.0
}

# Handle complex expressions
# Use local values to simplify
locals {
  # Complex logic here
  instance_type = (
    var.environment == "prod" ? "m5.large" :
    var.environment == "staging" ? "m5.medium" :
    "t3.micro"
  )
}
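The chained conditional can also be written as a map lookup with a default, which stays readable as environments are added (instance sizes are the same as above):

```hcl
locals {
  instance_types = {
    prod    = "m5.large"
    staging = "m5.medium"
  }

  # Fall back to t3.micro for any unlisted environment
  instance_type = lookup(local.instance_types, var.environment, "t3.micro")
}
```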

resource "aws_instance" "web" {
  instance_type = local.instance_type
}

# Debugging tips
# Enable detailed logging
export TF_LOG=DEBUG
export TF_LOG_PATH=terraform.log

# Trace provider calls
TF_LOG=TRACE terraform plan

# Graph dependencies
terraform graph | dot -Tpng > graph.png

# Validate JSON syntax
terraform show -json | jq .

# Check provider schemas
terraform providers schema -json

# Timing analysis (Terraform has no built-in profiler flag)
time terraform plan
TF_LOG=TRACE TF_LOG_PATH=trace.log terraform plan  # timestamped trace log

# Resource tracing
TF_LOG_PROVIDER=TRACE terraform apply

# Provider lock maintenance
# Record checksums for every platform your team runs on
terraform providers lock -platform=linux_amd64 -platform=darwin_amd64

# Inspect state (jq -S pretty-prints with sorted keys; it does not shrink the file)
terraform state pull | jq -S . > state-pretty.json
# If you edit state and push it back, increment its "serial" field first:
# terraform state push state-pretty.json


Scenario 19: Final Production Checklist

Pre-deployment validation script

sequenceDiagram
    participant Dev as Developer
    participant Script as Validation Script
    participant TF as Terraform CLI
    participant Linters as Linters/Scanners
    participant API as Cloud APIs
    participant Result as Final Report

    Dev->>Script: Run pre-flight-check.sh
    Script->>TF: terraform version

    TF->>TF: terraform fmt -check
    TF->>TF: terraform validate

    Script->>Linters: Run tfsec, checkov
    Linters->>Script: Security scan results

    Script->>API: Check state lock table
    API->>Script: Lock status

    Script->>Script: Verify variables & tfvars

    Result->>Dev: Print pass/fail report

    alt All Checks Pass
        Dev->>TF: terraform apply
    else Checks Fail
        Dev->>Script: Review errors & fix
    end

    Note over Script: Pre-deployment quality gate

Code:

#!/bin/bash
# pre-flight-check.sh

set -e

echo "Running Terraform pre-flight checks..."

# Check version
terraform version
if ! terraform version | grep -q "^Terraform v1\.6"; then
  echo "WARNING: Terraform version should be 1.6.x"
fi

# Format check
if ! terraform fmt -check -recursive; then
  echo "ERROR: Terraform files not formatted. Run 'terraform fmt -recursive'"
  exit 1
fi

# Validate
terraform validate

# Security scan
tfsec .

# Cost estimation
infracost breakdown --path .

# Check for hardcoded secrets
if grep -r "password\|secret\|key" --include="*.tf" . | grep -v "variable\|data\|local"; then
  echo "WARNING: Potential hardcoded secrets found"
fi

# Check state lock
aws dynamodb describe-table --table-name terraform-locks --region us-east-1

# Validate variables
if [ ! -f "terraform.tfvars" ]; then
  echo "WARNING: No terraform.tfvars file found"
fi

# Check provider locks
if [ ! -f ".terraform.lock.hcl" ]; then
  echo "WARNING: Provider lock file missing. Run 'terraform init -upgrade'"
fi

# List non-sensitive outputs (verify no secrets appear here)
terraform output -json | jq -r 'to_entries[] | select(.value.sensitive == false) | .key'

echo "All checks passed!"
echo "Ready for: terraform apply"

Quick Reference: Essential Commands

| Command | Description | Level |
| --- | --- | --- |
| terraform init | Initialize the working directory | Beginner |
| terraform plan | Show changes to be applied | Beginner |
| terraform apply | Apply the changes | Beginner |
| terraform destroy | Destroy the infrastructure | Beginner |
| terraform validate | Check if configuration is valid | Beginner |
| terraform fmt | Format configuration files | Beginner |
| terraform output | Show output values | Beginner |
| terraform state list | List resources in state | Intermediate |
| terraform state show <resource> | Show resource details | Intermediate |
| terraform state rm <resource> | Remove resource from state | Intermediate |
| terraform state mv <old> <new> | Move resource in state | Intermediate |
| terraform import <resource> <id> | Import existing resource | Intermediate |
| terraform taint <resource> | Mark resource for recreation (deprecated; prefer apply -replace=) | Intermediate |
| terraform untaint <resource> | Remove taint from resource | Intermediate |
| terraform providers | List providers | Advanced |
| terraform providers lock | Record provider checksums in the lock file | Advanced |
| terraform workspace list | List workspaces | Advanced |
| terraform workspace new <name> | Create new workspace | Advanced |
| terraform workspace select <name> | Select a workspace | Advanced |
| terraform workspace delete <name> | Delete a workspace | Advanced |

Pro Tips for All Levels

  1. Always use version control: Keep your Terraform configurations in a Git repository.
  2. Use modules: Organize your infrastructure into reusable modules.
  3. Define variables: Use variables for flexibility and reusability.
  4. Set resource limits: Define timeouts and retry policies for resources.
  5. Use data sources: Fetch existing resources instead of recreating them.
  6. Check dependencies: Use terraform graph to visualize dependencies.
  7. Validate configurations: Regularly run terraform validate and terraform fmt.
  8. Use remote state: Store state in a remote backend like S3 or Terraform Cloud.
  9. Lock state files: Prevent concurrent modifications with state locks.
  10. Monitor costs: Use cost estimation tools like Infracost.
  11. Automate testing: Integrate Terraform with CI/CD pipelines and testing frameworks.
  12. Use Sentinel policies: Enforce compliance and best practices with Terraform Cloud.
  13. Keep Terraform updated: Stay on supported versions and regularly update providers.
  14. Use workspaces: Manage multiple environments with workspaces.
  15. Use remote execution: Leverage Terraform Cloud or Enterprise for remote runs.
  16. Backup state files: Regularly back up your state files to prevent data loss.

Happy infrastructure management! 🧱