Lesson 5 of 14

Resources & Data Sources

Part of the Terraform tutorial series.

Terraform Resources

Resources are the most important building blocks in Terraform. Each resource block describes one or more infrastructure objects that Terraform will create, update, or delete. Every infrastructure component that Terraform manages is declared as a resource.

What is a Resource?

A resource represents a real-world infrastructure component:

  • Virtual Machines (AWS EC2, Azure VM, Google Compute)
  • Storage (AWS S3, Azure Blob, Google Cloud Storage)
  • Networking (VPCs, subnets, security groups, load balancers)
  • Databases (RDS, Cosmos DB, Cloud SQL)
  • Container Services (ECS, AKS, GKE)
  • DNS (Route53, Azure DNS, Cloud DNS)
  • IAM (Users, roles, policies)

Resource Syntax

resource "aws_instance" "web_server" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t2.micro"
  
  tags = {
    Name        = "WebServer"
    Environment = "dev"
  }
}

Components:

  • Type: aws_instance (provider + resource type)
  • Local Name: web_server (reference in code)
  • Arguments: Configuration properties (ami, instance_type, tags)

Reference in other resources:

# Define a security group, then reference its ID from the instance
resource "aws_security_group" "web" {
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
 
resource "aws_instance" "web" {
  # ...
  vpc_security_group_ids = [aws_security_group.web.id]
}

Common AWS Resources

Resource                  Purpose              Example
aws_instance              EC2 virtual machine  Web server, app server
aws_vpc                   Virtual network      Network isolation
aws_subnet                Network segment      Public/private subnets
aws_security_group        Firewall rules       Port/protocol access
aws_s3_bucket             Object storage       Static files, backups
aws_db_instance           Managed database     MySQL, PostgreSQL (RDS)
aws_iam_role              Access role          Service permissions
aws_lambda_function       Serverless function  Event handlers
aws_api_gateway_rest_api  API endpoint         REST API gateway
aws_internet_gateway      Internet access      Route to internet

Arguments, Attributes, and Meta-Arguments

Arguments (inputs):

resource "aws_instance" "web" {
  ami                         = "ami-123456" # Argument
  instance_type               = "t2.micro"   # Argument
  associate_public_ip_address = true         # Argument
}

Attributes (values the provider exports and Terraform records in state):

resource "aws_instance" "web" {
  ami           = "ami-123456"
  instance_type = "t2.micro"
}
 
# Access attributes
output "instance_ip" {
  value = aws_instance.web.public_ip  # Attribute
}

Meta-Arguments (control resource behavior):

resource "aws_instance" "servers" {
  count             = 3           # Create 3 instances
  ami               = "ami-123456"
  instance_type     = "t2.micro"
 
  depends_on        = [aws_security_group.web]  # Explicit dependency
  provider          = aws.us_west_2             # Specific provider
}
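
The provider = aws.us_west_2 line above only resolves if an aliased provider configuration with that name exists. The following is a minimal sketch under that assumption; it also shows for_each, the map/set-based alternative to count. The role names and the aws_instance.by_role label are hypothetical, and the AMI ID is the same placeholder used in the other examples.

```hcl
# Aliased provider configuration that `provider = aws.us_west_2` refers to.
provider "aws" {
  alias  = "us_west_2"
  region = "us-west-2"
}

# for_each creates one instance per set element. Each instance is addressed
# by key (aws_instance.by_role["api"]) rather than by numeric index.
resource "aws_instance" "by_role" {
  for_each = toset(["api", "worker"]) # hypothetical role names

  provider      = aws.us_west_2
  ami           = "ami-123456" # placeholder AMI ID
  instance_type = "t2.micro"

  tags = {
    Name = "server-${each.key}"
  }
}
```

count and for_each cannot be combined on the same resource. Keyed instances tend to be easier to manage when items may be added or removed from the middle of a collection, because removing one key does not shift the addresses of the others.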

Resource Dependencies

Implicit Dependencies:

resource "aws_instance" "web" {
  # Automatically depends on security group
  vpc_security_group_ids = [aws_security_group.web.id]
}

Explicit Dependencies:

resource "aws_instance" "web" {
  ami           = "ami-123456"
  instance_type = "t2.micro"
  
  # Must create security group first
  depends_on = [aws_security_group.web]
}
 
resource "aws_security_group" "web" {
  vpc_id = "vpc-123456"
}
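
depends_on matters most when a resource needs another one to exist but never reads any of its attributes. A common illustration, sketched here with hypothetical names (app-role, app-s3-read, app-profile) and a placeholder AMI, is an instance whose software needs an IAM policy to be attached before it boots:

```hcl
resource "aws_iam_role" "app" {
  name = "app-role" # hypothetical name

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy" "app_s3" {
  name = "app-s3-read" # hypothetical name
  role = aws_iam_role.app.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject"]
      Resource = "*"
    }]
  })
}

resource "aws_iam_instance_profile" "app" {
  name = "app-profile" # hypothetical name
  role = aws_iam_role.app.name
}

resource "aws_instance" "app" {
  ami                  = "ami-123456" # placeholder AMI ID
  instance_type        = "t2.micro"
  iam_instance_profile = aws_iam_instance_profile.app.name

  # The instance references the profile but never the policy, so Terraform
  # cannot infer that the policy must be in place first.
  depends_on = [aws_iam_role_policy.app_s3]
}
```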

Resource Lifecycle

resource "aws_instance" "web" {
  ami           = "ami-123456"
  instance_type = "t2.micro"
  
  lifecycle {
    create_before_destroy = true    # New before destroying old
    ignore_changes = [tags]         # Don't track tag changes
    replace_triggered_by = [aws_security_group.web] # Replace if SG changes
  }
}

Options:

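  • create_before_destroy: Create the replacement object before destroying the old one
  • prevent_destroy: Fail any plan that would destroy the resource
  • ignore_changes: Ignore differences in the listed attributes when planning updates
  • replace_triggered_by: Replace the resource when a referenced resource or attribute changes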
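
prevent_destroy is the one option the example above does not show. A minimal sketch, assuming a hypothetical bucket name and a hypothetical LastPatched tag that some external process maintains:

```hcl
resource "aws_s3_bucket" "backups" {
  bucket = "example-backups-bucket" # hypothetical; bucket names must be globally unique

  tags = {
    Environment = "prod"
  }

  lifecycle {
    # Any plan that would destroy this bucket fails with an error.
    prevent_destroy = true

    # Ignore drift on a single tag key instead of the whole tags map.
    ignore_changes = [tags["LastPatched"]]
  }
}
```

Note that prevent_destroy only guards the resource while the block remains in the configuration. Deleting the whole resource block removes the guard along with it.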

Data Sources

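Data sources fetch information about things that already exist: infrastructure created outside Terraform, infrastructure managed by another Terraform configuration, or values looked up from external systems. They are read-only and never create, change, or delete anything.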

Data Source Syntax

data "aws_ami" "ubuntu" {
  most_recent = true
  
  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
  }
  
  owners = ["099720109477"]  # Canonical
}
 
# Use the data
resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t2.micro"
}

Common Data Sources

Data Source             Purpose              Example
aws_ami                 Find AMI image       Latest Ubuntu, Amazon Linux
aws_availability_zones  List AZs in region   Multi-AZ setup
aws_vpc                 Get existing VPC     Reference existing VPC
aws_subnets             List subnets         Network setup
aws_security_group      Get security group   Existing firewall rules
aws_rds_cluster         Fetch RDS info       Database endpoints
aws_eks_cluster         Get cluster details  Kubernetes API endpoint
aws_caller_identity     Current AWS account  Account ID, ARN
http                    Fetch HTTP content   External APIs
local_file              Read local files     SSH keys, configs

The http and local_file data sources come from the hashicorp/http and hashicorp/local providers, not from the AWS provider.
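
As a small illustration of a data source that needs no arguments at all, the sketch below reads the identity behind the current AWS credentials. The output names are hypothetical.

```hcl
# No arguments needed: the result depends only on the configured credentials.
data "aws_caller_identity" "current" {}

output "account_id" {
  value = data.aws_caller_identity.current.account_id
}

output "caller_arn" {
  value = data.aws_caller_identity.current.arn
}
```

This is often useful for building ARNs or account-scoped names without hardcoding an account ID.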

Finding Available Data

Documentation:

  • Registry: registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/

List with CLI:

terraform providers schema -json | jq '.provider_schemas."registry.terraform.io/hashicorp/aws".data_source_schemas | keys'

Building Fetch Filters

# Get latest Ubuntu AMI
data "aws_ami" "ubuntu" {
  most_recent = true
  
  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-focal-*"]
  }
  
  filter {
    name   = "root-device-type"
    values = ["ebs"]
  }
  
  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
  
  owners = ["099720109477"]
}

Using Data Sources with Resources

Pattern: Fetch data, then use in resource

# Step 1: Find VPC by tag
data "aws_vpc" "main" {
  filter {
    name   = "tag:Name"
    values = ["production"]
  }
}
 
# Step 2: Use VPC in security group
resource "aws_security_group" "web" {
  vpc_id = data.aws_vpc.main.id
  name   = "web-sg"
  
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
 
# Step 3: Use security group in instance
resource "aws_instance" "web" {
  ami                    = data.aws_ami.ubuntu.id
  instance_type          = "t2.micro"
  vpc_security_group_ids = [aws_security_group.web.id]
}

Local Data Source

Read local files:

data "local_file" "ssh_key" {
  filename = "${path.module}/ssh/id_rsa.pub"
}
 
resource "aws_key_pair" "default" {
  key_name   = "my-key"
  public_key = data.local_file.ssh_key.content
}
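
When the file already exists on disk before Terraform runs, the built-in file() function is a commonly used alternative that avoids the extra provider. A minimal sketch, reusing the same path and assuming a hypothetical key name:

```hcl
resource "aws_key_pair" "from_file" {
  key_name = "my-key-from-file" # hypothetical name

  # file() reads the file while the configuration is evaluated.
  # trimspace() drops the trailing newline most key files end with.
  public_key = trimspace(file("${path.module}/ssh/id_rsa.pub"))
}
```

The local_file data source remains the better fit when the file is produced during the same run, because a data source can participate in the dependency graph while a function call cannot.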

HTTP Data Source

Fetch external data:

data "http" "cloud_ips" {
  url = "https://api.github.com/meta"
}
 
locals {
  github_hooks_ips = jsondecode(data.http.cloud_ips.response_body).hooks
}
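
The http data source lives in the hashicorp/http provider, so it helps to declare that provider explicitly, and a failed request is easier to catch with a postcondition than with a jsondecode error later on. The sketch below assumes the 3.x series of the provider (for response_body and status_code) and Terraform 1.2 or later (for postcondition); the github_meta label is hypothetical.

```hcl
terraform {
  required_providers {
    http = {
      source  = "hashicorp/http"
      version = ">= 3.0" # assumption: response_body and status_code need 3.x
    }
  }
}

data "http" "github_meta" {
  url = "https://api.github.com/meta"

  request_headers = {
    Accept = "application/json"
  }

  lifecycle {
    # Stop the run early if the API did not answer with 200 OK.
    postcondition {
      condition     = self.status_code == 200
      error_message = "Unexpected status code from api.github.com/meta."
    }
  }
}

output "github_hook_ranges" {
  value = jsondecode(data.http.github_meta.response_body).hooks
}
```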

Data vs Resources

Aspect                        Resource                        Data Source
Creates infrastructure        Yes                             No
Operation                     Writes (create/update/delete)   Reads only
Removed by terraform destroy  Yes                             No
Used for                      Creating objects                Querying info
Lifecycle                     Managed by Terraform            Read-only lookup
Example                       aws_instance                    aws_ami

Locals vs Data Sources

# Locals: Calculate/define values
locals {
  common_tags = {
    Environment = "prod"
    Team        = "platform"
  }
}
 
# Data sources: Fetch existing infrastructure
data "aws_availability_zones" "available" {
  state = "available"
}
 
# Use both
resource "aws_instance" "web" {
  ami               = "ami-123456" # placeholder AMI ID
  instance_type     = "t2.micro"
  availability_zone = data.aws_availability_zones.available.names[0]
  
  tags = merge(
    local.common_tags,
    { Name = "web-server" }
  )
}

Best Practices

  1. Use data sources for existing infrastructure: look up IDs instead of hardcoding them
  2. Organize resources logically: group by type or component
  3. Use explicit dependencies: add depends_on when references alone do not capture the required order
  4. Document complex resources: explain why certain arguments are set
  5. Use lifecycle rules: set prevent_destroy to guard against accidental destruction in production
  6. Fetch latest versions: use most_recent in data sources when appropriate
  7. Reference, don't copy: use resource references instead of repeating literal values
  8. Validate data source results: check that a lookup returned data before using it
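
Item 8 can be sketched with a postcondition. Some data sources, such as aws_subnets, return an empty list instead of failing when nothing matches, so an explicit check makes the problem visible at plan time. This is a minimal sketch assuming Terraform 1.2 or later; the vpc_id variable and the Tier tag are hypothetical.

```hcl
variable "vpc_id" {
  type        = string
  description = "ID of the VPC to search (hypothetical input)"
}

data "aws_subnets" "private" {
  filter {
    name   = "vpc-id"
    values = [var.vpc_id]
  }

  tags = {
    Tier = "private" # hypothetical tagging convention
  }

  lifecycle {
    # Fail with a clear message instead of passing an empty list downstream.
    postcondition {
      condition     = length(self.ids) > 0
      error_message = "No private subnets found in the given VPC."
    }
  }
}
```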