Terraform Fundamentals - Understanding the anatomy of a project

Learn about the structure of a Terraform project and various blocks that make up the configuration.

Apr 26, 2025

As we continue the Terraform journey, it's important to understand the actual structure of a Terraform configuration. If you’ve followed the previous post, you’ve already worked with a basic main.tf. Now it's time to take a step back and understand what makes up a typical Terraform file, why its structure matters, and how to write clean, reusable infrastructure code.

Terraform project structure

Terraform configurations are written in HashiCorp Configuration Language (HCL), which is readable, declarative, and block-based.

At its simplest, a Terraform project consists of files with the .tf extension. These files can be broken up by purpose (e.g., variables.tf, outputs.tf, main.tf), or written entirely inside a single file for small projects.

A minimal, well-organized directory might look like:

terraform-hello-world/
├── main.tf          # Core infrastructure resources
├── variables.tf     # Inputs like instance type or region
├── outputs.tf       # Outputs like instance IP
└── terraform.tfvars # Values assigned to variables (optional)

Terraform loads all .tf files in a directory and merges them internally. Organizing code into separate files isn't mandatory, but it makes the configuration much easier to read and maintain—especially as the infrastructure grows.

The core building blocks

Terraform configurations are made up of blocks—self-contained sections that declare intent using the HashiCorp Configuration Language. Let’s break down the most important block types:

Provider Block

The provider block tells Terraform which cloud provider or API to use. We’ve already used the AWS provider in the earlier post. Here's a typical example:

provider "aws" {
  region = "us-east-1"
}

This block configures the AWS plugin and tells Terraform which region to target. As a best practice, we can also read the region value from a variable, as shown below.

provider "aws" {
  region = var.aws_region
}
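The full configuration later in this post uses two providers: aws, and http for the public IP lookup. One way to declare where each provider comes from, and which versions are acceptable, is a terraform block with required_providers. The sketch below is not part of the original configuration, and the version constraints are assumptions for illustration; the response_body attribute used later needs the 3.x series of the http provider.

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumed constraint; pin to a version you have tested
    }
    http = {
      source  = "hashicorp/http"
      version = "~> 3.0" # assumed; response_body is available from 3.0 onward
    }
  }
}
```

Running terraform init after adding or changing this block downloads the matching provider plugins.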

Resource Block

Resources are the heart of infrastructure provisioning via Terraform. Each resource block describes a piece of infrastructure to be created, like an EC2 instance or S3 bucket. The structure looks like this:

resource "<provider>_<type>" "<name>" {
  # configuration
}

The following is the resource block we used in the configuration from the last post:

resource "aws_instance" "hello" {
  ami           = "ami-0e449927258d45bc4" # Amazon Linux 2 AMI in us-east-1
  instance_type = "t2.micro"

  tags = {
    Name = "TerraformHelloWorld"
  }
}

We can further modify this resource block to read values from variables instead of hard-coding them, and to enable password-based SSH (through user_data and a security group) so that we can log in to the EC2 instance:

resource "aws_instance" "hello" {
  ami           = var.ami_id
  instance_type = var.instance_type

  user_data = <<-EOF
    #!/bin/bash
    echo 'ec2-user:YourSecurePassword123' | chpasswd
    sed -i 's/PasswordAuthentication no/PasswordAuthentication yes/' /etc/ssh/sshd_config
    systemctl restart sshd
  EOF

  vpc_security_group_ids = [aws_security_group.allow_ssh.id]

  tags = {
    Name = "TerraformHelloWorld"
  }
}

This block creates a simple EC2 instance with password authentication configured using user_data. While password access isn't secure for production, I am using it here for demonstration purposes only.

Variable Block

Variables allow us to parameterize configuration, making it more flexible and reusable. We define variables using the variable block, optionally providing a default value and description. Below I have defined variable blocks for aws_region, ami_id and instance_type variables:

variable "aws_region" {
  description = "AWS region to deploy resources"
  default     = "us-east-1"
}

variable "ami_id" {
  description = "The AMI to use for EC2"
  default     = "ami-0e449927258d45bc4" # Amazon Linux 2 AMI in us-east-1
}

variable "instance_type" {
  description = "EC2 instance type"
  default     = "t2.micro"
}

A default can be overridden from the CLI (-var or -var-file), from a terraform.tfvars file, or through a TF_VAR_<name> environment variable. If a variable has no default and none of these supplies a value, Terraform prompts for it interactively.
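Variable blocks can also carry a type and a validation rule, so that a bad value is rejected before anything is provisioned. The following is a minimal sketch of that idea for instance_type; the allowed list is an assumption made for this demo, not a restriction imposed by AWS or by the original configuration.

```hcl
variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t2.micro"

  validation {
    # The allowed list is a demo assumption; adjust it to your needs
    condition     = contains(["t2.micro", "t3.micro"], var.instance_type)
    error_message = "The instance_type must be t2.micro or t3.micro for this demo."
  }
}

# Example override from the CLI:
#   terraform apply -var="instance_type=t3.micro"
```

With this in place, a value outside the list makes terraform plan fail with the error message above instead of reaching the AWS API.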

Output Block

After Terraform provisions resources, we can use output blocks to extract useful information like public IP addresses, resource IDs, or login details.

output "public_ip" {
  description = "Public IP of the EC2 instance"
  value       = aws_instance.hello.public_ip
}

These outputs show up at the end of terraform apply. This makes it easy to retrieve values for later use, such as connecting to an instance via SSH.
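Outputs can also combine attributes into a ready-to-use string. As a small sketch, the output below builds the SSH command for the instance; the output name ssh_command is hypothetical and not part of the original configuration.

```hcl
output "ssh_command" {
  description = "Convenience command for connecting to the instance"
  value       = "ssh -l ec2-user ${aws_instance.hello.public_ip}"
}
```

After an apply, terraform output -raw ssh_command prints just the command, without quotes.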

Data Block

A data block in Terraform is used to fetch information about existing infrastructure. This is extremely useful when dealing with resources already provided by AWS, like the default VPC.

Let’s say we want our EC2 instance to launch inside the default VPC. Instead of manually finding and pasting the VPC ID (which can change across accounts or regions), we can let Terraform find it:

data "aws_vpc" "default" {
  default = true
}

This tells Terraform:

“Get the default VPC in the current region, and make its properties available to me.”

We can then reference it later using:
data.aws_vpc.default.id
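A data source exposes more than the ID. As an illustration, the sketch below surfaces two attributes of the default VPC as outputs; the output names are hypothetical and only there to show the reference syntax.

```hcl
output "default_vpc_id" {
  description = "ID of the default VPC found by the data source"
  value       = data.aws_vpc.default.id
}

output "default_vpc_cidr" {
  description = "Primary CIDR block of the default VPC"
  value       = data.aws_vpc.default.cidr_block
}
```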

Another use case is getting our public IP so we can configure the security group to only allow our IP. We can use the HTTP data source like this:

data "http" "my_ip" {
  url = "https://checkip.amazonaws.com/"
}

Terraform will fetch the IP and store it in data.http.my_ip.response_body.
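If the lookup fails, the response body may not be an IP address at all. One possible safeguard, sketched here as a variant of the block above rather than something the original configuration does, is a postcondition on the HTTP status code. This assumes Terraform 1.2 or newer, where lifecycle postconditions are supported.

```hcl
data "http" "my_ip" {
  url = "https://checkip.amazonaws.com/"

  lifecycle {
    postcondition {
      # Fail the plan early if the service did not answer with 200 OK
      condition     = self.status_code == 200
      error_message = "Could not determine the public IP: unexpected HTTP status."
    }
  }
}
```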

Locals Block

A locals block defines reusable values or expressions in Terraform. Think of it like defining variables for intermediate logic that don’t need to be exposed outside the project.

In our example, once we have the public IP from the data.http.my_ip block, we’ll want to format it properly as a CIDR block (x.x.x.x/32) for the security group. We use a locals block for that:

locals {
  my_ip_cidr = "${trim(data.http.my_ip.response_body, "\n")}/32"
}

This:

Trims newline characters from both ends of the IP response (trim removes only the characters listed in its second argument, here \n)
Appends /32 to restrict access to only your specific IP

Now we can use local.my_ip_cidr anywhere in the configuration—making it more readable.
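An alternative way to write the same locals block is sketched below. It uses the built-in chomp function, which strips trailing newline characters, and splits the work into two named steps. The intermediate name my_ip is an assumption for illustration; this would replace the locals block above, not sit alongside it.

```hcl
locals {
  # chomp() removes newline characters from the end of the string only
  my_ip      = chomp(data.http.my_ip.response_body)
  my_ip_cidr = "${local.my_ip}/32"
}
```

Locals can refer to other locals, as my_ip_cidr does here, which keeps each expression short.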

With a better understanding of the structure of a Terraform configuration, let’s update the configuration from the previous post to allow password-based SSH access and to output the public IP of the EC2 instance.

The following permissions need to be added to the TerraformEC2Access IAM policy before using the updated configuration:
ec2:DescribeVpcAttribute
ec2:CreateSecurityGroup
ec2:RevokeSecurityGroupEgress
ec2:DescribeNetworkInterfaces
ec2:DeleteSecurityGroup
ec2:AuthorizeSecurityGroupIngress
ec2:AuthorizeSecurityGroupEgress

variable "aws_region" {
  description = "AWS region to deploy resources"
  default     = "us-east-1"
}

variable "ami_id" {
  description = "The AMI to use for EC2"
  default     = "ami-0e449927258d45bc4" # Amazon Linux 2 AMI in us-east-1
}

variable "instance_type" {
  description = "EC2 instance type"
  default     = "t2.micro"
}

provider "aws" {
  region = var.aws_region
}

# Fetch default VPC
data "aws_vpc" "default" {
  default = true
}

# Get your public IP address
data "http" "my_ip" {
  url = "https://checkip.amazonaws.com/"
}

# Create a local variable to hold the IP in CIDR format
locals {
  my_ip_cidr = "${trim(data.http.my_ip.response_body, "\n")}/32"
}

resource "aws_instance" "hello" {
  ami           = var.ami_id
  instance_type = var.instance_type

  user_data = <<-EOF
    #!/bin/bash
    echo 'ec2-user:YourSecurePassword123' | chpasswd
    sed -i 's/PasswordAuthentication no/PasswordAuthentication yes/' /etc/ssh/sshd_config
    systemctl restart sshd
  EOF

  vpc_security_group_ids = [aws_security_group.allow_ssh.id]

  tags = {
    Name = "TerraformHelloWorld"
  }
}

resource "aws_security_group" "allow_ssh" {
  name        = "allow_ssh"
  description = "Allow SSH from my IP"
  vpc_id      = data.aws_vpc.default.id

  ingress {
    description = "SSH"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [local.my_ip_cidr]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

output "public_ip" {
  description = "Public IP of the EC2 instance"
  value       = aws_instance.hello.public_ip
}

Once provisioning is complete, you can log in to the newly created EC2 instance using the IP address printed by terraform apply and the credentials ec2-user:YourSecurePassword123.

ssh -l ec2-user <IP Address of EC2 Instance>

Now, let's break up the above configuration into the file structure described earlier. You can also find the split version in the 100 Days of Red Team GitHub repository.

variables.tf

variable "aws_region" {
  description = "AWS region to deploy resources"
}

variable "ami_id" {
  description = "The AMI to use for EC2"
}

variable "instance_type" {
  description = "EC2 instance type"
}

terraform.tfvars

aws_region    = "us-east-1"
ami_id        = "ami-0e449927258d45bc4" # Amazon Linux 2 AMI in us-east-1
instance_type = "t2.micro"

main.tf

provider "aws" {
  region = var.aws_region
}

# Fetch default VPC
data "aws_vpc" "default" {
  default = true
}

# Get your public IP address
data "http" "my_ip" {
  url = "https://checkip.amazonaws.com/"
}

# Create a local variable to hold the IP in CIDR format
locals {
  my_ip_cidr = "${trim(data.http.my_ip.response_body, "\n")}/32"
}

resource "aws_instance" "hello" {
  ami           = var.ami_id
  instance_type = var.instance_type

  user_data = <<-EOF
    #!/bin/bash
    echo 'ec2-user:YourSecurePassword123' | chpasswd
    sed -i 's/PasswordAuthentication no/PasswordAuthentication yes/' /etc/ssh/sshd_config
    systemctl restart sshd
  EOF

  vpc_security_group_ids = [aws_security_group.allow_ssh.id]

  tags = {
    Name = "TerraformHelloWorld"
  }
}

resource "aws_security_group" "allow_ssh" {
  name        = "allow_ssh"
  description = "Allow SSH from my IP"
  vpc_id      = data.aws_vpc.default.id

  ingress {
    description = "SSH"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [local.my_ip_cidr]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

outputs.tf

output "public_ip" {
  description = "Public IP of the EC2 instance"
  value       = aws_instance.hello.public_ip
}

Best Practices for Writing Terraform Files

Use variables instead of hardcoding values.
Output values that help you interact with resources (like IPs or resource IDs), and mark any output that contains a secret as sensitive.
Split files for clarity, especially in larger projects.
Comment your configuration with #, //, or /* */.
Avoid using real secrets directly—use environment variables or secret managers for production.
Stick to a consistent naming convention for files and variables.
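As a sketch of the point about secrets, the demo password could be moved out of the configuration and into a sensitive variable supplied through the environment. The names ssh_password and user_data below are hypothetical and not part of the original configuration. Note that a sensitive variable is only redacted in CLI output: the value still ends up in the state file and in the instance's user data.

```hcl
variable "ssh_password" {
  description = "Password for ec2-user (demo only)"
  type        = string
  sensitive   = true # redacts the value in plan and apply output
}

locals {
  # Same script as in main.tf, with the password interpolated.
  # Assumes the password contains no single quotes.
  user_data = <<-EOF
    #!/bin/bash
    echo 'ec2-user:${var.ssh_password}' | chpasswd
    sed -i 's/PasswordAuthentication no/PasswordAuthentication yes/' /etc/ssh/sshd_config
    systemctl restart sshd
  EOF
}

# Supply the value without writing it to a file:
#   export TF_VAR_ssh_password='...'
#   terraform apply
```

The aws_instance resource would then set user_data = local.user_data instead of embedding the script inline.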

TL;DR
This post covered the core structure of a Terraform configuration: .tf files and the key block types, namely provider, resource, variable, output, data, and locals.