
Terraform From Scratch: Provision AWS Infrastructure Step by Step

Karandeep Singh
• 16 minute read

Summary

Learn Terraform with AWS step by step. Start with one S3 bucket, hit real errors, fix them, then build a full VPC + EC2 setup. Each step shows the command, the output, and the mistake.

Every AWS project starts the same way: click around the console, create some resources, forget what you created, then panic when the bill arrives. Terraform fixes this. You describe your infrastructure in files, run one command, and AWS builds exactly what you described. Run another command and it all gets destroyed. No leftover resources, no mystery bills.

We’ll start with the absolute minimum, one S3 bucket, and work up to a VPC with an EC2 instance you can SSH into. Each step adds exactly one concept. We’ll hit real errors along the way and fix them.

What We’re Building

By the end of this article you will have:

  • An S3 bucket created and destroyed with Terraform
  • A VPC with a public subnet and internet access
  • An EC2 instance inside that VPC you can SSH into
  • Variables and outputs to keep your config clean

The journey:

  1. Install Terraform and verify AWS credentials
  2. Create an S3 bucket (your first resource)
  3. Break the state file and learn why it matters
  4. Destroy everything with one command
  5. Create a VPC (just the network, nothing else)
  6. Try to launch an EC2 instance (and fail because there’s no internet)
  7. Fix it with an internet gateway and route table
  8. Add SSH access and actually connect
  9. Extract variables so nothing is hardcoded
  10. Clean up

Prerequisites

  • AWS CLI configured (aws sts get-caller-identity should work)
  • An AWS account with permissions for S3, VPC, and EC2
  • A terminal (Linux, macOS, or WSL on Windows)

Step 1: Install Terraform

What: Download the Terraform binary and put it in your PATH.

Why: Terraform is a single binary with no dependencies. No package managers, no runtimes, no frameworks.

Download from the official releases page. On Linux (on macOS, swap linux_amd64 for darwin_arm64, or install via Homebrew):

curl -LO https://releases.hashicorp.com/terraform/1.7.5/terraform_1.7.5_linux_amd64.zip
unzip terraform_1.7.5_linux_amd64.zip
sudo mv terraform /usr/local/bin/

Verify:

terraform version
Terraform v1.7.5
on linux_amd64

Terraform reads your AWS credentials from the same place the AWS CLI does: environment variables, ~/.aws/credentials, or an instance profile. If aws sts get-caller-identity works, Terraform will too. No extra configuration needed.
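
If you use named profiles, Terraform honors AWS_PROFILE exactly like the CLI does (the profile name here is illustrative):

export AWS_PROFILE=my-profile
aws sts get-caller-identity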

Step 2: Your First Resource (an S3 Bucket)

What: Create a single S3 bucket from a 6-line config file.

Why: This is the Terraform “hello world.” One provider block, one resource block. Nothing else.

Create a project directory:

mkdir terraform-aws-lab && cd terraform-aws-lab

main.tf

provider "aws" {
  region = "us-east-1"
}

resource "aws_s3_bucket" "my_bucket" {
  bucket = "my-terraform-test-bucket-12345"
}

That’s it. Two blocks: a provider tells Terraform which cloud to talk to, and a resource tells it what to create. The "my_bucket" part is a local name so we can reference this resource later.
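
As a preview of that reference syntax, here's how another block would read the bucket's attributes (an optional output, not needed for this step):

output "bucket_arn" {
  value = aws_s3_bucket.my_bucket.arn
}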

Now the three commands you’ll run hundreds of times:

terraform init
Initializing the backend...
Initializing provider plugins...
- Finding latest version of hashicorp/aws...
- Installing hashicorp/aws v5.88.0...
Terraform has been successfully initialized!

init downloads the AWS provider plugin. You run it once per project, and again whenever you add or change providers.

terraform plan
Terraform will perform the following actions:

  # aws_s3_bucket.my_bucket will be created
  + resource "aws_s3_bucket" "my_bucket" {
      + bucket = "my-terraform-test-bucket-12345"
      + id     = (known after apply)
      + arn    = (known after apply)
      ...
    }

Plan: 1 to add, 0 to change, 0 to destroy.

plan shows what will happen without doing anything. The + means “will be created.” Read this output every time. It is the diff between what exists in AWS and what your files describe.
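
A habit worth forming early: save the plan to a file and apply exactly that plan, so nothing can change between review and execution:

terraform plan -out=tfplan
terraform apply tfplan

For a lab like this, applying directly is fine: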

terraform apply
Do you want to perform these actions?
  Only 'yes' will be accepted to approve.

  Enter a value: yes

aws_s3_bucket.my_bucket: Creating...
aws_s3_bucket.my_bucket: Creation complete after 2s [id=my-terraform-test-bucket-12345]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Verify it’s real:

aws s3 ls | grep terraform
2026-02-16 10:05:12 my-terraform-test-bucket-12345

You just created AWS infrastructure from a text file. Six lines of config, three commands, one bucket.

Step 3: Break the State File (On Purpose)

What: Delete the state file and watch Terraform lose its memory.

Why: Understanding the state file is the single most important Terraform concept. If you skip this, you will lose production infrastructure someday.

Run terraform apply again without changing anything:

terraform apply
No changes. Your infrastructure matches the configuration.
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

Terraform compared your main.tf to the state file (terraform.tfstate) that it wrote during the first apply. They match, so it did nothing. Good.
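
You can inspect that memory at any time:

terraform state list
aws_s3_bucket.my_bucket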

Now delete the state file and try again:

rm terraform.tfstate terraform.tfstate.backup
terraform apply
aws_s3_bucket.my_bucket: Creating...

│ Error: creating Amazon S3 (Simple Storage) Bucket (my-terraform-test-bucket-12345):
│   BucketAlreadyOwnedByYou: Your previous request to create the named bucket
│   succeeded and you already own it.

What happened: Without the state file, Terraform has no idea the bucket exists. It tries to create it from scratch, and AWS says “you already have that one.”

The fix: Import the existing resource back into state:

terraform import aws_s3_bucket.my_bucket my-terraform-test-bucket-12345
aws_s3_bucket.my_bucket: Importing from ID "my-terraform-test-bucket-12345"...
Import successful!

Now terraform plan shows no changes again. Terraform’s memory is restored.
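
Terraform 1.5+ also offers a declarative alternative to the CLI command: an import block in your config, which pulls the resource into state on the next apply:

import {
  to = aws_s3_bucket.my_bucket
  id = "my-terraform-test-bucket-12345"
}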

Step 4: Destroy Everything

What: Delete the S3 bucket with one command.

Why: This is the other half of Terraform’s value. Creating resources is easy. Cleaning them up without leaving orphans is where most people fail.

terraform destroy
Terraform will perform the following actions:

  # aws_s3_bucket.my_bucket will be destroyed
  - resource "aws_s3_bucket" "my_bucket" {
      - bucket = "my-terraform-test-bucket-12345" -> null
      ...
    }

Plan: 0 to add, 0 to change, 1 to destroy.

  Enter a value: yes

aws_s3_bucket.my_bucket: Destroying...
aws_s3_bucket.my_bucket: Destruction complete after 1s

Destroy complete! Resources: 1 destroyed.

Verify:

aws s3 ls | grep terraform

Nothing. The bucket is gone. No orphaned resources, no mystery charges next month.

Step 5: Create a VPC (Just the Network)

What: Create a VPC with a subnet. Nothing else yet.

Why: Before launching any EC2 instance, you need a network. A VPC is your own private slice of AWS. We’ll build this incrementally so you can see what each piece does.

Replace your entire main.tf:

main.tf

provider "aws" {
  region = "us-east-1"
}

resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true

  tags = {
    Name = "terraform-lab-vpc"
  }
}

resource "aws_subnet" "public" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.1.0/24"
  map_public_ip_on_launch = true
  availability_zone       = "us-east-1a"

  tags = {
    Name = "terraform-lab-public"
  }
}

Two things to notice. The subnet references the VPC with aws_vpc.main.id. Terraform reads this reference, figures out it needs to create the VPC first, and builds the dependency graph automatically. You never tell Terraform what order to create things.
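
Terraform can even print that dependency graph in DOT format; rendering it to an image requires Graphviz:

terraform graph | dot -Tpng > graph.png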

The map_public_ip_on_launch = true means any instance launched in this subnet gets a public IP. We’ll need that later for SSH.

Apply (no need to run terraform init again; destroying resources clears the state, not the provider plugin cached in .terraform/):
terraform apply
Plan: 2 to add, 0 to change, 0 to destroy.

aws_vpc.main: Creating...
aws_vpc.main: Creation complete after 2s [id=vpc-0a1b2c3d4e5f67890]
aws_subnet.public: Creating...
aws_subnet.public: Creation complete after 1s [id=subnet-0a1b2c3d4e]

Apply complete! Resources: 2 added, 0 changed, 0 destroyed.

Two resources. The VPC exists, the subnet exists inside it. But there’s a problem we can’t see yet. This subnet has no route to the internet. Any instance we launch here will have a public IP that doesn’t actually work.

Let’s prove it.

Step 6: Launch an EC2 Instance (and Fail)

What: Add an EC2 instance to the VPC and try to reach it.

Why: This is the mistake almost everyone makes. You create a VPC, launch an instance, and then wonder why you can’t connect. The answer is always: no internet gateway, no route table, or no security group.

Add this to the bottom of main.tf:

data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["al2023-ami-*-x86_64"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.amazon_linux.id
  instance_type = "t3.micro"
  subnet_id     = aws_subnet.public.id

  tags = {
    Name = "terraform-lab-instance"
  }
}

The data block is not a resource. It's a lookup: Terraform queries AWS for the most recent standard Amazon Linux 2023 AMI (the 2023* name pattern skips the -minimal variants) and uses its ID. This way your config doesn't contain a hardcoded AMI ID that goes stale.
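
If you're curious which AMI the lookup will resolve to, the same query works from the AWS CLI (this mirrors the data block above):

aws ec2 describe-images \
  --owners amazon \
  --filters "Name=name,Values=al2023-ami-2023*-x86_64" \
  --query 'sort_by(Images, &CreationDate)[-1].[ImageId,Name]' \
  --output text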

Apply:

terraform apply
Plan: 1 to add, 0 to change, 0 to destroy.

aws_instance.web: Creating...
aws_instance.web: Still creating... [10s elapsed]
aws_instance.web: Creation complete after 13s [id=i-0abc123def456]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

The instance is running. Let’s grab its public IP and try to ping it:

terraform show | grep public_ip
public_ip = "54.123.45.67"
ping -c 3 54.123.45.67
PING 54.123.45.67 (54.123.45.67): 56 data bytes
--- 54.123.45.67 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss

100% packet loss. The instance has a public IP but no way to reach the internet. Two things are missing:

  1. An internet gateway attached to the VPC
  2. A route table that sends 0.0.0.0/0 traffic to that gateway

And even if those existed, there’s no security group allowing inbound traffic. We have three problems to fix, one at a time.

Step 7: Fix Networking (Internet Gateway + Route Table)

What: Add an internet gateway and a route table so traffic can flow in and out.

Why: A VPC is isolated by default. That’s the entire point. You have to explicitly punch a hole to the internet with a gateway and tell the subnet to use it via a route table.

Add these three resources to main.tf, right after the subnet block:

resource "aws_internet_gateway" "gw" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "terraform-lab-igw"
  }
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.gw.id
  }

  tags = {
    Name = "terraform-lab-public-rt"
  }
}

resource "aws_route_table_association" "public" {
  subnet_id      = aws_subnet.public.id
  route_table_id = aws_route_table.public.id
}

The internet gateway attaches to the VPC. The route table says “anything headed for 0.0.0.0/0 (the internet) should go through the gateway.” The association links this route table to our subnet.

Apply:

terraform apply
Plan: 3 to add, 0 to change, 0 to destroy.

aws_internet_gateway.gw: Creating...
aws_internet_gateway.gw: Creation complete after 1s
aws_route_table.public: Creating...
aws_route_table.public: Creation complete after 1s
aws_route_table_association.public: Creating...
aws_route_table_association.public: Creation complete after 0s

Apply complete! Resources: 3 added, 0 changed, 0 destroyed.

Notice something: Terraform only created the 3 new resources. It didn’t touch the VPC, subnet, or EC2 instance. It knows those already exist and nothing about them changed.

But ping still won’t work. The network path is now open, but the instance has no security group allowing inbound ICMP or SSH. On to the next fix.

Step 8: Add SSH Access (Security Group + Key Pair)

What: Create a security group that allows SSH from your IP, then rebuild the instance with it.

Why: AWS blocks all inbound traffic by default. You need a security group rule that explicitly allows port 22 from your IP address.

First, find your public IP:

curl -s https://checkip.amazonaws.com
203.0.113.10

Add a security group to main.tf:

resource "aws_security_group" "ssh" {
  name        = "terraform-lab-ssh"
  description = "Allow SSH from my IP"
  vpc_id      = aws_vpc.main.id

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["203.0.113.10/32"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "terraform-lab-ssh"
  }
}

The /32 means “this exact IP, nobody else.” Never use 0.0.0.0/0 for SSH. Bots scan for open port 22 constantly.

Now update the EC2 instance to use this security group and add a key pair. Change the aws_instance block:

main.tf (updated instance block)

resource "aws_instance" "web" {
  ami                    = data.aws_ami.amazon_linux.id
  instance_type          = "t3.micro"
  subnet_id              = aws_subnet.public.id
  vpc_security_group_ids = [aws_security_group.ssh.id]
  key_name               = "my-key"

  tags = {
    Name = "terraform-lab-instance"
  }
}

Two changes: vpc_security_group_ids attaches the security group, and key_name references an existing EC2 key pair for SSH. You need to have this key pair already created in AWS. If you don’t have one:

aws ec2 create-key-pair --key-name my-key --query 'KeyMaterial' --output text > ~/.ssh/my-key.pem
chmod 400 ~/.ssh/my-key.pem
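
If you'd rather not manage the key outside Terraform, here's a sketch that lets Terraform create the key pair itself, using the hashicorp/tls provider (resource names are illustrative; you'd still need to write the private key to disk to use it with ssh):

resource "tls_private_key" "lab" {
  algorithm = "ED25519"
}

resource "aws_key_pair" "lab" {
  key_name   = "terraform-lab-key"
  public_key = tls_private_key.lab.public_key_openssh
}

With that, the instance would use key_name = aws_key_pair.lab.key_name instead of a hardcoded string. We'll stick with the CLI-created key for this walkthrough.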

Apply:

terraform apply
Plan: 2 to add, 0 to change, 1 to destroy.

  # aws_security_group.ssh will be created
  + resource "aws_security_group" "ssh" { ... }

  # aws_instance.web must be replaced
-/+ resource "aws_instance" "web" {
      + key_name               = "my-key"   # forces replacement
      ~ vpc_security_group_ids = [ ... ]
      ...
    }

Read the plan carefully. The security group gets created (+). Attaching it alone would have been an in-place update (~): Terraform is smart enough to know that changing a security group doesn't require a new instance. The key pair is different. Adding key_name to an existing instance requires destroying and recreating it, so the instance is marked -/+ (must be replaced). Terraform tells you this before doing anything. This is why you always read the plan.

Type yes. After it finishes:

terraform show | grep public_ip
public_ip = "54.198.22.33"

The IP changed because the instance was recreated. Now SSH:

ssh -i ~/.ssh/my-key.pem ec2-user@54.198.22.33
   ,     #_
   ~\_  ####_        Amazon Linux 2023
  ~~  \_#####\
  ~~     \###|
  ~~       \#/ ___   https://aws.amazon.com/linux/amazon-linux-2023
   ~~       V~' '->
    ~~~         /
      ~~._.   _/
         _/ _/
       _/m/'
[ec2-user@ip-10-0-1-42 ~]$

You’re in. A VPC, subnet, internet gateway, route table, security group, and EC2 instance, all from one text file.

Step 9: Extract Variables (Stop Hardcoding)

What: Move hardcoded values into a variables file.

Why: Right now your IP address and key pair name are hardcoded in main.tf. That’s fine for one person experimenting, but the moment someone else needs to use this config, or you need to deploy to a different region, everything breaks.

Create a new file:

variables.tf

variable "aws_region" {
  description = "AWS region to deploy into"
  type        = string
  default     = "us-east-1"
}

variable "my_ip" {
  description = "Your public IP in CIDR notation (e.g. 203.0.113.10/32)"
  type        = string
}

variable "key_pair_name" {
  description = "Name of an existing EC2 key pair for SSH access"
  type        = string
}

Now update main.tf to use these variables. Change the provider:

provider "aws" {
  region = var.aws_region
}

Change the security group ingress CIDR:

  cidr_blocks = [var.my_ip]

Change the key name on the instance:

  key_name = var.key_pair_name

Create a values file so you don’t have to type them every time:

terraform.tfvars

my_ip         = "203.0.113.10/32"
key_pair_name = "my-key"

Terraform reads terraform.tfvars automatically. You can also set or override any variable ad hoc with a -var flag or a TF_VAR_ environment variable (the values below are examples):
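
terraform apply -var="aws_region=us-west-2"
TF_VAR_my_ip="198.51.100.7/32" terraform apply

Now create one more file for outputs: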

outputs.tf

output "instance_public_ip" {
  description = "Public IP of the EC2 instance"
  value       = aws_instance.web.public_ip
}

output "ssh_command" {
  description = "SSH command to connect"
  value       = "ssh -i ~/.ssh/${var.key_pair_name}.pem ec2-user@${aws_instance.web.public_ip}"
}

output "vpc_id" {
  description = "ID of the VPC"
  value       = aws_vpc.main.id
}

Apply to verify nothing changed:

terraform apply
No changes. Your infrastructure matches the configuration.

Outputs:

instance_public_ip = "54.198.22.33"
ssh_command = "ssh -i ~/.ssh/my-key.pem ec2-user@54.198.22.33"
vpc_id = "vpc-0a1b2c3d4e5f67890"

No changes. We refactored the config without touching the infrastructure. The outputs now print useful information after every apply.

Four files, four concerns: main.tf (resources), variables.tf (inputs), outputs.tf (results), terraform.tfvars (your values). This is the standard Terraform project layout you’ll use for every project.
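
A matching .gitignore keeps state and local values out of version control (many teams also exclude *.tfvars, since it can hold IPs or secrets):

.terraform/
terraform.tfstate
terraform.tfstate.backup
*.tfvars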

Step 10: Clean Up

terraform destroy
Plan: 0 to add, 0 to change, 7 to destroy.

Do you really want to destroy all resources?
  Only 'yes' will be accepted to approve.

  Enter a value: yes

aws_route_table_association.public: Destroying...
aws_instance.web: Destroying...
...
aws_vpc.main: Destruction complete after 1s

Destroy complete! Resources: 7 destroyed.

Seven resources created and destroyed cleanly. Terraform figured out the reverse dependency order: the instance and route table association first, then the security group and route table, then the gateway, then the subnet, then the VPC. No orphans, no mystery charges.

What We Built

Starting from a 6-line config that created one S3 bucket, we incrementally built:

  1. S3 bucket: the Terraform “hello world.” init, plan, apply
  2. State file lesson: deleted it on purpose, watched Terraform lose its memory, recovered with terraform import
  3. VPC + subnet: a private network with no internet access
  4. EC2 instance that couldn’t be reached: proved that a VPC is isolated by default
  5. Internet gateway + route table: opened the path to the internet
  6. Security group + key pair: allowed SSH from one IP. Learned that changing key_name forces instance replacement
  7. Variables and outputs: stopped hardcoding, standard project layout

Three of those steps were deliberate mistakes. That’s not an accident. The mistakes are how you learn what each piece actually does.

Common Mistakes

1. Forgetting to destroy. Terraform resources cost money. If you’re experimenting, always terraform destroy when you’re done. Set a calendar reminder if you have to.

2. Committing terraform.tfstate to Git. The state file contains resource IDs and sometimes sensitive data. Add it to .gitignore. For teams, use remote state with S3 and DynamoDB locking.

3. Opening SSH to 0.0.0.0/0. We used var.my_ip to lock SSH to one IP. Bots scan for open port 22 within minutes of an instance launching.

4. Not reading the plan. terraform plan is your safety net. If it says “1 to destroy” and you expected “0 to destroy,” stop and figure out why before typing yes.

5. Hardcoding everything. Your first config will have hardcoded regions, IPs, and instance types. The moment you need a second environment, all of it breaks. Use variables from the start.

What’s Next

This covers single-file projects with local state. In real projects, you’ll add:

  • Remote state with an S3 backend and DynamoDB locking for team collaboration (a minimal sketch follows this list)
  • Modules to package and reuse infrastructure patterns (VPC module, EC2 module)
  • Workspaces or directory structures for dev/staging/production
  • terraform fmt and terraform validate in your CI pipeline
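
A minimal remote-state sketch, assuming the state bucket and DynamoDB lock table already exist (all names below are placeholders):

terraform {
  backend "s3" {
    bucket         = "my-terraform-state-bucket"
    key            = "terraform-aws-lab/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}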

If you’re already using AWS services like S3, Lambda, DynamoDB, or SQS, Terraform lets you provision the underlying infrastructure from code instead of clicking through the console.

Cheat Sheet

Copy-paste reference for Terraform + AWS.

Initialize a project:

terraform init

Preview changes:

terraform plan

Apply changes:

terraform apply

Destroy everything:

terraform destroy

Format your code:

terraform fmt

Validate syntax:

terraform validate

Import existing resource:

terraform import aws_s3_bucket.my_bucket my-bucket-name

Show current state:

terraform show

List resources in state:

terraform state list

Key rules to remember:

  • terraform init downloads providers. Run it once per project or after changing providers
  • Always run terraform plan before terraform apply. Read the diff
  • terraform.tfstate is Terraform’s memory. Lose it and Terraform forgets what it created
  • Resources reference each other by type and name: aws_vpc.main.id
  • data sources read existing resources without creating them (like the AMI lookup)
  • + means create, ~ means update in-place, -/+ means destroy and recreate. Read these symbols
  • Variables go in variables.tf, outputs in outputs.tf, values in terraform.tfvars
  • Add terraform.tfstate, terraform.tfstate.backup, and .terraform/ to .gitignore
  • Tags cost nothing and save hours of debugging. Tag everything (see the default_tags sketch below)
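
The AWS provider can even tag every resource for you. A minimal sketch using its default_tags block (tag keys and values here are illustrative):

provider "aws" {
  region = "us-east-1"

  default_tags {
    tags = {
      Project   = "terraform-aws-lab"
      ManagedBy = "terraform"
    }
  }
}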

Question

What's the first piece of AWS infrastructure you'd put under Terraform control? The thing you're most tired of clicking through the console to set up?
