Use Terraformer to import existing infrastructure resources into Terraform

My name is Teraoka, and I am an infrastructure engineer.

In the previous article, I introduced the terraform import command as a way to bring existing resources under Terraform management.

How to import existing infrastructure resources with Terraform

As mentioned in that article's summary,
the import command only rewrites the tfstate, so
you still have to write the tf files yourself while checking them against the tfstate.
With a large number of resources, this takes an incredibly long time, which is a problem.
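As a rough sketch of that manual flow (the resource name and VPC ID below are placeholders, not taken from any real environment): terraform import only fills in the tfstate, and a matching block still has to be authored by hand until terraform plan shows no diff.

```hcl
# terraform import aws_vpc.main vpc-XXXXXXXX   <- only rewrites tfstate
# The block below must then be written by hand, attribute by attribute,
# while checking against what "terraform state show aws_vpc.main" reports.
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_support   = true
  enable_dns_hostnames = true
}
```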

I have good news:
a tool called terraformer has been released as open-source software.

https://github.com/GoogleCloudPlatform/terraformer

CLI tool to generate terraform files from existing infrastructure (reverse Terraform). Infrastructure to Code

As described above, it is a CLI tool that automatically generates Terraform files from existing infrastructure.
Usage instructions are also available on GitHub, so let's try it out.

Install

On a Mac, you can install it with the brew command:

$ brew install terraformer
$ terraformer version
Terraformer v0.8.7

So far so simple.
We'll use v0.8.7.

Infrastructure Configuration

I created the infrastructure to be imported in advance using Terraform.

https://github.com/beyond-teraoka/terraform-aws-multi-environment-sample

Configuration diagram

I have three environments in the same AWS account:

  1. develop
  2. production
  3. manage

Additionally, each environment has the following resources:

  1. VPC
  2. Subnet
  3. Route Table
  4. Internet Gateway
  5. NAT Gateway
  6. Security Group
  7. VPC Peering
  8. EIP
  9. EC2
  10. ALB
  11. RDS

Although not shown in the diagram, the resources in each environment are tagged with an Environment tag,
with values set to dev, prod, and mng, respectively.

Prepare authentication information

Prepare your AWS credentials.
Please prepare them according to your environment.

$ cat /Users/yuki.teraoka/.aws/credentials
[beyond-poc]
aws_access_key_id = XXXXXXXXXXXXXXXXXXXX
aws_secret_access_key = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
[beyond-poc-admin]
role_arn = arn:aws:iam::XXXXXXXXXXXX:role/XXXXXXXXXXXXXXXXXXXXXX
source_profile = beyond-poc

Running terraformer

First, try running it as per the example on GitHub.

$ terraformer import aws --resources=alb,ec2_instance,eip,ebs,igw,nat,rds,route_table,sg,subnet,vpc,vpc_peering --regions=ap-northeast-1 --profile=beyond-poc-admin
2020/05/05 18:29:14 aws importing region ap-northeast-1
2020/05/05 18:29:14 aws importing... vpc
2020/05/05 18:29:15 open /Users/yuki.teraoka/.terraform.d/plugins/darwin_amd64: no such file or directory

It seems that Terraformer needs the plugin directory that terraform init creates.
Prepare an init.tf and run init.

$ echo 'provider "aws" {}' > init.tf
$ terraform init

I'll try running it again.

$ terraformer import aws --resources=alb,ec2_instance,eip,ebs,igw,nat,rds,route_table,sg,subnet,vpc,vpc_peering --regions=ap-northeast-1 --profile=beyond-poc-admin

It looks like the import was successful.
A directory called generated has been created, which wasn't there before.

Directory Structure

$ tree
.
└── aws
    ├── alb
    │   ├── lb.tf
    │   ├── lb_listener.tf
    │   ├── lb_target_group.tf
    │   ├── lb_target_group_attachment.tf
    │   ├── outputs.tf
    │   ├── provider.tf
    │   ├── terraform.tfstate
    │   └── variables.tf
    ├── ebs
    │   ├── ebs_volume.tf
    │   ├── outputs.tf
    │   ├── provider.tf
    │   └── terraform.tfstate
    ├── ec2_instance
    │   ├── instance.tf
    │   ├── outputs.tf
    │   ├── provider.tf
    │   ├── terraform.tfstate
    │   └── variables.tf
    ├── eip
    │   ├── eip.tf
    │   ├── outputs.tf
    │   ├── provider.tf
    │   └── terraform.tfstate
    ├── igw
    │   ├── internet_gateway.tf
    │   ├── outputs.tf
    │   ├── provider.tf
    │   ├── terraform.tfstate
    │   └── variables.tf
    ├── nat
    │   ├── nat_gateway.tf
    │   ├── outputs.tf
    │   ├── provider.tf
    │   └── terraform.tfstate
    ├── rds
    │   ├── db_instance.tf
    │   ├── db_parameter_group.tf
    │   ├── db_subnet_group.tf
    │   ├── outputs.tf
    │   ├── provider.tf
    │   ├── terraform.tfstate
    │   └── variables.tf
    ├── route_table
    │   ├── main_route_table_association.tf
    │   ├── outputs.tf
    │   ├── provider.tf
    │   ├── route_table.tf
    │   ├── route_table_association.tf
    │   ├── terraform.tfstate
    │   └── variables.tf
    ├── sg
    │   ├── outputs.tf
    │   ├── provider.tf
    │   ├── security_group.tf
    │   ├── security_group_rule.tf
    │   ├── terraform.tfstate
    │   └── variables.tf
    ├── subnet
    │   ├── outputs.tf
    │   ├── provider.tf
    │   ├── subnet.tf
    │   ├── terraform.tfstate
    │   └── variables.tf
    ├── vpc
    │   ├── outputs.tf
    │   ├── provider.tf
    │   ├── terraform.tfstate
    │   └── vpc.tf
    └── vpc_peering
        ├── outputs.tf
        ├── provider.tf
        ├── terraform.tfstate
        └── vpc_peering_connection.tf

13 directories, 63 files

Please pay attention to the directory structure.
Terraformer imports in the default structure "{output}/{provider}/{service}/{resource}.tf".
This is also documented on GitHub.

Terraformer by default separates each resource into a file, which is put into a given service directory.

The default path for resource files is {output}/{provider}/{service}/{resource}.tf and can vary for each provider.

This structure presents the following problems:

  1. Because the tfstate is split per service, even a small change requires running apply multiple times
  2. All environments' resources are recorded in the same tfstate, so a change to one environment can affect the others

Ideally, I would like to split the tfstate by environment (develop, production, and manage), so that
all of an environment's resources are recorded in a single tfstate.
I investigated whether this was possible and found the following:

  1. You can explicitly specify the hierarchy using the --path-pattern option
  2. You can import only resources that have the tag specified in the --filter option

Combining these two should achieve it, so let's give it a try.

$ terraformer import aws --resources=alb,ec2_instance,eip,ebs,igw,nat,rds,route_table,sg,subnet,vpc,vpc_peering --regions=ap-northeast-1 --profile=beyond-poc-admin --path-pattern {output}/{provider}/develop/ --filter="Name=tags.Environment;Value=dev"
$ terraformer import aws --resources=alb,ec2_instance,eip,ebs,igw,nat,rds,route_table,sg,subnet,vpc,vpc_peering --regions=ap-northeast-1 --profile=beyond-poc-admin --path-pattern {output}/{provider}/production/ --filter="Name=tags.Environment;Value=prod"
$ terraformer import aws --resources=ec2_instance,eip,ebs,igw,route_table,sg,subnet,vpc,vpc_peering --regions=ap-northeast-1 --profile=beyond-poc-admin --path-pattern {output}/{provider}/manage/ --filter="Name=tags.Environment;Value=mng"

After the import is complete, the directory structure will look like this:

Directory Structure

$ tree
.
└── aws
    ├── develop
    │   ├── db_instance.tf
    │   ├── db_parameter_group.tf
    │   ├── db_subnet_group.tf
    │   ├── eip.tf
    │   ├── instance.tf
    │   ├── internet_gateway.tf
    │   ├── lb.tf
    │   ├── lb_target_group.tf
    │   ├── nat_gateway.tf
    │   ├── outputs.tf
    │   ├── provider.tf
    │   ├── route_table.tf
    │   ├── security_group.tf
    │   ├── subnet.tf
    │   ├── terraform.tfstate
    │   ├── variables.tf
    │   └── vpc.tf
    ├── manage
    │   ├── instance.tf
    │   ├── internet_gateway.tf
    │   ├── outputs.tf
    │   ├── provider.tf
    │   ├── route_table.tf
    │   ├── security_group.tf
    │   ├── subnet.tf
    │   ├── terraform.tfstate
    │   ├── variables.tf
    │   ├── vpc.tf
    │   └── vpc_peering_connection.tf
    └── production
        ├── db_instance.tf
        ├── db_parameter_group.tf
        ├── db_subnet_group.tf
        ├── eip.tf
        ├── instance.tf
        ├── internet_gateway.tf
        ├── lb.tf
        ├── lb_target_group.tf
        ├── nat_gateway.tf
        ├── outputs.tf
        ├── provider.tf
        ├── route_table.tf
        ├── security_group.tf
        ├── subnet.tf
        ├── terraform.tfstate
        ├── variables.tf
        └── vpc.tf

4 directories, 45 files

The resources are divided by environment.
If you take a look at vpc.tf under develop, you will see that only the VPC for the corresponding environment is imported.

develop/vpc.tf

resource "aws_vpc" "tfer--vpc-002D-0eea2bc99da0550a6" { assign_generated_ipv6_cidr_block = "false" cidr_block = "10.1.0.0/16" enable_classiclink = "false" enable_classiclink_dns_support = "false" enable_dns_hostnames = "true" enable_dns_support = "true" instance_tenancy = "default" tags = { Environment = "dev" Name = "vpc-terraformer-dev" } }

As for tfstate, all the resources for each environment are recorded in one file, so this doesn't seem to be a problem either.
The contents are long, so I won't go into detail here.

Concerns

Some resource ID values are hard-coded


There are several places where resource ID values are hard-coded, such as the vpc_id of the aws_security_group below.

resource "aws_security_group" "tfer--alb-002D-dev-002D-sg_sg-002D-00d3679a2f3309565" { description = "for ALB" egress { cidr_blocks = ["0.0.0.0/0"] description = "Outbound ALL" from_port = "0" protocol = "-1" self = "false" to_port = "0" } ingress { cidr_blocks = ["0.0.0.0/0"] description = "allow_http_for_alb" from_port = "80" protocol = "tcp" self = "false" to_port = "80" } name = "alb-dev-sg" tags = { Environment = "dev" Name = "alb-dev-sg" } vpc_id = "vpc-0eea2bc99da0550a6" }

When writing HCL by hand, you would reference the value dynamically, as in vpc_id = aws_vpc.vpc.id,
but it seems the tool cannot go that far when importing.
The relationship is already recorded in the tfstate, so you only need to modify the tf file.
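A sketch of that hand edit, reusing the resource name Terraformer generated for the VPC earlier (only the vpc_id line changes; the other generated attributes are elided):

```hcl
resource "aws_security_group" "tfer--alb-002D-dev-002D-sg_sg-002D-00d3679a2f3309565" {
  # ... generated attributes unchanged ...

  # Replace the literal ID with a reference to the aws_vpc resource
  # that Terraformer generated into the same state.
  vpc_id = aws_vpc.tfer--vpc-002D-0eea2bc99da0550a6.id
}
```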

The terraform_remote_state references are not updated for 0.12

There are some places where the HCL is written in Terraform 0.11 style, such as the vpc_id of the aws_subnet below.

resource "aws_subnet" "tfer--subnet-002D-02f90c599d4c887d3" { assign_ipv6_address_on_creation = "false" cidr_block = "10.1.2.0/24" map_public_ip_on_launch = "true" tags = { Environment = "dev" Name = "subnet-terraformer-dev-public-1c" } vpc_id = "${data.terraform_remote_state.local.outputs.aws_vpc_tfer--vpc-002D-0eea2bc99da0550a6_id}" }

If you apply it as-is on the 0.12 series, it will run, but a warning is generated.

Also, even when the --path-pattern option is used to separate directories by environment
and each environment's tfstate is output as a single file,
resources within the tf files are still referenced via terraform_remote_state.
GitHub states the following, so this is by design:

Connect between resources with terraform_remote_state (local and bucket).

In the case of the vpc_id above, aws_vpc and aws_subnet are recorded in the same tfstate, so
you can simply refer to it with vpc_id = aws_vpc.tfer--vpc-002D-0eea2bc99da0550a6.id.
You will need to fix this part yourself as well.
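Applied to the aws_subnet shown earlier, the corrected 0.12-style block would look like this (a sketch; only the vpc_id line changes):

```hcl
resource "aws_subnet" "tfer--subnet-002D-02f90c599d4c887d3" {
  assign_ipv6_address_on_creation = "false"
  cidr_block                      = "10.1.2.0/24"
  map_public_ip_on_launch         = "true"

  tags = {
    Environment = "dev"
    Name        = "subnet-terraformer-dev-public-1c"
  }

  # Direct reference within the same state, instead of terraform_remote_state
  vpc_id = aws_vpc.tfer--vpc-002D-0eea2bc99da0550a6.id
}
```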

Summary

What did you think?
It seems like importing existing infrastructure will be easier with terraformer.
There are a few concerns, but the fixes aren't too difficult, so I think they're within the acceptable range.
I've always found terraform import difficult, so
I really admire the Waze SRE team for creating a tool that solves that problem so well. I encourage
everyone to give it a try.


The person who wrote this article

Yuki Teraoka

He joined Beyond in 2016 and is now in his sixth year as an infrastructure engineer at an MSP,
where he troubleshoots issues while also designing and building infrastructure
on public clouds such as AWS.
Recently, he has been using HashiCorp tools such as Terraform and Packer
to automate the construction and operation of container infrastructure such as Docker and Kubernetes,
and he also serves as an evangelist, speaking at external study groups and seminars.

・GitHub
https://github.com/nezumisannn

・Speaking history
https://github.com/nezumisannn/my-profile

・Presentation materials (SpeakerDeck)
https://speakerdeck.com/nezumisannn

・Certification:
AWS Certified Solutions Architect - Associate
Google Cloud Professional Cloud Architect