Every time we provision a new set of cloud infrastructure, there is a purpose behind it. We do not create cloud infrastructure just for the sake of it. There are more actions performed on it to make it functional and useful.
For example, when we create an EC2 instance, we create it to accomplish certain tasks. Maybe the EC2 instance is responsible for executing heavy workloads, acts as a bastion host, or simply serves as the frontend for all incoming requests. Clearly, there are more actions to be performed on this instance – installing a web server, applications, and databases, setting up the network firewall, etc. – to enable it for its function.
Terraform is a great IaC tool that helps us build infrastructure using code. It is also possible to perform some of the above tasks when the EC2 instance is created or destroyed. Such tasks are performed using provisioners in Terraform. In this post, we will look at the scenarios handled by provisioners, how they are implemented, and which alternatives are preferable.
But before we go ahead, it is worth noting that using Terraform provisioners for the activities described in this post should be considered a last resort. The main reason is that there are dedicated tools and platforms available that align well with the use cases discussed in this post. HashiCorp suggests that Terraform provisioners should only be considered in those cases where we are left with no other option. More about this is described in the concluding section.
Provisioning mainly deals with configuration activities that happen after the resource is created. It may involve some file operations, executing CLI commands, or even executing a script. Once the resource is successfully initialized, it is ready to accept connections. These connections help Terraform log into the newly created instance and perform these operations.
The diagram below represents various types of provisioners you can implement using Terraform at various stages of provisioning.
In the entire plan-apply-destroy cycle of Terraform, provisioners are employed at various stages to accomplish certain tasks. The local-exec provisioner is the simplest provisioner, as it executes on the machine that hosts and executes Terraform commands. If Terraform is installed on the developer’s local machine, the local-exec provisioner runs on that same machine.
It is the simplest because, unlike the remote-exec and file provisioners, the local-exec provisioner does not need to connect to the newly created resources to perform its tasks. It executes commands or scripts on the host system and works on the data generated by the given Terraform configuration or data made available on the host machine.
As far as the target resources are concerned, we have to set up certain mechanisms to provide connection details to perform actions on the target machines. This is because the credentials used to log in to an EC2 instance are AWS key pairs (public and private keys) primarily.
We will take a look at these provisioners in detail in the next sections.
The local-exec provisioner works on the Terraform host – where Terraform configuration is applied/executed. It is used to execute any shell command, for example to set or read environment variables, record details about the resource which is created, or invoke any process or application.
If we ship any shell script along with the Terraform config, or if the shell scripts are already available on the host to be invoked, then local-exec provisioners are used to execute the same.
In the example below, we create an EC2 instance in AWS. It makes use of a local-exec provisioner to save the private_ip address of the instance which is created in a text file. This provisioner executes in the same working directory where terraform apply is run, once the provisioning is successful.
resource "aws_instance" "my_vm" {
  ami           = var.ami // Amazon Linux AMI
  instance_type = var.instance_type

  provisioner "local-exec" {
    command = "echo ${self.private_ip} >> private_ip.txt"
  }

  tags = {
    Name = var.name_tag,
  }
}
Once this configuration is applied successfully, we find a new file being created in the project directory.
The contents of the private_ip.txt file are as expected.
It is important to note that the command executes once the provisioning task is successful.
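The local-exec provisioner accepts a few more arguments besides command. The sketch below is a variation on the example above, not part of the original walkthrough: it passes instance attributes through the environment argument and sets working_dir and interpreter explicitly. The file name instances.txt and the choice of /bin/bash are assumptions made for illustration.

```hcl
resource "aws_instance" "my_vm" {
  ami           = var.ami
  instance_type = var.instance_type

  provisioner "local-exec" {
    # Values arrive as environment variables, so the shell expands them;
    # a "$" not followed by "{" is passed through by Terraform unchanged.
    command = "echo \"$INSTANCE_ID $PRIVATE_IP\" >> instances.txt"

    # Run the command from the module directory (assumed layout).
    working_dir = path.module

    # Assumes bash is available on the Terraform host.
    interpreter = ["/bin/bash", "-c"]

    environment = {
      INSTANCE_ID = self.id
      PRIVATE_IP  = self.private_ip
    }
  }

  tags = {
    Name = var.name_tag
  }
}
```

Passing values through environment rather than interpolating them into the command string keeps the command itself constant, which can make quoting easier to reason about.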
It is possible to specify when the provisioners should run. Terraform mainly performs two operations – apply and destroy. If we want to run the provisioner to handle some logic at creation time, then we use the creation-time provisioner. Similarly, if we want to handle the destroy-time scenario differently, we use the destroy-time provisioners.
The “when” attribute used in the provisioner block determines whether a provisioner is creation-time or destroy-time. By default, if the “when” attribute is not specified, the provisioner runs at creation time. In the example below, we create separate text files that contain event-specific messages for both create and destroy events.
resource "aws_instance" "my_vm" {
  ami           = var.ami // Amazon Linux AMI
  instance_type = var.instance_type

  provisioner "local-exec" {
    command = "echo 'Creation is successful.' >> creation.txt"
  }

  provisioner "local-exec" {
    when    = destroy
    command = "echo 'Destruction is successful.' >> destruction.txt"
  }

  tags = {
    Name = var.name_tag,
  }
}
Try to apply and destroy the above Terraform configuration. This should generate two text files in respective order of operations – creation.txt and destruction.txt – in the project directory as below.
With text messages as below.
Note: Artifacts generated using provisioners are not managed via the Terraform state file.
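One detail worth keeping in mind with destroy-time provisioners is that they may only refer to the resource itself (through self), count.index, or each.key. The sketch below illustrates this, and also sets on_failure = continue so that a failing command does not block the destroy. The log message and the decision to continue on failure are assumptions for illustration, not something the original example does.

```hcl
resource "aws_instance" "my_vm" {
  ami           = var.ami
  instance_type = var.instance_type

  provisioner "local-exec" {
    when = destroy

    # Assumption: a failed log write should not prevent the destroy.
    on_failure = continue

    # Only self, count.index and each.key may be referenced here;
    # references such as var.name_tag are rejected in destroy-time provisioners.
    command = "echo 'Destroyed instance ${self.id}' >> destruction.txt"
  }

  tags = {
    Name = var.name_tag
  }
}
```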
Before we proceed to the next sections, it is important to discuss the connection block. The file and remote-exec provisioners both operate on the target resource once it has been created. To enable Terraform to SSH into our Linux-based EC2 instance, we need a couple of things:
- AWS key pair
- Security group that allows inbound SSH (port 22), along with the HTTP (port 80) access that we use later for validation
Navigate to the AWS console and manually create a key pair and save the private key file locally – on the Terraform host. I have created the key pair and named it “tfsn”. The name of the key file downloaded locally on my machine is “tfsn.cer”.
This information is used by Terraform provisioners to SSH into the EC2 instance. Additionally, we would use this key pair to SSH into the EC2 instance ourselves for validation purposes.
In the Terraform configuration, add a new security group that allows HTTP traffic from the internet, so that we can reach the instance from a browser, and SSH traffic, which the provisioners require. We need the HTTP rule for validation when we discuss the remote-exec provisioner. Below is an example configuration of the security group in Terraform.
resource "aws_security_group" "http_access" {
  name        = "http_access"
  description = "Allow HTTP inbound traffic"

  ingress {
    description = "HTTP Access"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "SSH Access"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "http_access"
  }
}
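The configuration above opens SSH to the whole internet, which is convenient for a demo. A possible variant, shown here only as a sketch, restricts port 22 to a single CIDR block supplied through a variable. The variable name ssh_allowed_cidr is hypothetical and would replace, not accompany, the resource definition above.

```hcl
# Hypothetical variable: the CIDR block of the Terraform host, e.g. "203.0.113.10/32".
variable "ssh_allowed_cidr" {
  type        = string
  description = "CIDR block allowed to reach port 22"
}

resource "aws_security_group" "http_access" {
  name        = "http_access"
  description = "Allow HTTP from anywhere and SSH from one CIDR block"

  ingress {
    description = "HTTP Access"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "SSH Access"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.ssh_allowed_cidr]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "http_access"
  }
}
```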
Note: In the sections that follow, the code snippets may not contain the security group configuration (like variables and provider), but it is assumed to be present.
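For readers following along, the supporting configuration that the snippets assume might look roughly like the sketch below. The variable names ami, instance_type and name_tag and the output names instance_id and public_ip are taken from the snippets and console output in this post; the region variable, its default, and the other defaults are assumptions.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = var.region
}

variable "region" {
  type    = string
  default = "eu-central-1" # assumption; use your own region
}

variable "ami" {
  type        = string
  description = "ID of an Amazon Linux AMI available in the chosen region"
}

variable "instance_type" {
  type    = string
  default = "t2.micro"
}

variable "name_tag" {
  type    = string
  default = "my-ec2-instance" # hypothetical default
}

output "instance_id" {
  value = aws_instance.my_vm.id
}

output "public_ip" {
  value = aws_instance.my_vm.public_ip
}
```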
The file provisioner is a way to copy certain files or artifacts from the host machine to target resources that will be created in the future. This is a very handy way to transport certain script files, configuration files, artifacts like .jar files, binaries, etc. when the target resource is created and boots for the first time.
To demonstrate this, we have a file named “letsdotech.txt” which we would like to copy into the home directory of the target EC2 instance. The project directory currently looks like the below. “tfsn.cer” is the private key file we created in the previous section for enabling the Terraform provisioner to SSH into the EC2 instance.
Note: It is recommended to use better mechanisms to manage key files. Making the key file a part of the shared git repository is highly discouraged.
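One simple way to keep the key file out of the repository, sketched here under assumptions, is to store it outside the project directory and pass its location through a variable. The variable name private_key_path and the default path are hypothetical; pathexpand and file are built-in Terraform functions.

```hcl
# Hypothetical variable holding the location of the private key
# outside the project directory.
variable "private_key_path" {
  type        = string
  description = "Path to the private key used for SSH"
  default     = "~/.ssh/tfsn.cer"
}

resource "aws_instance" "my_vm" {
  ami             = var.ami
  instance_type   = var.instance_type
  key_name        = "tfsn"
  security_groups = [aws_security_group.http_access.name]

  connection {
    type = "ssh"
    host = self.public_ip
    user = "ec2-user"

    # pathexpand resolves the leading "~"; file reads the key contents.
    private_key = file(pathexpand(var.private_key_path))
    timeout     = "4m"
  }

  tags = {
    Name = var.name_tag
  }
}
```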
Terraform configuration for the EC2 instance along with file provisioner looks like below. Various attributes are described in the table that follows.
resource "aws_instance" "my_vm" {
  ami             = var.ami // Amazon Linux AMI
  instance_type   = var.instance_type
  key_name        = "tfsn"
  security_groups = [aws_security_group.http_access.name]

  provisioner "file" {
    source      = "./letsdotech.txt"
    destination = "/home/ec2-user/letsdotech.txt"
  }

  connection {
    type        = "ssh"
    host        = self.public_ip
    user        = "ec2-user"
    private_key = file("./tfsn.cer")
    timeout     = "4m"
  }

  tags = {
    Name = var.name_tag,
  }
}
Attribute | Description |
ami | The Amazon Linux AMI. |
instance_type | The size of the instance we need. It is currently set to “t2.micro”. |
key_name | Name of the key pair as created in the AWS console in the previous section. |
security_groups | Reference to the Security Group name as created in the previous section. |
provisioner | The file provisioner block contains information about the source and destination. source – a path to the file on the Terraform host; destination – the path on the target EC2 instance where the source file should be copied. |
connection | The connection block used by the file provisioner to SSH into the EC2 instance to copy the file. type – specifies the protocol, i.e. SSH; host – specifies the public IP address of the EC2 instance that will be created; user – Amazon Linux AMIs have ec2-user as the default user; private_key – the contents of the private key file tfsn.cer stored locally, read using the file function; timeout – 4 minutes, which is how long Terraform waits for the connection to become available. If it cannot connect within 4 minutes, it throws an error. |
tags | To name our EC2 instance. |
When the above configuration is applied, it creates the EC2 instance and we can verify the same in the AWS console. Once the instance is created, the file provisioner copies the text file to the destination path. We can verify the same from the Terraform output after apply.
---------------------Console output--------------------
.
.
.
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
aws_security_group.http_access: Creating...
aws_security_group.http_access: Creation complete after 2s [id=sg-0e0d0c032fed5bd1a]
aws_instance.my_vm: Creating...
aws_instance.my_vm: Still creating... [10s elapsed]
aws_instance.my_vm: Still creating... [20s elapsed]
aws_instance.my_vm: Still creating... [30s elapsed]
aws_instance.my_vm: Provisioning with 'file'...
aws_instance.my_vm: Creation complete after 35s [id=i-0ef38d2558210f5c9]
Apply complete! Resources: 2 added, 0 changed, 0 destroyed.
Outputs:
instance_id = "i-0ef38d2558210f5c9"
public_ip = "3.74.154.19"
------------------------------------------------
Also, let us SSH into the EC2 instance and check if the file exists and the contents of the file.
---------------------------------------------------
[ec2-user@ip-172-31-42-133 ~]$ ls
letsdotech.txt
[ec2-user@ip-172-31-42-133 ~]$ cat letsdotech.txt
Hello, this is Sumeet!
[ec2-user@ip-172-31-42-133 ~]$
---------------------------------------------------
This is as expected. Thus we have successfully used the file provisioner to copy a file from the local machine/Terraform host machine to the newly created EC2 instance. Every time we recreate the EC2 instance using the above configuration, the text file letsdotech.txt would always be made available, thanks to the file provisioner.
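Besides source, the file provisioner also accepts a content argument, which writes a string directly to the destination instead of copying a local file. The sketch below assumes the same key pair, security group and connection settings as above; the destination file name instance-info.txt is hypothetical.

```hcl
resource "aws_instance" "my_vm" {
  ami             = var.ami
  instance_type   = var.instance_type
  key_name        = "tfsn"
  security_groups = [aws_security_group.http_access.name]

  provisioner "file" {
    # content and source are alternatives; only one of them is set.
    content     = "Instance type: ${self.instance_type}\n"
    destination = "/home/ec2-user/instance-info.txt"
  }

  connection {
    type        = "ssh"
    host        = self.public_ip
    user        = "ec2-user"
    private_key = file("./tfsn.cer")
    timeout     = "4m"
  }

  tags = {
    Name = var.name_tag
  }
}
```

This can be handy when the file contents depend on values that are only known once the resource exists.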
The remote-exec provisioners are similar to local-exec provisioners, except that the commands are executed on the target EC2 instance instead of the Terraform host. This is accomplished by using the same connection block that is used by the file provisioner.
We use a remote-exec provisioner to run a single command or multiple commands. The example below performs a simple task on the EC2 instance that is created by Terraform. Once the EC2 instance creation is successful, Terraform’s remote-exec provisioner logs in to the instance via SSH and executes the commands specified in the inline attribute array.
resource "aws_instance" "my_vm" {
  ami             = var.ami // Amazon Linux AMI
  instance_type   = var.instance_type
  key_name        = "tfsn"
  security_groups = [aws_security_group.http_access.name]

  provisioner "remote-exec" {
    inline = [
      "touch hello.txt",
      "echo 'Have a great day!' >> hello.txt"
    ]
  }

  connection {
    type        = "ssh"
    host        = self.public_ip
    user        = "ec2-user"
    private_key = file("./tfsn.cer")
    timeout     = "4m"
  }

  tags = {
    Name = var.name_tag,
  }
}
When we apply the above configuration, we can observe in the Terraform output that first the EC2 instance was created, then the remote-exec provisioner used the connection details to SSH into the instance, performed the tasks, and logged out.
----------------------------------Console output-----------------------------
.
.
.
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
aws_security_group.http_access: Creating...
aws_security_group.http_access: Creation complete after 2s [id=sg-012859a7c2879af9b]
aws_instance.my_vm: Creating...
aws_instance.my_vm: Still creating... [10s elapsed]
aws_instance.my_vm: Still creating... [20s elapsed]
aws_instance.my_vm: Still creating... [30s elapsed]
aws_instance.my_vm: Provisioning with 'remote-exec'...
aws_instance.my_vm (remote-exec): Connecting to remote host via SSH...
aws_instance.my_vm (remote-exec): Host: 3.122.228.94
aws_instance.my_vm (remote-exec): User: ec2-user
aws_instance.my_vm (remote-exec): Password: false
aws_instance.my_vm (remote-exec): Private key: true
aws_instance.my_vm (remote-exec): Certificate: false
aws_instance.my_vm (remote-exec): SSH Agent: true
aws_instance.my_vm (remote-exec): Checking Host Key: false
aws_instance.my_vm (remote-exec): Target Platform: unix
aws_instance.my_vm (remote-exec): Connected!
aws_instance.my_vm: Creation complete after 32s [id=i-046efff75f6b96dd9]
Apply complete! Resources: 2 added, 0 changed, 0 destroyed.
Outputs:
instance_id = "i-046efff75f6b96dd9"
public_ip = "3.122.228.94"
--------------------------------------------------------------------
As a result, when we log into the same EC2 instance, we should have a file named “hello.txt” with a message “Have a great day!” in its contents. Let us verify the same.
--------------------------------------------------------------------------------------------------------------------------------------
[ec2-user@ip-172-31-37-69 ~]$ ls
hello.txt
[ec2-user@ip-172-31-37-69 ~]$ cat hello.txt
Have a great day!
[ec2-user@ip-172-31-37-69 ~]$
--------------------------------------------------------------------------------------------------------------------------------------
In this section, we use Terraform provisioners to install the Nginx web server. A successful installation requires a few commands to download, install, and configure Nginx correctly.
Instead of supplying these commands in an inline array attribute, we wrap them in a shell file and execute that shell file. This requires us to use the file provisioner to first transport the shell file to the target EC2 instance and then use the remote-exec provisioner to call it.
Installing the Nginx web server is a relatively simple task, with few commands to execute. However, it gives us an idea of how complex tasks may be performed in a real-world scenario.
To prepare for our example, we first create the shell file named installnginx.sh, with the below contents. It simply updates the installed packages, installs Nginx, enables the Nginx service, and starts the server.
#!/bin/bash
sudo yum update -y
sudo amazon-linux-extras install nginx1 -y
sudo systemctl enable nginx
sudo systemctl start nginx
The project folder currently has the below files.
Modify the EC2 configuration as below. Here, we have specified the file provisioner as discussed above. The inline commands modify the file permissions of installnginx.sh file, and then execute the same.
resource "aws_instance" "my_vm" {
  ami             = var.ami // Amazon Linux AMI
  instance_type   = var.instance_type
  key_name        = "tfsn"
  security_groups = [aws_security_group.http_access.name]

  // Copy the install script to the instance first.
  provisioner "file" {
    source      = "./installnginx.sh"
    destination = "/home/ec2-user/installnginx.sh"
  }

  // Make the script executable, then run it.
  provisioner "remote-exec" {
    inline = [
      "chmod +x ./installnginx.sh",
      "./installnginx.sh"
    ]
  }

  connection {
    type        = "ssh"
    host        = self.public_ip
    user        = "ec2-user"
    private_key = file("./tfsn.cer")
    timeout     = "4m"
  }

  tags = {
    Name = var.name_tag,
  }
}
Apply the above configuration and observe the Terraform output.
----------------------------------------------------------------------------
.
.
.
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
aws_security_group.http_access: Creating...
aws_security_group.http_access: Creation complete after 2s [id=sg-09d11628094c8a942]
aws_instance.my_vm: Creating...
aws_instance.my_vm: Still creating... [10s elapsed]
aws_instance.my_vm: Still creating... [20s elapsed]
aws_instance.my_vm: Still creating... [30s elapsed]
aws_instance.my_vm: Provisioning with 'file'...
aws_instance.my_vm: Provisioning with 'remote-exec'...
aws_instance.my_vm (remote-exec): Connecting to remote host via SSH...
aws_instance.my_vm (remote-exec): Host: 3.122.254.50
aws_instance.my_vm (remote-exec): User: ec2-user
aws_instance.my_vm (remote-exec): Password: false
aws_instance.my_vm (remote-exec): Private key: true
aws_instance.my_vm (remote-exec): Certificate: false
aws_instance.my_vm (remote-exec): SSH Agent: true
aws_instance.my_vm (remote-exec): Checking Host Key: false
aws_instance.my_vm (remote-exec): Target Platform: unix
aws_instance.my_vm (remote-exec): Connected!
aws_instance.my_vm (remote-exec): Starting install
aws_instance.my_vm (remote-exec): Loaded plugins: extras_suggestions,
aws_instance.my_vm (remote-exec): : langpacks, priorities,
aws_instance.my_vm (remote-exec): : update-motd
aws_instance.my_vm (remote-exec): Existing lock /var/run/yum.pid: another copy is running as pid 3202.
aws_instance.my_vm (remote-exec): Another app is currently holding the yum lock; waiting for it to exit...
aws_instance.my_vm (remote-exec): The other application is: yum
aws_instance.my_vm (remote-exec): Memory : 79 M RSS (370 MB VSZ)
aws_instance.my_vm (remote-exec): Started: Mon Aug 15 17:23:51 2022 - 00:03 ago
aws_instance.my_vm (remote-exec): State : Running, pid: 3202
aws_instance.my_vm (remote-exec): Another app is currently holding the yum lock; waiting for it to exit...
aws_instance.my_vm (remote-exec): The other application is: yum
aws_instance.my_vm (remote-exec): Memory : 89 M RSS (381 MB VSZ)
aws_instance.my_vm (remote-exec): Started: Mon Aug 15 17:23:51 2022 - 00:05 ago
aws_instance.my_vm (remote-exec): State : Running, pid: 3202
aws_instance.my_vm: Still creating... [40s elapsed]
aws_instance.my_vm (remote-exec): Another app is currently holding the yum lock; waiting for it to exit...
aws_instance.my_vm (remote-exec): The other application is: yum
aws_instance.my_vm (remote-exec): Memory : 143 M RSS (435 MB VSZ)
aws_instance.my_vm (remote-exec): Started: Mon Aug 15 17:23:51 2022 - 00:07 ago
aws_instance.my_vm (remote-exec): State : Running, pid: 3202
aws_instance.my_vm (remote-exec): Another app is currently holding the yum lock; waiting for it to exit...
aws_instance.my_vm (remote-exec): The other application is: yum
aws_instance.my_vm (remote-exec): Memory : 167 M RSS (459 MB VSZ)
aws_instance.my_vm (remote-exec): Started: Mon Aug 15 17:23:51 2022 - 00:09 ago
aws_instance.my_vm (remote-exec): State : Running, pid: 3202
aws_instance.my_vm (remote-exec): Resolving Dependencies
aws_instance.my_vm (remote-exec): --> Running transaction check
aws_instance.my_vm (remote-exec): ---> Package ec2-instance-connect.noarch 0:1.1-15.amzn2 will be updated
aws_instance.my_vm (remote-exec): ---> Package ec2-instance-connect.noarch 0:1.1-19.amzn2 will be an update
aws_instance.my_vm (remote-exec): --> Processing Dependency: ec2-instance-connect-selinux for package: ec2-instance-connect-1.1-19.amzn2.noarch
aws_instance.my_vm (remote-exec): ---> Package ec2-net-utils.noarch 0:1.6.1-2.amzn2 will be updated
aws_instance.my_vm (remote-exec): ---> Package ec2-net-utils.noarch 0:1.7.0-1.amzn2 will be an update
aws_instance.my_vm (remote-exec): ---> Package glibc.x86_64 0:2.26-59.amzn2 will be updated
aws_instance.my_vm (remote-exec): ---> Package glibc.x86_64 0:2.26-60.amzn2 will be an update
.
.
.
(lengthy installation logs)
.
.
.
aws_instance.my_vm (remote-exec): 60 mock2 available [ =stable ]
aws_instance.my_vm (remote-exec): 61 dnsmasq2.85 available [ =stable ]
aws_instance.my_vm (remote-exec): 62 kernel-5.15 available [ =stable ]
aws_instance.my_vm (remote-exec): 63 postgresql14 available [ =stable ]
aws_instance.my_vm (remote-exec): 64 firefox available [ =stable ]
aws_instance.my_vm (remote-exec): Created symlink from /etc/systemd/system/multi-user.target.wants/nginx.service to /usr/lib/systemd/system/nginx.service.
aws_instance.my_vm (remote-exec): Ending install
aws_instance.my_vm: Creation complete after 1m29s [id=i-04da69e10454dcd96]
Apply complete! Resources: 2 added, 0 changed, 0 destroyed.
Outputs:
instance_id = "i-04da69e10454dcd96"
public_ip = "3.122.254.50"
---------------------------------------------------------------------------------
The Terraform output indicates the sequence of activities it performed to provision this resource and install Nginx. The activities can be summarized as follows.
- Created security group
- Created EC2 instance
- Executed file provisioner, which copied the file to the target instance
- Executed remote-exec provisioner, which executed the installnginx.sh file to install Nginx
To verify that Nginx was installed successfully, open up the browser and access the home page using the public IP address displayed in the output. If you are able to see the Nginx landing page, it means we have successfully used the file and remote-exec provisioners to install it.
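As an alternative to pairing a file provisioner with inline commands, the remote-exec provisioner also has a script argument, which takes a path to a local script, copies it to the remote resource and executes it there. The sketch below is a variation on the configuration above under the same assumptions (key pair tfsn, the http_access security group, and installnginx.sh in the project directory); it is not the approach used in the walkthrough.

```hcl
resource "aws_instance" "my_vm" {
  ami             = var.ami
  instance_type   = var.instance_type
  key_name        = "tfsn"
  security_groups = [aws_security_group.http_access.name]

  provisioner "remote-exec" {
    # The local script is uploaded and then executed on the instance,
    # so no separate file provisioner or chmod step is written here.
    script = "./installnginx.sh"
  }

  connection {
    type        = "ssh"
    host        = self.public_ip
    user        = "ec2-user"
    private_key = file("./tfsn.cer")
    timeout     = "4m"
  }

  tags = {
    Name = var.name_tag
  }
}
```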
Provisioners are great. But there are some limitations which we should consider before using them. As mentioned in the Terraform documentation, provisioners should be used as the last resort to achieve any kind of configuration management tasks possible with them.
Because provisioners enable us to execute any command on the target resource, they carry a lot of power and responsibility. They open up a huge scope for activities that can be performed on the OS and application layer, yet there is no tracking or accountability for these actions.
If, for some reason, the provisioner tasks fail to run on a few machines, this increases the overhead of identifying those machines and deploying a workaround. Understanding why a particular provisioner did not work on a set of machines can be very difficult, simply because there are several factors that are potentially unique to each resource.
In a way, provisioners extend into the space of configuration management software but with low confidence. It is recommended to rely on software built for configuration management – like Chef, Puppet, Ansible, etc. – for such tasks. These tools have better control over configuration management, credential management, and better security standards.
For additional support, check out Spacelift, a sophisticated and compliant infrastructure delivery platform that makes Terraform management easy. It brings with it a GitOps flow, so your infrastructure repository is synced with your Terraform Stacks, and pull requests show you a preview of what they’re planning to change. It also has an extensive selection of policies, which lets you automate compliance checks and build complex multi-stack workflows.
Similarly, to pass the data into the target resource, prefer to use the cloud-native way to achieve the same. For example, while provisioning an AWS EC2 instance, the user_data attribute can be used to pass certain scripting data to the instances. These mechanisms depend on cloud-init software – which has become an industry standard – that takes care of the initialization process when the instance boots.
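As a rough sketch of this approach, the Nginx installation from the earlier section could be moved into user_data, which removes the need for a key pair and connection block on the Terraform side. The commands are the same ones used in installnginx.sh; sudo is dropped on the assumption that user data scripts run as root on Amazon Linux.

```hcl
resource "aws_instance" "my_vm" {
  ami             = var.ami
  instance_type   = var.instance_type
  security_groups = [aws_security_group.http_access.name]

  # The script runs on first boot through cloud-init.
  # "<<-" strips the leading indentation, so the shebang starts at column 0.
  user_data = <<-EOF
    #!/bin/bash
    yum update -y
    amazon-linux-extras install nginx1 -y
    systemctl enable nginx
    systemctl start nginx
  EOF

  tags = {
    Name = var.name_tag
  }
}
```

Note that terraform apply returns once the instance is created, without waiting for the user data script to finish, so Nginx may take a little longer to become reachable.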
Provisioners have the most influence on the resource during the creation process, since the scripts which are run during the instance boot process play a key role in the lifetime of the resource which has just started. If there are certain data, applications, patches, etc. that can be pre-configured into a machine image, then prefer creating custom AMIs over using provisioners.
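If such a custom image already exists in the account, Terraform can look it up with the aws_ami data source rather than hard-coding its ID. The sketch below assumes the image was built separately and named with a prefix such as nginx-baked-; that naming pattern and the data source label are hypothetical.

```hcl
# Look up the most recent image owned by this account whose name
# matches the (hypothetical) pattern "nginx-baked-*".
data "aws_ami" "nginx_baked" {
  most_recent = true
  owners      = ["self"]

  filter {
    name   = "name"
    values = ["nginx-baked-*"]
  }
}

resource "aws_instance" "my_vm" {
  ami             = data.aws_ami.nginx_baked.id
  instance_type   = var.instance_type
  security_groups = [aws_security_group.http_access.name]

  tags = {
    Name = var.name_tag
  }
}
```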