Terraform Taint, Untaint, Replace – How to Use It (Examples)

Over time, resources deployed with Terraform undergo configuration changes. The causes may lie in the application layer or in the processing logic of various automation tools and processes. This raises the likelihood that resources are modified in undesired ways.

In such situations, it is often desirable to replace these cloud resources with new instances to avoid failures resulting from misconfiguration. Since Terraform manages the cloud resource lifecycle, it provides a way to do this from the CLI: the taint and untaint commands, and the -replace option.

In this post, we explore how taint, untaint, and -replace help address these issues, and which of them is the preferred approach, with the help of an example.

Creation of EC2 Instances

In the example below, we create two EC2 instances using Terraform. The resources are defined in main.tf.

resource "aws_instance" "my_vm_1" {
  ami           = var.ami // Ubuntu AMI
  instance_type = var.instance_type

  tags = {
    Name = "VM 1",
  }
}

resource "aws_instance" "my_vm_2" {
  ami           = var.ami // Ubuntu AMI
  instance_type = var.instance_type

  tags = {
    Name = "VM 2",
  }
}

[Diagram: Create EC2 instances]

The diagram above represents the process of creating these resources, which can be summarized as follows.

  1. When we apply the configuration, Terraform compares it with the state file stored in the remote backend.
  2. Since these EC2 instances do not exist yet, it creates them in AWS and tags them with the names “VM 1” and “VM 2” respectively.
  3. After the creation of these instances, Terraform updates the state file with the information about these two instances.
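
The state update in step 3 can be checked from the CLI. The sketch below wraps terraform state list, which prints the addresses of the resources recorded in state, in a small helper. The function name list_instances_in_state is hypothetical and not part of Terraform; the filter simply keeps the aws_instance addresses used in this example.

```shell
# Hypothetical helper (not a Terraform command): print the aws_instance
# addresses that Terraform has recorded in its state file.
list_instances_in_state() {
  # "terraform state list" prints one resource address per line;
  # grep keeps only the lines that start with "aws_instance.".
  terraform state list | grep '^aws_instance\.'
}
```

After a successful apply of the configuration above, calling list_instances_in_state should include aws_instance.my_vm_1 and aws_instance.my_vm_2.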

Terraform Taint

Terraform maintains a state file that contains information about the real-world resources it manages. This is a crucial piece of information, as every Terraform operation depends on this file for coordination.

As established earlier, when a resource becomes misconfigured or corrupt, it is desirable to replace it with a new instance. The taint command marks the corresponding resource as “tainted” in the state file so that, in the next apply cycle, Terraform replaces that resource.

Note: The taint command has been deprecated since Terraform version 0.15.2. If you are using an older version, continue using taint and untaint. Otherwise, it is recommended to use the -replace option discussed later in this post.

Let us assume our development teams and various business users are using these VMs for their own purposes. The operations team makes certain changes to these instances, and as a result VM 1 starts to behave in undesirable ways.

The reasons could be anything. The configuration changes might have exposed VM 1 to security vulnerabilities, or certain processes might be running so slowly that this EC2 instance handles fewer requests than its peers.

In this case, instead of recreating the entire infrastructure, it is easier to replace only VM 1, which is causing the issue. We can do this with Terraform’s taint command. By marking a specific resource as tainted, we record in the state file that this resource is faulty.

To mark the resource as tainted, run the command below in the terminal. The taint command takes the resource address as its argument, which tells Terraform which resource to mark as tainted.

sumeetninawe@Sumeets-MacBook-Pro tftaintuntaintreplace % terraform taint aws_instance.my_vm_1
Resource instance aws_instance.my_vm_1 has been marked as tainted.

The output above confirms that VM 1 has been marked as tainted and will be replaced in the next apply cycle. To verify this, run the terraform plan command and observe the output.

sumeetninawe@Sumeets-MacBook-Pro tftaintuntaintreplace % terraform plan
aws_instance.my_vm_2: Refreshing state... [id=i-023541a5c6d3b18cb]
aws_instance.my_vm_1: Refreshing state... [id=i-026cde214ed10b112]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
-/+ destroy and then create replacement

Terraform will perform the following actions:

  # aws_instance.my_vm_1 is tainted, so must be replaced
-/+ resource "aws_instance" "my_vm_1" {
      ~ arn                                  = "arn:aws:ec2:eu-central-1:532199187081:instance/i-026cde214ed10b112" -> (known after apply)
      ~ associate_public_ip_address          = true -> (known after apply)
      ~ availability_zone                    = "eu-central-1b" -> (known after apply)
      ~ cpu_core_count                       = 1 -> (known after apply)
      ~ cpu_threads_per_core                 = 1 -> (known after apply)
.
.
.
.
          ~ iops                  = 100 -> (known after apply)
          + kms_key_id            = (known after apply)
          ~ tags                  = {} -> (known after apply)
          ~ throughput            = 0 -> (known after apply)
          ~ volume_id             = "vol-089a00090663cfe71" -> (known after apply)
          ~ volume_size           = 8 -> (known after apply)
          ~ volume_type           = "gp2" -> (known after apply)
        }
    }

Plan: 1 to add, 0 to change, 1 to destroy.

At the beginning of the output, we can see that Terraform has registered aws_instance.my_vm_1 as tainted and plans to replace it. At this point, the change is only reflected in the state file, and no action has been performed in AWS.

Run terraform apply now, and see what happens to VM 1.

[Screenshot: instances - terraform taint]

If we look at the AWS EC2 console, we can see that one EC2 instance named VM 1 is terminated, and another one with the same name is being created. This process is represented in the diagram below.

[Diagram: VM 1 is terminated]

  1. The developer marks the resource as tainted using the taint command.
  2. Terraform updates the state file accordingly. This is visible in the plan output above.
  3. The developer applies the same configuration by running terraform apply.
  4. This destroys the tainted VM 1.
  5. This is followed by the creation of a new instance of VM 1.
  6. Terraform updates the state file by removing the older VM 1 object and replacing it with the new VM 1 EC2 instance.

Terraform Untaint

Marking a resource as tainted does not mandate its replacement. If marking a resource as tainted was a mistake, it is possible to “untaint” it using the untaint command. This avoids the replacement of the resource in the next apply execution.

In the current example, let us mark VM 2 as tainted to demonstrate this. Run the command below in the terminal.

sumeetninawe@Sumeets-MacBook-Pro tftaintuntaintreplace % terraform taint aws_instance.my_vm_2
Resource instance aws_instance.my_vm_2 has been marked as tainted.

To confirm that VM 2 is marked as tainted, run the terraform plan command and observe the output below.

sumeetninawe@Sumeets-MacBook-Pro tftaintuntaintreplace % terraform plan
aws_instance.my_vm_1: Refreshing state... [id=i-00192c5d57ccb9471]
aws_instance.my_vm_2: Refreshing state... [id=i-023541a5c6d3b18cb]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
-/+ destroy and then create replacement

Terraform will perform the following actions:

  # aws_instance.my_vm_2 is tainted, so must be replaced
-/+ resource "aws_instance" "my_vm_2" {
      ~ arn                                  = "arn:aws:ec2:eu-central-1:532199187081:instance/i-023541a5c6d3b18cb" -> (known after apply)
      ~ associate_public_ip_address          = true -> (known after apply)
      ~ availability_zone                    = "eu-central-1b" -> (known after apply)
      ~ cpu_core_count                       = 1 -> (known after apply)
      ~ cpu_threads_per_core                 = 1 -> (known after apply)
      ~ disable_api_termination              = false -> (known after apply)
.
.
.
.
          ~ throughput            = 0 -> (known after apply)
          ~ volume_id             = "vol-0fd65c8e61226dace" -> (known after apply)
          ~ volume_size           = 8 -> (known after apply)
          ~ volume_type           = "gp2" -> (known after apply)
        }
    }

Plan: 1 to add, 0 to change, 1 to destroy.

The EC2 instance named VM 2 is now tainted. If this was a mistake, then we can revert this by using the untaint command as below.

sumeetninawe@Sumeets-MacBook-Pro tftaintuntaintreplace % terraform untaint aws_instance.my_vm_2
Resource instance aws_instance.my_vm_2 has been successfully untainted.

Like taint, the untaint command takes a resource address as its argument. Confirm the result by running the terraform plan command and making sure the plan no longer proposes a replacement.

sumeetninawe@Sumeets-MacBook-Pro tftaintuntaintreplace % terraform plan
aws_instance.my_vm_2: Refreshing state... [id=i-023541a5c6d3b18cb]
aws_instance.my_vm_1: Refreshing state... [id=i-00192c5d57ccb9471]

No changes. Your infrastructure matches the configuration.

Terraform has compared your real infrastructure against your configuration and found no differences, so no changes are needed.

Drawback

A tainted resource stays tainted until someone explicitly runs the apply command, and the Terraform workflow is vulnerable during this period.

As we have seen in the example above, marking a resource as tainted with the taint command modifies the state file but does not apply the change. In a shared development environment, if another developer generates and executes a plan during this time, the target EC2 instance will be replaced.

This is an unintentional change. The resource was marked as tainted by one developer but was replaced by an apply command executed by another developer.

In our example, we are dealing with only a couple of EC2 instances. Changes made this way are still noticeable and can be avoided. But in real-world implementations, where the entire infrastructure is developed by multiple teams, the risk of such unintentional replacements increases.
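
One way to reduce this risk is to check the plan for leftover taints before applying in a shared workspace. The sketch below is a minimal illustration, not an official Terraform feature: the helper name no_pending_taints is hypothetical, and it assumes the plan output contains the phrase "is tainted, so must be replaced" as in the plan output shown earlier, which may differ between Terraform versions.

```shell
# Hypothetical helper: returns 0 when the plan reports no tainted resources,
# 1 when at least one tainted resource is found, and 2 if the plan itself fails.
no_pending_taints() {
  # -no-color keeps escape sequences out of the text we search.
  plan_output="$(terraform plan -no-color)" || return 2

  # Assumption: the wording matches the plan output shown earlier, e.g.
  # "# aws_instance.my_vm_1 is tainted, so must be replaced".
  if printf '%s\n' "$plan_output" | grep 'is tainted, so must be replaced'; then
    echo "Tainted resources found; review them before applying." >&2
    return 1
  fi
  return 0
}
```

A developer could then run no_pending_taints && terraform apply so that the apply is skipped whenever someone else's taint is still pending.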

Starting with Terraform version 0.15.2, the taint command is deprecated, and it is suggested to use the apply command with the -replace flag instead. Let us discuss this in the next section.

Replace (-replace) - A Preferred Alternative

Replace, as the name suggests, replaces the specified resource. It is a flag (-replace) used with the apply command and is the suggested way to recreate specific resources.

Using -replace eliminates the gap between marking the resource as tainted and applying that change, and thus avoids the drawback discussed in the previous section.

As mentioned earlier, -replace is a flag passed to the apply command. Its value is the address of the resource to replace, and Terraform recreates that resource from the existing configuration in the same Terraform code. Let us replace VM 2 in our example by running the command below.

sumeetninawe@Sumeets-MacBook-Pro tftaintuntaintreplace % terraform apply -replace="aws_instance.my_vm_2"
aws_instance.my_vm_2: Refreshing state... [id=i-023541a5c6d3b18cb]
aws_instance.my_vm_1: Refreshing state... [id=i-00192c5d57ccb9471]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
-/+ destroy and then create replacement

Terraform will perform the following actions:

  # aws_instance.my_vm_2 will be replaced, as requested
-/+ resource "aws_instance" "my_vm_2" {
      ~ arn                                  = "arn:aws:ec2:eu-central-1:532199187081:instance/i-023541a5c6d3b18cb" -> (known after apply)
      ~ associate_public_ip_address          = true -> (known after apply)
      ~ availability_zone                    = "eu-central-1b" -> (known after apply)
      ~ cpu_core_count                       = 1 -> (known after apply)
      ~ cpu_threads_per_core                 = 1 -> (known after apply)
      ~ disable_api_termination              = false -> (known after apply)
      ~ ebs_optimized                        = false -> (known after apply)
      - hibernation                          = false -> null
      + host_id                              = (known after apply)
      ~ id                                   = "i-023541a5c6d3b18cb" -> (known after apply)

.
.
.
.
          ~ throughput            = 0 -> (known after apply)
          ~ volume_id             = "vol-0fd65c8e61226dace" -> (known after apply)
          ~ volume_size           = 8 -> (known after apply)
          ~ volume_type           = "gp2" -> (known after apply)
        }
    }

Plan: 1 to add, 0 to change, 1 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

aws_instance.my_vm_2: Destroying... [id=i-023541a5c6d3b18cb]
aws_instance.my_vm_2: Still destroying... [id=i-023541a5c6d3b18cb, 10s elapsed]
aws_instance.my_vm_2: Still destroying... [id=i-023541a5c6d3b18cb, 20s elapsed]
aws_instance.my_vm_2: Destruction complete after 30s
aws_instance.my_vm_2: Creating...
aws_instance.my_vm_2: Still creating... [10s elapsed]
aws_instance.my_vm_2: Still creating... [20s elapsed]
aws_instance.my_vm_2: Still creating... [30s elapsed]
aws_instance.my_vm_2: Creation complete after 31s [id=i-0099a439074ce7bf4]

Apply complete! Resources: 1 added, 0 changed, 1 destroyed.
sumeetninawe@Sumeets-MacBook-Pro tftaintuntaintreplace %

The output above confirms that Terraform replaces the target resource in a single step.
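
The -replace flag can also be passed to terraform plan and can be repeated to replace several resources at once. The sketch below combines both: it saves the replacement plan to a file and then applies exactly that plan. The helper name replace_resources and the file name replace.tfplan are assumptions made for illustration only.

```shell
# Hypothetical helper: plan the replacement of one or more resources,
# save the plan to a file, and apply exactly that saved plan.
replace_resources() {
  # Turn every address argument into a "-replace=ADDRESS" flag by
  # appending the flag and dropping the original argument, one at a time.
  n=$#
  while [ "$n" -gt 0 ]; do
    set -- "$@" "-replace=$1"
    shift
    n=$((n - 1))
  done

  # replace.tfplan is an assumed file name; any path works.
  terraform plan "$@" -out=replace.tfplan && terraform apply replace.tfplan
}
```

For example, replace_resources aws_instance.my_vm_1 aws_instance.my_vm_2 would plan the replacement of both instances in one run. Because a saved plan is applied, Terraform does not prompt for approval at apply time, so the plan output should be reviewed first.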

Key Points

I hope this blog post was helpful to you in understanding and exploring the CLI commands – taint, untaint, and replace.

If you are struggling with Terraform automation and management, check out Spacelift. Get started on your journey by signing up for a Free Trial and taking it for a spin worldwide!