Over the last decade, infrastructure has evolved from a static, rigid asset into a programmable resource. Infrastructure as Code (IaC) has become so popular that most organizations have adopted it, but this significant power entails considerable responsibility: while IaC streamlines many parts of your workflow, it also becomes harder to manage over time.
The shift to IaC was just the beginning; management platforms for IaC look set to become a standard in the coming years.
One of the most popular IaC tools is Terraform and one of the platforms that helps manage your Terraform code is Terraform Cloud (TFC).
In this article, we will cover:
- What is Terraform Cloud
- Terraform Cloud features
- Terraform vs Terraform Cloud
- Terraform Cloud vs Terraform Enterprise
- Terraform Cloud benefits
- Terraform Cloud cost
- How to create a Terraform Cloud account
- Terraform Cloud workflows
- Terraform Cloud getting started (Tutorial)
- Publishing a module to the private registry with Terraform Cloud
- Terraform Cloud alternative
Terraform Cloud is a platform developed by HashiCorp that helps you manage your Terraform code. It enhances collaboration between developers and DevOps engineers, simplifies your workflow, and improves the overall security around your product.
1. Workspaces
Workspaces are the building blocks of Terraform Cloud. They are used in the workflows you define and are linked to a specific Terraform configuration. In a nutshell, they are responsible for:
- Storing your state
- Executing Runs – initialization, plans and applies for your configuration code
- Storing Environment Variables – Populate variables in your Terraform configurations
Access control can be defined at the workspace level, giving you the ability to control which users and teams can read/write/administer them.
You can run remote operations against TFC workspaces, as mentioned in the CLI-driven workflow. Even though you run the commands on your local machine, the workspace does the heavy lifting.
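Workspace-level access control can also be managed as code with HashiCorp's tfe provider. The sketch below is a minimal example using hypothetical organization, team, and workspace names; it grants a team plan-level permissions on an existing workspace:

resource "tfe_team" "developers" {
  name         = "developers"
  organization = "my-org"          # hypothetical organization name
}

data "tfe_workspace" "app" {
  name         = "app-prod"        # hypothetical existing workspace
  organization = "my-org"
}

resource "tfe_team_access" "developers_plan" {
  access       = "plan"            # can read the workspace and queue plans, but not apply
  team_id      = tfe_team.developers.id
  workspace_id = data.tfe_workspace.app.id
}

The access level can be "read", "plan", "write", or "admin", mirroring the read/write/administer controls mentioned above.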
2. Projects
Projects in Terraform Cloud are simply containers for workspaces. Every workspace belongs to a project, which makes workspaces easier to group. Projects exist to simplify assigning workspace access to the different teams inside your organization.
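Projects and the workspaces they contain can also be created programmatically via the tfe provider. A minimal sketch with hypothetical names:

resource "tfe_project" "networking" {
  organization = "my-org"                        # hypothetical organization
  name         = "networking"
}

resource "tfe_workspace" "vpc" {
  name         = "vpc-prod"                      # hypothetical workspace
  organization = "my-org"
  project_id   = tfe_project.networking.id       # groups the workspace under the project
}
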
3. Run
A run in Terraform Cloud manages the lifecycle of a Terraform operation that is happening against your Workspace. The typical process a run goes through is:
- Queuing → A run will be queued until it can be picked up by an available TFC worker
- Planning → Run a terraform plan against your workspace
- Cost Estimation → Shows you a cost estimate for your resources
- Policy Checking → If policies are enabled for your workspace, it will check to see if anything is violating what you have put in place
- Applying → If the plan and policy checking are successfully done, the code will get applied
4. Variables and Variable Sets
Terraform Cloud stores your variables securely, encrypting them at rest. You can provide plain-text values as well as sensitive ones; sensitive values are write-only.
Variable sets, on the other hand, let you reuse variables across multiple workspaces without declaring them multiple times. You create the variable set once, declare variables inside it, and then attach it to your workspaces.
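These steps can also be expressed as code with the tfe provider. The following is a minimal sketch, assuming hypothetical names; it creates a variable set, puts a sensitive environment variable inside it, and attaches it to an existing workspace:

variable "aws_secret_access_key" {
  type      = string
  sensitive = true
}

resource "tfe_variable_set" "aws_creds" {
  name         = "aws-credentials"               # hypothetical set name
  description  = "Shared AWS credentials"
  organization = "my-org"                        # hypothetical organization
}

resource "tfe_variable" "secret_key" {
  key             = "AWS_SECRET_ACCESS_KEY"
  value           = var.aws_secret_access_key
  category        = "env"                        # environment variable, not a Terraform variable
  sensitive       = true                         # write-only after creation
  variable_set_id = tfe_variable_set.aws_creds.id
}

data "tfe_workspace" "app" {
  name         = "app-prod"                      # hypothetical existing workspace
  organization = "my-org"
}

resource "tfe_workspace_variable_set" "attach" {
  variable_set_id = tfe_variable_set.aws_creds.id
  workspace_id    = data.tfe_workspace.app.id
}
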
5. Policies and Policy Sets
With Terraform Cloud, you get Sentinel and OPA policies to enforce security practices and governance throughout your workflow. These policies can also help with cost management and even restrict some parameters a resource may have.
Policy Sets are pretty similar to Variable Sets in the sense that they group together multiple policies and can be applied to multiple workspaces at once.
6. Run Tasks
In Terraform Cloud, you can integrate with third-party tools by using run tasks. With this feature, data is sent to external services at defined stages of a run. The service analyzes the data, and based on its response and the enforcement level set for the task, Terraform Cloud decides whether the run can continue.
There are currently 28 run tasks available, so if the tool you want to integrate with is not in the official run-task registry, integrating it will be difficult.
7. Single Sign-On (SSO)
You can achieve SSO in Terraform Cloud through an identity provider such as Okta or Microsoft Azure AD, or via a generic SAML integration. If you enable SSO for your organization, all non-admin users will have to sign in through SSO to access it. Admins can still access the platform with their normal credentials so they can fix any issues with the SSO configuration.
8. Remote State
When you are using Terraform locally, the state file is written to your local filesystem. Whenever you are working in a team, this approach is not going to cut it, as everybody working on the project will need to access that state. Remote State solves this issue, and Terraform Cloud has its own backend for storing the state.
If you are using the recommended VCS approach for your workflow, you won't need to define any backend configuration, as Terraform Cloud will use its own backend automatically.
However, if you are using the CLI approach, you will need to define the configuration yourself.
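With the CLI approach, this configuration is a cloud block in your Terraform code. The sketch below uses hypothetical organization and workspace names (a full, working example appears later in the tutorial):

terraform {
  cloud {
    organization = "my-org"            # hypothetical organization name

    workspaces {
      name = "networking-prod"         # hypothetical workspace name
    }
  }
}
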
9. Private Registry
Terraform Cloud offers a private registry feature for hosting your Terraform modules and providers. It works in the same way as the public Terraform registry does, giving you the ability to version modules and providers, and rendering the documentation of them. As they are private, they will only be accessible by the members of the organization that have access to them based on their permissions.
10. Agents
In Terraform Cloud, by default, all of your runs happen on public workers, but what if your organization wants runs isolated inside its own environment? That's why you have the option of using self-hosted agents. These agents allow your Terraform Cloud instance to communicate with isolated networks, letting compute instances or containers inside those networks act as runners for your workflows.
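Agent pools and their tokens can be provisioned with the tfe provider. A minimal sketch, assuming a hypothetical organization name:

resource "tfe_agent_pool" "internal" {
  name         = "internal-network"      # hypothetical pool name
  organization = "my-org"                # hypothetical organization
}

resource "tfe_agent_token" "internal" {
  agent_pool_id = tfe_agent_pool.internal.id
  description   = "datacenter-runner"    # token handed to the self-hosted agent process
}

A workspace then opts into the pool by setting its execution mode to "agent" and referencing the pool's ID.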
11. Drift Detection
In the IaC world, there will always be drift in your configurations. Sometimes it seems easier to fix things manually rather than fixing the code that caused the issue. This approach should be avoided, but when you need a quick fix, you may change something manually with the intention of fixing the code later. We tend to forget these changes, so even though we fixed something now, we've set the stage for a major failure in the future.
This is where drift detection comes into play. It periodically compares the actual infrastructure with your Terraform state, and the difference between them is the drift. Once drift is identified, you are notified in detail about the differences and can quickly fix the issue. This ensures the consistency of your infrastructure over time and helps you prevent many potential issues.
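Drift detection is part of Terraform Cloud's health assessments, which can be toggled per workspace. A minimal sketch using the tfe provider (organization and workspace names are hypothetical):

resource "tfe_workspace" "app" {
  name                = "app-prod"     # hypothetical workspace
  organization        = "my-org"       # hypothetical organization
  assessments_enabled = true           # turns on health assessments, including drift detection
}
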
Terraform, also known as Terraform OSS, is the open-source version of Terraform. It is free to use, but to have a functioning workflow you still need a CI/CD platform and somewhere to store the remote state, both of which incur costs. Terraform Cloud is a product that leverages Terraform, acts as a CI/CD platform for your code, and stores the state for you.
The main difference between Terraform Cloud and Terraform Enterprise is that Terraform Cloud is a SaaS product, while Terraform Enterprise is self-hosted. Whenever a new feature is released in the SaaS product, it typically takes some time (at least a month) to reach the Enterprise version.
Having a self-hosted product can be beneficial for an organization that needs to have isolated networks and better governance overall.
Check out top Terraform Enterprise alternatives.
Terraform Cloud is a TACOS (Terraform Automation and Collaboration Software) product, and usually, these products have the following benefits:
- Role-Based Access Control → Controls who can access and run jobs on certain workspaces
- Remote State Management → Secure storage and management of the state, enabling collaboration and avoiding conflicts through a locking mechanism
- Policy as Code → Blocking runs or giving warnings if established conditions are violated
- Version Control Integration → Using VCS as the source of truth for your workspaces will ensure better management of the lifecycle of your Infrastructure.
- Observability → Better visibility into what is happening in your workflow; seeing all the runs against a particular workspace helps with accountability and speeds up debugging
- Drift Detection → Knowing when things are changed outside your code is crucial for reliability.
Terraform Cloud recently switched to a Resources Under Management (RUM) model, so pricing can be hard to estimate. Every resource in the state incurs a cost, and keep in mind that even a security group rule counts as a resource.
They have three plans for this version:
- Free → up to 500 resources
- Standard → pay $0.00014 per resource per hour for every resource beyond the 500 threshold
- Plus → you will need to contact their sales team to get a quote
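As a rough illustration of the RUM math, using the Standard rate above and an average month of roughly 730 hours (the resource count here is purely illustrative):

locals {
  managed_resources = 700
  free_allowance    = 500
  billable          = max(local.managed_resources - local.free_allowance, 0)  # 200 billable resources
  hourly_rate       = 0.00014
  monthly_estimate  = local.billable * local.hourly_rate * 730                # about $20.44 per month
}
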
There are a couple of limitations in the Free and Standard tiers that will most likely drive customers toward the Plus plan; the new Terraform Cloud pricing model explains these in more detail.
Go to https://app.terraform.io and select create a free account:
You can either use your HCP account or create an account by providing a username, password, and email address.
Now that you are done with creating your TFC account, let’s dive into the workflow types you can use. There are two types of workflows that you can define in Terraform Cloud:
- CLI-driven workflow
- VCS-driven workflow
CLI-driven workflow
The CLI-driven workflow brings Terraform Cloud's features into Terraform's CLI workflow. It doesn't depend on a VCS, but it is still recommended to keep your Terraform configuration in a VCS, with versioning enabled, to facilitate rollbacks in case of errors.
With this method, you can use all of Terraform Cloud’s features, and you won’t have to worry about managing your state.
To use the CLI-driven approach, you will need to first log in to Terraform Cloud:
terraform login
This will redirect you to your browser and ask you to create a user token. You can add a description to this token and set up an expiration date.
After you generate this token, return to your terminal and paste it into the Terraform login prompt. The token will not be displayed again, so store it securely.
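Under the hood, terraform login saves the token in a CLI credentials file. You can achieve the same result manually with a credentials block in your CLI configuration file (~/.terraformrc on Unix systems); the token value below is a placeholder:

credentials "app.terraform.io" {
  token = "REPLACE_WITH_YOUR_USER_TOKEN"   # placeholder; never commit real tokens to version control
}
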
Now that you have logged in successfully, the next step would be to write the Terraform code and set up the Terraform Cloud backend for managing your state.
terraform {
  cloud {
    organization = "saturnhead"

    workspaces {
      name = "random_pet"
    }
  }
}

resource "random_pet" "this" {
  length    = 2
  separator = "-"
}
You will need to provide your organization name and also a workspace name for the cloud block to function properly. The workspace will be automatically created for you in Terraform Cloud.
Let’s see a plan in action:
cli-workflow git:(terraform_cloud) ✗ terraform plan
Running plan in Terraform Cloud. Output will stream here. Pressing Ctrl-C
will stop streaming the logs, but will not stop the plan running remotely.
Preparing the remote plan...
To view this run in a browser, visit:
https://app.terraform.io/app/saturnhead/random_pet/runs/run-ZvdBwGKvZZFjfvB3
Waiting for the plan to start...
Terraform v1.5.3
on linux_amd64
Initializing plugins and modules...
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # random_pet.this will be created
  + resource "random_pet" "this" {
      + id        = (known after apply)
      + length    = 2
      + separator = "-"
    }

Plan: 1 to add, 0 to change, 0 to destroy.
------------------------------------------------------------------------
Cost Estimation:
Resources: 0 of 0 estimated
$0.0/mo +$0.0
In this way, you can run supported terraform commands against your configuration without having to leave the CLI.
VCS-driven workflow
The VCS-driven workflow is, in most respects, the best way to use Terraform Cloud. Using a VCS as the single source of truth helps you achieve GitOps and minimizes human error and configuration drift.
To use this approach, you will need to create a VCS repository, add your Terraform code to it, and after that create a Terraform Cloud workspace.
Let’s suppose you already have the VCS repository in place and you want to create a workspace for it. From your Terraform Cloud account, go to Project & Workspaces and select the New Workspace option.
On the next screen, choose Version control workflow, as shown above, and then select your VCS provider.
After you are done selecting your VCS provider, you will need to choose the repository in the next screen, and you also have the option to filter them:
In the last screen, you can configure which branch you want to use, the working directory, and others.
We will deep-dive into how to use Terraform Cloud’s VCS-driven workflow later on in the post, when we will discuss a real-life example.
So far, you've seen examples of the CLI- and VCS-driven workflows using demo code. Let's look at a more realistic AWS example that generates a network configuration.
The code can be found here.
Step 1: Log in to Terraform Cloud
Before diving into the code and creating the workspace, we first need to log in to our Terraform Cloud organization.
Go to https://app.terraform.io and sign in with one of the presented options:
Step 2: Create a credentials variable set
After you have logged in successfully, let’s create a credentials variable set for our AWS account. Go to Settings and then select Variable sets as shown below:
Click on Create Variable Set, and configure it accordingly:
- Add a name
- Description (Optional)
- Choose the set scope
- Add Variables
For the set scope, I chose to apply it globally to all workspaces, because I want all of my workspaces to be able to connect to my AWS account. In a real-life scenario this is usually not what you want, which is why you also have the option to apply it to specific projects and workspaces.
Variables can be defined as either Terraform variables or environment variables, and you even have the option to mark them as Sensitive.
For the AWS Credentials variable set, I've saved my AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY as environment variables, the latter having a sensitive value:
You will need to use your own set of credentials, and as a best practice, at least the AWS_SECRET_ACCESS_KEY should be sensitive.
After you are done with this, click on Create variable set, and you should be good to go.
Step 3: Create a policy
From Settings, click on Policies and Create a New Policy.
Select Sentinel, add a name to your policy, and choose hard-mandatory enforcement, as this type of policy blocks a run if the policy check fails.
You can paste in the following code to check whether VPCs have the “Name” and “owner” tag keys. If these tags are not present, you won't be able to apply the code, as the policy will fail the run.
import "tfplan/v2" as tfplan

vpcs = filter tfplan.resource_changes as _, rc {
  rc.type is "aws_vpc" and
    (rc.change.actions contains "create" or rc.change.actions is ["update"])
}

required_tags = [
  "Name",
  "owner",
]

vpc_tags = rule {
  all vpcs as _, instances {
    all required_tags as rt {
      instances.change.after.tags contains rt
    }
  }
}

main = rule {
  (vpc_tags) else false
}
To use this policy, you will first have to create a Policy Set and add it to that set. Go back to settings and select Policy Sets → Connect a new Policy set.
The Terraform Cloud account I am using is on the Standard tier, so I cannot use versioned policy sets in this account. This creates a couple of issues, as you won't be able to easily use the Sentinel policy library without workarounds.
The good thing is that for this example, you won’t need it, but for production use cases, you will surely need to connect your policy sets via VCS. For now, select “No VCS connection,” and in the next screen, select Sentinel, add a name to the policy, and enforce it on all workspaces.
Step 4: Create a workspace
To create a workspace, we will repeat the same process as for the VCS-driven workflow. You will need to select the repository containing the AWS-related code.
Initially, the VPCs were missing the owner tag, and this is the error I received:
After modifying the code and adding the owner tag, this passed:
Step 5: Create and make changes to the IaC configuration
Let’s apply the above run and see the result:
You can see all the resources have been created successfully and the outputs we have defined are in place.
Let's add another VPC to the configuration and ensure it has the correct tags. To showcase resource changes too, let's also modify a tag value for the second VPC. If you are using the code shown above, you simply need to add another map to the vpc_params variable in the tfvars file.
In the end, it should look like this:
vpc_params = {
  vpc1 = {
    cidr_block = "10.0.0.0/16"
    tags = {
      Name : "vpc1"
      owner : "owner1"
    }
  }
  vpc2 = {
    cidr_block = "11.0.0.0/16"
    tags = {
      Name : "vpc2"
      owner : "owner2"
    }
  }
  vpc3 = {
    cidr_block = "12.0.0.0/16"
    tags = {
      Name : "vpc3"
      owner : "owner3"
    }
  }
}
After pushing the code, a plan job starts automatically:
We can apply the code as there are no policy issues with it, and everything goes smoothly:
Step 6: Destroy the Resources
To destroy the resources, navigate to your workspace's Settings, click Destruction and Deletion, and then click Queue destroy plan:
After clicking the Queue destroy plan button, you will need to enter the workspace name to confirm the deletion:
You will then need to confirm the destroy plan, as this will really delete everything.
After confirming, you will see everything being deleted, and at the end you will get a summary of what happened:
The process of publishing a module is pretty straightforward.
Initially, you will need to go to the Registry tab and select the publish option:
Then, you should choose the VCS provider and the repository:
After that, you can click on publish module and you will be redirected to the module itself.
On this page, you can see all the details related to the module itself:
- Readme → It is always a best practice to write documentation for your code to explain what it does, how you can use it, and what limitations it has
- Inputs → All the input variables that you will need to provide for the module to work
- Outputs → What resources are exported from the module and can be used in other configurations
- Dependencies → External modules that this module references. A module is considered external if it isn’t within the same repository.
- Resources → What resources this module is going to create
The module registry uses git tags for versioning, and as you can see in the above screenshot, this module is currently on version 1.0.7.
On the right-hand side of the same view, you will get other details about how to use the module and metrics related to the number of downloads:
To use the module in a configuration, copy the snippet from the usage instructions on the module page and add the necessary inputs.
Let’s take a look at the inputs to understand exactly what we have to add:
There are always two types of inputs: required and optional. In this example, kube_params is a required input with a complex map(object) type containing other parameters (some required, some optional), and tags is an optional map(string) input.
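For illustration, such an input would be declared along these lines; this is a simplified, hypothetical subset of the real variable, not the module's actual definition:

variable "kube_params" {
  description = "Parameters for the AKS clusters"
  type = map(object({
    name        = string                # required attributes
    rg_name     = string
    rg_location = string
    dns_prefix  = string
    node_count  = optional(number, 1)   # optional attribute with a default (Terraform 1.3+)
  }))
}

variable "tags" {
  type    = map(string)
  default = {}                          # optional input thanks to the default value
}
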
An example configuration that leverages this module will be similar to:
module "aks" {
  source  = "app.terraform.io/saturnhead/aks/az"
  version = "1.0.7"

  # insert required variables here
  kube_params = {
    kube1 = {
      name                = "kube1"
      rg_name             = "rg1"
      rg_location         = "westeurope"
      dns_prefix          = "kube"
      identity            = [{}]
      enable_auto_scaling = false
      node_count          = 1
      np_name             = "kube1"
      export_kube_config  = true
    }
  }
}
If you are looking for a TACOS that has more features than Terraform Cloud, supports tools beyond Terraform, such as Terragrunt, Pulumi, CloudFormation, Kubernetes, and Ansible, and is also cost-effective, Spacelift is the answer.
Spacelift has a fully customizable workflow, giving you the ability to control what happens before and after every phase the runner goes through, and full flexibility over the third-party tools you want to integrate, thanks to custom inputs and the notification policy.
With Spacelift’s policies, you can control various decision points inside the application, implementing powerful guardrails that ensure reliability.
When it comes to scheduling, Terraform Cloud only offers drift detection. With Spacelift, apart from drift detection, you can also schedule tasks and stack deletion.
Spacelift’s module registry does everything that Terraform Cloud’s registry is doing, and it also gives you the possibility of testing the module to ensure everything is working properly.
Self-service infrastructure can be easily implemented with Spacelift’s blueprints.
If you are already on Terraform Cloud and want to migrate away, Spacelift has got you covered with the Migration kit:
You can see a more detailed comparison between the products here.
TACOS and, more generally, IaC management and orchestration platforms are well on their way to becoming an industry standard, and Terraform Cloud offers great benefits when it comes to managing your Terraform workflow.
Spacelift is a great alternative that helps not only with your Terraform workflow but also with other IaC products, offering a more diverse feature set while being cost-effective.
Using TACOS will greatly improve collaboration between teams, reduce risks associated with deployments and drift, and of course, reduce businesses’ time to market.
If you don’t have a Spacelift account and you want to discover how flexible the platform is, you can start a free trial or book a demo with one of our engineers.
Manage Terraform Better with Spacelift
Build more complex workflows based on Terraform using policy as code, programmatic configuration, context sharing, drift detection, resource visualization and many more.