Terraform Files – How to Structure a Terraform Project

How to structure a Terraform project?

Starting a new Terraform project is exciting, but the first question is where and how to begin. Which file should be created first? As the project grows, we learn lessons about how it should have been structured from the beginning, often at a point when refactoring has become too costly.

Various aspects influence the way we manage our Terraform config in a repository. In this post, we will learn about them and discuss a few important strategies and best practices around structuring Terraform project files in an efficient and standardized way.

Basic Concepts and File Types

Any Terraform project is created to manage infrastructure in the form of code, i.e., IaC. Managing Terraform IaC involves at least the following:

  1. The cloud platform of choice, which translates to the appropriate provider configurations.
  2. Various resources, i.e., cloud components.
  3. State file management.
  4. Input and output variables.
  5. Reuse of modules and associated internal wiring.
  6. Infrastructure security standards.
  7. Developer collaboration and CI/CD workflow.

To begin writing a Terraform configuration while adhering to the best practices, we create the files below in the project’s root directory.

  1. provider.tf – containing the terraform block, the backend definition (for example, S3), provider configurations, and aliases.
  2. main.tf – containing the resource blocks which define the resources to be created in the target cloud platform.
  3. variables.tf – containing the declarations of the variables used in the resource blocks.
  4. output.tf – containing the outputs to be generated on successful completion of the apply operation.
  5. *.tfvars – containing the environment-specific values assigned to the variables.
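
As a rough illustration of how these files fit together, the sketch below shows a minimal AWS-based layout. The bucket name, region, variable names, and resource names are hypothetical placeholders chosen for this example; the comments mark where each file would begin.

```hcl
# provider.tf: terraform block, backend, and provider configuration
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }

  # Hypothetical bucket and key; the bucket is assumed to exist already
  backend "s3" {
    bucket = "example-terraform-state"
    key    = "demo/terraform.tfstate"
    region = "eu-west-1"
  }
}

provider "aws" {
  region = var.aws_region
}

# variables.tf: declarations of the inputs used by the resource blocks
variable "aws_region" {
  type        = string
  description = "Region in which the resources are created"
  default     = "eu-west-1"
}

variable "ami_id" {
  type        = string
  description = "AMI used for the example instance"
}

variable "instance_type" {
  type    = string
  default = "t3.micro"
}

# main.tf: the resources to be created
resource "aws_instance" "app" {
  ami           = var.ami_id
  instance_type = var.instance_type
}

# output.tf: values printed after a successful apply
output "instance_id" {
  value = aws_instance.app.id
}
```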

The files are not required to have the exact names listed above. However, these are general conventions used in Terraform projects. It is also not unusual to need additional files. For example, if the Terraform project manages many resources, the main.tf file ends up containing many lines of code, which makes it difficult to navigate.

There are multiple ways to supply values for variables: .tfvars files, CLI arguments, and environment variables. Terraform loads the values from the terraform.tfvars file by default.

Files with custom names ending in .auto.tfvars are also loaded automatically. Any other file, however, has to be passed explicitly using the -var-file argument in the CLI.

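When the same variable is set in more than one of these files, the value loaded last wins. Terraform loads terraform.tfvars first, then *.auto.tfvars files in lexical order, and finally any files passed with -var-file. The order of precedence, from highest to lowest, is therefore as shown below.

custom_file_name (passed with -var-file) > *.auto.tfvars > terraform.tfvars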
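
To illustrate how Terraform resolves a variable that is set in more than one file, here is a minimal sketch with three hypothetical files. The variable name and values are assumptions made for this example, and the comments mark where each file would begin.

```hcl
# terraform.tfvars: loaded automatically, and loaded first
instance_type = "t3.micro"

# overrides.auto.tfvars: loaded automatically after terraform.tfvars, so it overrides it
instance_type = "t3.small"

# custom.tfvars: loaded only when passed with -var-file=custom.tfvars,
# and overrides both of the files above
instance_type = "t3.large"
```

With all three files present, a plain terraform plan would use "t3.small", while terraform plan -var-file=custom.tfvars would use "t3.large".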

The screenshot below shows the directory structure of a Terraform project when we start writing the configurations for the first time.

terraform project structure

As noted above, a single main.tf can grow too large to navigate. In such situations, it may make sense to split it into multiple files according to a certain pattern. A couple of examples of slicing the main.tf file are:

  1. By services – the team may include all the components required to support a particular business service (databases, compute resources, network configs, etc.) in a single file named after that service. Thus, while doing root cause analysis (RCA), we already know which Terraform file needs to be investigated.
  2. By components – the resource blocks may be segregated based on the nature of the components used. A Terraform project may have a single file to manage all the databases. Similarly, all network configurations, compute resources, etc., are managed in their individual files.

Note that Terraform does not load .tf files located in sub-directories of the directory it is run from. This brings us to modules, which we discuss later in this post. Irrespective of how the Terraform source code is segregated, the intention should be to enable easy analysis and navigation.

One of the advantages of using Terraform to manage infrastructure is consistency. Owing to this, multiple environments can be spun up from the same project source code. This makes it easy to create sub-production and ephemeral environments that mirror production. The sub-production environments are usually scaled-down versions of production.

This variation is achieved using the variables declared in the variables.tf file, while .tfvars files specify the scale of each environment. Thus, it is also common to have multiple .tfvars files alongside the rest of the Terraform configs.

For example, if we have to create three environments – prod, qa, and dev, then the following three .tfvars files are created with clear names.

  1. variables-dev.tfvars
  2. variables-qa.tfvars
  3. variables-prod.tfvars
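
A minimal sketch of how the same declarations can be scaled differently per environment is shown below. The variable names and values are hypothetical and chosen only for illustration; the comments mark where each file would begin.

```hcl
# variables.tf: shared declarations, identical for every environment
variable "instance_count" {
  type        = number
  description = "Number of application instances"
}

variable "instance_type" {
  type        = string
  description = "Size of each application instance"
}

# variables-dev.tfvars: scaled-down development environment
instance_count = 1
instance_type  = "t3.micro"

# variables-prod.tfvars: full-size production environment
instance_count = 3
instance_type  = "m5.large"
```

The matching file is then selected explicitly, for example with terraform plan -var-file=variables-dev.tfvars.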

Multiple environments are usually managed using workspaces in Terraform. The .tfvars file passed to Terraform has to match the workspace currently selected, and a manual error here can be risky. Spacelift provides a way to manage workspaces in the form of stacks. A set of environment variables defined in a context is associated with each stack. This reduces the risk and promotes the reusability of variable values.
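
One way to reduce the risk of pairing a workspace with the wrong values is to derive them from the workspace name itself. The sketch below is only one possible approach, not a recommendation from this post; it assumes workspaces named dev, qa, and prod, and variables ami_id and instance_type declared elsewhere in the project.

```hcl
locals {
  # Hypothetical sizing per workspace
  instance_count_by_workspace = {
    dev  = 1
    qa   = 2
    prod = 3
  }

  # Fall back to a single instance for any workspace not listed, such as "default"
  instance_count = lookup(local.instance_count_by_workspace, terraform.workspace, 1)
}

resource "aws_instance" "app" {
  count         = local.instance_count
  ami           = var.ami_id
  instance_type = var.instance_type
}
```

Here terraform.workspace evaluates to the name of the currently selected workspace, so the instance count follows the workspace without any extra CLI argument.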

Automatically Managed Files and Directories

The previous section focused mostly on the files we deal with when we begin to work on a Terraform project. In this section, we will see the files created automatically by Terraform when the configurations are tested, applied, and destroyed. Understanding these files helps us structure the Terraform code better.

The first step to testing our configuration is initializing the repository. When we run terraform init, Terraform identifies the “required_providers” and downloads the appropriate plugin binary from the Registry. These binaries are stored in the “.terraform” directory located at the root of the project.

The init action also creates a “.terraform.lock.hcl” file, which records the selected provider versions and the hashes of the downloaded binaries for consistency. We do not edit these files manually. They are maintained automatically by Terraform.
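
For orientation, an entry in the lock file has roughly the shape sketched below. The version and constraint are illustrative assumptions, and the checksum strings are placeholders, since the real values are computed and written by Terraform during init.

```hcl
# .terraform.lock.hcl: generated by terraform init, not edited by hand
provider "registry.terraform.io/hashicorp/aws" {
  version     = "5.31.0" # illustrative version selected during init
  constraints = "~> 5.0" # constraint taken from required_providers
  hashes = [
    "h1:<checksum written by Terraform>",
    "zh:<checksum written by Terraform>",
  ]
}
```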

The screenshot below shows the directory structure after running the init command.

Once the project is initialized, we apply these configurations to create the cloud resources. An apply or destroy operation creates an additional file: terraform.tfstate. This is the Terraform state file, which is critical and automatically managed by Terraform. This file is either managed locally (default backend) or remotely. When working in teams, a remote backend should be used.

Find more details about Terraform’s state in the blog post – Managing Terraform State – Best Practices & Examples.

Given its importance, Terraform also creates a backup file (terraform.tfstate.backup) for the state.

This is shown in the screenshot below.

terraform project structure with backup file

Terraform implements a locking mechanism that helps avoid race conditions and prevents state file corruption. The locking mechanism depends on the type of backend used.

For example, when using S3 as a remote backend, Terraform uses an AWS DynamoDB table to manage the lock.
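
A minimal sketch of such a backend configuration follows. The bucket, key, region, and table names are hypothetical. The lock table needs a string partition key named LockID, and it is typically created outside the project that uses it, for example in a separate bootstrap configuration.

```hcl
terraform {
  backend "s3" {
    bucket         = "example-terraform-state"   # hypothetical bucket
    key            = "network/terraform.tfstate" # path of the state object
    region         = "eu-west-1"
    dynamodb_table = "example-terraform-locks"   # table used for state locking
    encrypt        = true
  }
}

# Assumed to live in a separate bootstrap configuration
resource "aws_dynamodb_table" "terraform_locks" {
  name         = "example-terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```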

In the case of the local backend, this lock is managed using an additional file that exists for the period of operation (plan, apply, destroy) being performed. Once the operation is completed, the file is removed.

In the screenshot below, we can see the file named “.terraform.tfstate.lock.info” being generated.

terraform tfstate lock info

Additional File Types

In addition to the Terraform configurations, over the life of the project, we may need to add more files to serve various purposes. The list below provides some examples.

  1. README.md – as a general best practice, every repository should contain a README.md file to include an overview of the source code, usage instructions, and any other information deemed relevant and important.
  2. Automation scripts – scripts (Bash, Python, Go, etc.) may be needed in the CI/CD workflow, to run on the target resource being created, or to build source code. Shell scripts in particular are very powerful, and there are many reasons to use them.
  3. YAMLs – the most common usage of YAML files in this context is while implementing CI/CD automation.

.gitignore

Since we are discussing the Terraform project structure, the .gitignore file plays a special role. As observed in previous sections, a Terraform project consists of multiple kinds of files and binaries.

However, not all the files and directories should be part of the Git repository, for several reasons. The following are some of the entries included in the .gitignore file of a generic Terraform project.

  1. *.tfstate and *.tfstate.backup – Terraform state files should never be pushed to Git repositories, for a couple of reasons listed below. Note that when a remote backend is used, the state file is not stored on the local system at all.
    • Security – State files may store sensitive details like keys, tokens, passwords, etc.
    • Collaboration – When working in teams, having each developer manage the state file locally poses a high risk of state files becoming inconsistent or being overwritten.
  2. Binaries – the provider plugins downloaded locally or on a Terraform host, in the .terraform directory, should not be part of the Git repository. These binaries are large, so pushing and pulling them to and from a remote Git repo wastes network bandwidth.
  3. crash.log – Crash log files are not always required, especially when a crash occurs due to the local environment.
  4. *.tfplan – The terraform plan command can save its output to a file (using the -out option), which is then used during the apply phase. This file does not need to be stored in a remote Git repository.

The screenshot below shows the .gitignore file we created in our project repository. The template comes from GitHub’s gitignore repository, which also defines guidelines for writing good .gitignore files.

gitignore

Modules and Dependencies

The sections so far have covered the basics of a generic and simple Terraform project. For projects that manage a smaller set of infrastructure, the structure described above is enough. For larger projects, however, this simple structure may not be sufficient.

Modules are a great way to follow the DRY (Don’t Repeat Yourself) principle when working with Terraform Infrastructure as Code. Modules encapsulate a set of Terraform config files created to serve a specific purpose. As discussed earlier, we may slice and group the infrastructure based on the type of components or the service they support.

In the diagram below, the Terraform project uses two modules that help to create VPCs and Databases. These modules are created based on the types of components – VPC and Databases. These modules are reusable.

Thus, it becomes easy to plug them into other Terraform projects.

project root directory

The project root directory contains its set of Terraform config files. Apart from the general resource blocks, module blocks are also declared in these config files.

The module blocks refer to the source of these modules – it could be a remote git repository, Terraform Registry, or a locally developed module. For example, the module block below uses a module stored locally in a given path.

module "project_vpc" {
  # local module paths must start with ./ or ../
  source = "./path/to/vpc/module/directory"

  # inputs (required input variable in VPC module)
  cidr_range = "10.0.0.0/24"
}
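
To make the internal wiring more concrete, the sketch below shows what such a local VPC module might contain and how the root project could consume its output. The directory layout, resource names, and the vpc_id output are assumptions made for this example, not the contents of an actual module; the comments mark where each file would begin.

```hcl
# modules/vpc/variables.tf: the input expected by the module
variable "cidr_range" {
  type        = string
  description = "CIDR block for the VPC"
}

# modules/vpc/main.tf: the resources the module encapsulates
resource "aws_vpc" "this" {
  cidr_block = var.cidr_range
}

# modules/vpc/outputs.tf: values exposed to the calling project
output "vpc_id" {
  value = aws_vpc.this.id
}

# main.tf in the project root: using the module output in another resource
resource "aws_subnet" "app" {
  vpc_id     = module.project_vpc.vpc_id
  cidr_block = "10.0.0.0/25"
}
```

The root project only sees the module's input variables and outputs, which is what keeps the module reusable across projects.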

To know more about how modules work, check out our Terraform Modules tutorial.

The Terraform Registry is a great resource for finding modules to reuse. Spacelift’s module registry also helps manage modules in an easier and more maintainable way. In addition to all the features offered by the Terraform Registry, Spacelift’s module registry is integrated with Stacks, environments, contexts, policies, and worker pools.

As an example, let us use the terraform-aws-modules/vpc/aws module from the Terraform Registry in our project. Add the code below to our main.tf file.

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "3.19.0"
}

This is all we need to do to use an already existing module in our project.
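
In practice, a module call usually also passes input values. The sketch below extends the same block with a few of the inputs this VPC module accepts; the name, CIDR ranges, and availability zones are hypothetical values chosen for illustration, and this block would replace the shorter one rather than sit beside it.

```hcl
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "3.19.0"

  # Hypothetical values for illustration
  name = "example-vpc"
  cidr = "10.0.0.0/16"

  azs             = ["eu-west-1a", "eu-west-1b"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24"]

  enable_nat_gateway = true
}
```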

When we add a module, the project must be reinitialized by running terraform init again. During reinitialization, the module’s Terraform files are downloaded locally into the “.terraform” directory (under .terraform/modules). This is the same directory where the provider binaries are stored.

The project directory is shown below after initialization.

Do not get confused about the additional files included in the .terraform directory. They belong to the VPC module we used in our main.tf file.

This is to show how modules are managed internally, and if customization is needed, it is possible to do the same here. Technically, at this moment, we can apply these changes, and a VPC will be created and managed in the state file of our project, i.e., in the project root directory.

It is possible for modules to, in turn, have nested modules. Also, note that all the additional files related to the VPC module downloaded into the .terraform directory are ignored by Git, because the .terraform directory is listed in our .gitignore file. So these will never be pushed to the repository.

Managing Complexity

Organizations that have advanced on their IaC adoption journey typically have a complex set of infrastructures to be managed. Some of the key aspects which contribute to this complexity and their remedies are:

Complex infrastructure requirement

The infrastructure design may consist of many advanced components interlinked with each other, redundant backup systems, complex network and firewall requirements, etc. For such projects, or when the projects eventually become complex, the use of modules is suggested.

With modules, breaking the Terraform IaC monolith into manageable and relatable sets is possible.

Strict security controls

When a project grows, its attack surface grows with it, and so do the security requirements: more combinations of components have to be reviewed to mitigate the risks.

To go a step further, it also makes sense to integrate a policy-as-code solution with Terraform IaC to enforce complex security guardrails.
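
A dedicated policy-as-code tool is outside the scope of this sketch, but Terraform itself offers a lightweight guardrail in the form of variable validation. The example below is an assumption-based illustration with a hypothetical variable name; it complements a policy engine rather than replacing one.

```hcl
variable "allowed_ssh_cidr" {
  type        = string
  description = "CIDR block allowed to reach port 22"

  # Reject the value at plan time if SSH would be open to every address
  validation {
    condition     = var.allowed_ssh_cidr != "0.0.0.0/0"
    error_message = "SSH must not be open to the whole internet."
  }
}
```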

Standardization

To avoid reinventing the wheel, especially on larger projects, it makes sense to follow a modular approach and standardize the implementation of components for reuse.

A central repository to develop and host Terraform modules that fundamentally address the organization’s policies is of great value. The projects using these modules can readily get going without worrying about adhering to company practices.

Such modules should be developed by center of excellence (COE) teams responsible for addressing standardization initiatives and enabling customer-facing teams to accelerate delivery.
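
As a sketch of how a project might consume a module from such a central repository, the block below uses a Git source pinned to a tag. The repository URL, sub-directory, tag, and input name are hypothetical.

```hcl
module "network" {
  # "//vpc" selects a sub-directory of the repository, "ref" pins a tag
  source = "git::https://example.com/platform-team/terraform-modules.git//vpc?ref=v1.2.0"

  cidr_range = "10.0.0.0/24"
}
```

Pinning to a tag means consuming projects pick up a new module version only when they change the ref deliberately.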

Multiple environments

Identical and scaled-down copies of the production environment need to be managed to facilitate development and quality analysis. The ability to create temporary, partially usable infrastructure is also desired.

Regional deployments

Organizations may need to manage deployments in multiple regions and serve the users in each region with custom features and services. This could grow into more complex requirements.
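
One common way to handle several regions in a single project is provider aliases, as in the sketch below. The regions, alias name, and module path are assumptions for illustration, and the local VPC module is assumed to accept a cidr_range input.

```hcl
# Default provider configuration
provider "aws" {
  region = "eu-west-1"
}

# Additional configuration for a second region
provider "aws" {
  alias  = "us_east"
  region = "us-east-1"
}

# Uses the default provider configuration
module "vpc_eu" {
  source     = "./modules/vpc"
  cidr_range = "10.0.0.0/24"
}

# Uses the aliased provider configuration
module "vpc_us" {
  source     = "./modules/vpc"
  cidr_range = "10.1.0.0/24"

  providers = {
    aws = aws.us_east
  }
}
```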

Key Points

The importance of structuring a Terraform project is often realized only in the later stages of IaC adoption, when the infrastructure design grows. It is important to understand the files we create and those generated automatically in a generic Terraform project setup, so that customizations are implemented in an easy and maintainable way.

Note: New versions of Terraform are placed under the BUSL license, but everything released up to and including version 1.5.x stays open-source. OpenTofu is an open-source version of Terraform that expands on Terraform’s existing concepts and offerings. It is a viable alternative to HashiCorp’s Terraform, being forked from Terraform version 1.5.6. OpenTofu retains the features and functionality that made Terraform popular among developers while also introducing improvements and enhancements. OpenTofu works with your existing Terraform state file, so migrating to it should be straightforward.

In this post, we discussed how complexity quickly increases in growing projects. We also discussed a few approaches to managing it by structuring the files based on the services being supported or the nature of the infrastructure components being managed.

Spacelift helps simplify these challenges to a large extent by providing CI/CD automation around infrastructure management. Infrastructure Stacks are created based on Git repositories, which lets the developers focus on their IaC development by taking care of automatically provisioning the infrastructure after PR merges.

Along with state file management, Spacelift uses Contexts, which are similar to “reusable environment variables”. Once set, it is possible to associate contexts to multiple stacks readily. Spacelift also manages the module registry, which is integrated with Stacks, policies, worker pools, etc. 
