OpenTofu at Scale: 4 Strategies & Scaling Best Practices

Learning OpenTofu feels straightforward. The familiar HCL syntax makes resource provisioning accessible from day one, but when your infrastructure codebase expands across multiple teams, environments, and complex dependencies, scaling challenges quickly emerge. 

Managing OpenTofu at scale requires strategic workflow planning, robust automation, and careful consideration of collaboration patterns.

This article explores four proven approaches to managing OpenTofu workflows at enterprise scale. You’ll discover their strengths, limitations, and the specific problems each approach solves.

What is OpenTofu?

OpenTofu is the open-source successor to Terraform, forked from the 1.5.x line, the last one released under an open-source license before HashiCorp moved Terraform to the BSL. It remains compatible with Terraform configurations written for that line while expanding capabilities through community-driven development.

OpenTofu started as a Terraform fork, so it inherits Terraform’s architectural patterns, including its advantages and scaling challenges. 

As your infrastructure grows, you’ll encounter familiar problems: state management complexity, collaboration bottlenecks, and the need for consistent deployment patterns across teams.

1. Local development

Starting with local OpenTofu execution establishes the fundamental workflow patterns. You install the OpenTofu binary, configure provider credentials, and execute commands directly from your development machine. Working this way teaches you the core init, plan, and apply cycle, which every more advanced setup builds on.

Consider this typical local workflow:

# Initialize your OpenTofu configuration
tofu init

# Plan changes with variable files
tofu plan -var-file="prod.tfvars"

# Apply with approval
tofu apply -var-file="prod.tfvars"

This approach offers immediate benefits. You maintain direct access to the OpenTofu CLI, enabling quick state operations like tofu import, tofu state mv, and tofu output commands. Debugging becomes straightforward since you control the execution environment completely.
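To make those state operations concrete, here is a minimal shell sketch. The helper names rename_in_state and adopt_resource, and the resource addresses and IDs in the commented examples, are hypothetical; only the tofu subcommands themselves come from the CLI.

```shell
# Sketch of the state operations mentioned above, wrapped in small helpers.
# Helper names, addresses, and IDs are illustrative assumptions.

# Rename a resource in state after renaming it in code, so the next plan
# does not propose a destroy/create. A local backup of the state is taken first.
rename_in_state() {
  local from="$1" to="$2" backup
  backup="state-backup-$(date +%Y%m%d%H%M%S).tfstate"
  tofu state pull > "$backup" || return 1
  tofu state mv "$from" "$to"
}

# Bring an existing, manually created resource under OpenTofu management.
adopt_resource() {
  local address="$1" id="$2"
  tofu import "$address" "$id"
}

# Example calls (commented out so that sourcing this file changes nothing):
# rename_in_state aws_instance.web aws_instance.frontend
# adopt_resource aws_s3_bucket.logs my-existing-logs-bucket
# tofu output -json
```

If your configuration has required variables, tofu import needs the same -var-file argument that you pass to plan and apply.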

You can enhance local development with tools like pre-commit-tofu, which runs linting and validation before commits. This prevents basic errors from being introduced into your code base.

But let’s be real for a second: You can’t consider local development a management technique at scale. Local environments work for a single developer who wants to prototype rapidly and learn the OpenTofu fundamentals. Local development offers no collaboration or safety mechanisms, and it exposes you to race conditions on the state file and to state file security concerns.

At the same time, if you’re working alone or in a small team and just starting with OpenTofu, local development provides the fastest path to productivity. You’ll typically recognize you’ve outgrown this approach as soon as a second person starts changing the same infrastructure.

2. The generic CI/CD pipeline route

Building custom OpenTofu automation through your existing CI/CD platform addresses many local development limitations. You create pipelines that execute OpenTofu commands in response to repository events.

Here’s how generic CI/CD pipelines for infrastructure typically look and how they evolve.

Phase 1: Basic integration

Your CI/CD system executes tofu plan on pull requests and tofu apply on merged changes. This immediately removes manual execution and provides consistent logging.

Note that some teams prefer to apply before merging, so the exact trigger depends on how you structure your workflow.

Phase 2: Enhanced workflows

You add linting with tflint, security scanning with tfsec, trivy, or checkov, and formatting checks with tofu fmt.

Pull request status checks prevent problematic code from even running tofu plan.
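As a rough illustration, the checks above could be grouped into one script that a pull request job calls. The function name is an assumption, tflint and trivy must already be installed on the runner, and the flags shown are one reasonable choice rather than a required setup.

```shell
# Run the static checks in sequence and report every failure, not just the first.
# run_static_checks is a hypothetical helper name.
run_static_checks() {
  local failed=0

  # validate needs providers and modules, but not the real backend.
  tofu init -backend=false -input=false > /dev/null || { echo "FAIL: init"; return 1; }

  tofu fmt -check -recursive     || { echo "FAIL: formatting"; failed=1; }
  tofu validate                  || { echo "FAIL: validation"; failed=1; }
  tflint --recursive             || { echo "FAIL: tflint"; failed=1; }
  # trivy exits 0 on findings unless told otherwise.
  trivy config --exit-code 1 .   || { echo "FAIL: security scan"; failed=1; }

  return "$failed"
}

# Usage in a pipeline step:
# run_static_checks || exit 1
```

Collecting all failures before returning means a contributor sees formatting, lint, and scan problems in a single run instead of fixing them one pipeline run at a time.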

Phase 3: Advanced features

You implement drift detection, policy as code, automated testing, and artifact management for OpenTofu modules.
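Drift detection in a generic pipeline is often built on the -detailed-exitcode flag of tofu plan, which exits with 0 when nothing would change, 1 on error, and 2 when changes are present. The sketch below assumes it runs on a schedule against the main branch, where no changes are pending, so exit code 2 is treated as drift. The function name and the status strings are hypothetical.

```shell
# Classify the result of a scheduled plan for one configuration directory.
# check_drift and the printed status strings are illustrative assumptions.
check_drift() {
  local dir="${1:-.}" rc=0

  tofu -chdir="$dir" plan -detailed-exitcode -input=false -no-color > /dev/null || rc=$?

  case "$rc" in
    0) echo "in-sync" ;;          # code and real infrastructure match
    2) echo "drift" ;;            # plan contains changes nobody committed
    *) echo "error"; return 1 ;;  # plan itself failed
  esac
}

# Usage in a scheduled job:
# status="$(check_drift environments/prod)" || exit 1
# [ "$status" = "drift" ] && echo "Drift detected in prod"
```

This only tells you that something differs. Deciding who gets notified and whether to reconcile automatically is logic you still have to write and maintain yourself.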

Example GitHub Actions pipeline that runs tofu plan and runs policy checks:

name: OpenTofu plan

on:
 pull_request:
   branches:
     - main
   paths:
     - "**.tf"
     - "**.tfvars"
     - "**.rego"
jobs:
 plan:
   runs-on: ubuntu-latest
   steps:
     - uses: actions/checkout@v4

     - name: Setup OpenTofu
       uses: opentofu/setup-opentofu@v1
       with:
         tofu_version: 1.10.3

     - name: Setup OPA
       run: |
         curl -L -o opa https://openpolicyagent.org/downloads/v0.59.0/opa_linux_amd64_static
         chmod 755 ./opa
         sudo mv ./opa /usr/local/bin/

     - name: Configure AWS Credentials
       uses: aws-actions/configure-aws-credentials@v4
       with:
         aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
         aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
         aws-session-token: ${{ secrets.AWS_SESSION_TOKEN }}
         aws-region: eu-west-1

     - name: OpenTofu Init
       run: tofu init

     - name: OpenTofu Plan
       run: tofu plan -out=tfplan.binary

     - name: Show Plan
       run: tofu show -json tfplan.binary > tfplan.json

     - name: OPA Check
       run: |
         VIOLATIONS=$(opa eval --format raw --data policies/restrict-ec2-type-t2micro.rego --input tfplan.json "data.terraform.deny")
         if [ "$VIOLATIONS" != "[]" ]; then
           echo "Policy violations found:"
           echo "$VIOLATIONS"
           exit 1
         fi

     - name: Add Plan to PR
       run: |
         tofu show -no-color tfplan.binary > tfplan.txt
         echo "\`\`\`hcl" >> plan.txt
         cat tfplan.txt >> plan.txt
         echo "\`\`\`" >> plan.txt
         gh pr comment ${{ github.event.pull_request.number }} -F plan.txt
       env:
         GH_TOKEN: ${{ secrets.GH_TOKEN }}

Pros and cons of generic CI/CD for infrastructure management

A generic CI/CD pipeline addresses some of the biggest challenges of local development, but it lacks several features that managing infrastructure as code requires.

  1. Time: Building CI/CD pipelines for infrastructure is time-consuming, and the pipelines are hard to maintain. The visibility they offer is limited to individual runs, which makes it harder to connect workflows across tools and processes and to keep vulnerabilities under control.
  2. State management: Without proper state management and drift detection, your infrastructure can diverge from your code without anyone noticing. Your DevOps teams will have a hard time diagnosing issues, which can lead to downtime and lost customers.
  3. Policy as code: A built-in policy as code engine is the guardian of your infrastructure. Without the ability to automatically enforce security and compliance rules, every change your team makes increases the risks associated with your business. You can see in the above GitHub Actions example that you still have to implement some logic for policy management, apart from the policy itself.
  4. Observability: Generic CI/CD doesn’t provide observability into your deployed infrastructure resources, their health, and everything associated with configuration management.  These features make it easy for your team to spot and fix issues before they impact you. Without them, your uptime decreases and your operating costs go up. 
  5. Dependencies between workflows: When using a CI/CD tool for infrastructure orchestration, you don’t have an easy mechanism for building dependencies between workflows and sharing outputs. These mechanisms are important because they allow your teams to work faster, safer, and smarter. Without them, it takes longer to get to market, and you risk falling behind your competition.

Some of the above features can be implemented in generic CI/CDs, but they are really hard to manage and maintain.
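To show what the dependency point looks like when done by hand, here is a minimal sketch that reads an output from one configuration and passes it to another through a TF_VAR_ environment variable. The directory names, the vpc_id output and variable, and the function name are all hypothetical.

```shell
# Feed an output of an upstream configuration into a downstream plan.
# plan_app_with_network_output, the directories, and vpc_id are assumptions.
plan_app_with_network_output() {
  local network_dir="$1" app_dir="$2" vpc_id

  # Read a single output value from the upstream configuration's state.
  vpc_id="$(tofu -chdir="$network_dir" output -raw vpc_id)" || return 1

  # OpenTofu maps TF_VAR_<name> environment variables to input variables.
  TF_VAR_vpc_id="$vpc_id" tofu -chdir="$app_dir" plan -input=false
}

# Usage:
# plan_app_with_network_output infra/network infra/app
```

The script works for two configurations, but ordering, retries, and triggering the downstream run whenever the upstream one changes all remain your responsibility, which is where the maintenance cost comes from.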

3. Use open-source infrastructure management platforms

Atlantis transforms OpenTofu collaboration by providing GitOps for infrastructure. It monitors your repository for pull requests and automatically executes OpenTofu commands based on predefined workflows.

Core Atlantis workflow:

  1. DevOps engineer opens a pull request with OpenTofu changes
  2. Atlantis automatically runs tofu plan and comments results on the pull request
  3. Team reviews the plan output directly in the pull request
  4. Approved changes trigger tofu apply via comment commands
  5. Atlantis updates pull request with apply results

Atlantis offers some important features that take your workflows to the next level, such as seamless Git workflow integration, automated plan/apply execution, a pull request comment interface, and webhook-based real-time updates.

However, gaps include no drift detection, limited policy enforcement capabilities, and the need for custom development for advanced features.

Atlantis provides excellent value for teams prioritizing simplicity and Git workflow integration over advanced management features, but it lacks multi-tool orchestration capabilities.

Read more: Top 10 Atlantis Alternatives

4. Infrastructure orchestration platforms: Spacelift

Advanced orchestration platforms treat OpenTofu as one component in a broader infrastructure delivery pipeline. Spacelift coordinates multiple tools, manages complex dependencies, and provides comprehensive workflow automation.

With Spacelift, you can easily integrate OpenTofu workflows with other tools such as Ansible, Kubernetes, Pulumi, or CloudFormation, create dependencies between them, and share outputs. 

Dependencies can be nested to as many levels as you need. When a change is made to a parent configuration, runs are queued on its child configurations, and they start as soon as the parent’s run finishes successfully. In this way, you can solve the terralith problem while combining multiple tools.

Some simple examples:

  • Create an OpenTofu configuration that spawns EC2 instances, and then send them as an output to an Ansible configuration to be configured.
  • Create an OpenTofu configuration that creates an EKS cluster, and then send the configuration details to a Kubernetes configuration to deploy certain resources inside the newly created cluster.

Spacelift also offers an out-of-the-box policy framework based on Open Policy Agent. With it, you can easily:

  • Control what happens when a pull request is opened or merged
  • Control what kind of resources your engineers can create, and what kind of parameters they have
  • Enforce tagging
  • Control how many approvals you need for runs, and what kind of tasks can run using your Stack
  • Control where to send notifications, and even take actions from these notifications in Slack or MS Teams

Spacelift also offers drift detection and remediation. It shows you when your configurations were modified outside of your processes and what has changed, and it gives you a way to fix the drift and return your infrastructure to the single source of truth you should have: your VCS.

With Spacelift, you can also implement self-service infrastructure using Blueprints. These Blueprints are YAML templates your platform/DevOps teams can use to configure everything related to your workflow, including tools, versions, policies, dependencies, drift detection schedules, and more. 

These templates generate a form your developers can easily use to spin up infrastructure without needing to know anything about how they work behind the scenes.

You can also add advanced scheduling to your Blueprints, making it easy to implement ephemeral infrastructure that self-deletes and even recreates based on schedules. Spacelift integrates with ServiceNow using these Blueprints, so your developers are met where they are, without needing to learn a new tool. This enables you to move fast, while staying in control.

If you want to learn more about what Spacelift can do to improve your workflow, read this article. If you want to learn more about what makes Spacelift secure, read this one.

OpenTofu scaling best practices

Regardless of your chosen approach, there are some best practices you should implement when you are using OpenTofu at scale:

  • State management – Always use remote state and implement locking to ensure your configurations are safe. You should also take advantage of OpenTofu’s built-in state encryption.
  • Module architecture and reusability – You should structure your OpenTofu code using modules for consistent, reusable components. These modules should have comprehensive documentation, use semantic versioning, and be easy to use and upgrade.
  • Take advantage of OpenTofu functions – Built-in functions let you transform and combine variables and local values, for example by merging maps or formatting names, so one configuration can adapt to different inputs.
  • Use dynamic blocks when necessary – Dynamic blocks make it easy to implement DRY configurations, but overusing them may unnecessarily complicate the code. If your resources have two static blocks that never change, dynamic blocks are not necessary, but if you have more complex configurations, you should take advantage of their capabilities.
  • Leverage OpenTofu variable validation – Don’t leave anything to chance. Your input variables are very important, and you should ensure that only values that make sense are accepted.
  • Implement testing – OpenTofu offers out-of-the-box testing capabilities, so build tests for your IaC code to ensure it does what it is supposed to.
  • Implement security vulnerability scanning – While OpenTofu doesn’t offer a mechanism out of the box, make sure you use vulnerability scanners to check for issues with your code, and solve them before merging your changes to the main branch.
  • Policy as code – Implement policy as code to ensure your code respects your organization’s guidelines.
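For the module and testing practices above, a small helper can run tofu test in every module that ships test files. This is a sketch under assumptions: the modules directory layout and the function name are hypothetical, test files are expected either in the module root or in its tests subdirectory, and paths must not contain spaces.

```shell
# Run `tofu test` in each module directory that contains *.tftest.hcl files.
# test_all_modules and the default "modules" root are illustrative assumptions.
test_all_modules() {
  local root="${1:-modules}" failed=0 dir

  # Find test files, map them to their module directory, and de-duplicate.
  for dir in $(find "$root" -name '*.tftest.hcl' -exec dirname {} \; \
                 | sed 's#/tests$##' | sort -u); do
    echo "== $dir"
    # Subshell keeps the caller's working directory unchanged.
    ( cd "$dir" \
        && tofu init -backend=false -input=false > /dev/null \
        && tofu test ) || failed=1
  done

  return "$failed"
}

# Usage:
# test_all_modules modules || exit 1
```

Running this before tagging a new module version gives consumers some assurance that a semantic version bump was actually tested.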

Key points

OpenTofu’s community-driven development ensures continued innovation in scaling capabilities. 

Building infrastructure at scale requires the right tools and patterns for your organization’s specific needs. OpenTofu provides the foundation, but your scaling strategy determines long-term success. 

Remember that scaling isn’t just about technology; it’s about enabling your team to deliver infrastructure reliably, securely, and efficiently as your organization grows.

If you want to learn more about how Spacelift can help you manage OpenTofu at scale, book a demo with one of our engineers.
