The archive_file data source in Terraform creates and manages archive files dynamically within infrastructure as code. It automates packaging files into ZIP or TAR formats, making it easier to integrate with deployments, configuration management, and cloud storage solutions.
In this article, we will show you how to use the archive_file data source to create archives from single or multiple files and address common issues related to this Terraform data source.
The archive_file data source in Terraform creates compressed archive files (e.g., .zip, .tar.gz) from local files or directories. It is commonly used to package application code, configuration files, or other assets for deployment. archive_file is particularly useful in automated deployment workflows where Terraform needs to bundle files before uploading them to cloud storage or deployment services (e.g., AWS S3, Lambda, or Azure Storage).
Here's the basic syntax for an archive_file block:
data "archive_file" "example" {
type = "zip" # Archive format: zip, tar, tgz
source_dir = "path/to/source" # Directory to archive
output_path = "path/to/output.zip" # Output archive path
}
Or, if you need to archive a single file:
data "archive_file" "example_file" {
type = "zip"
source_file = "path/to/file.txt"
output_path = "path/to/output.zip"
}
Where:
- type – Specifies the archive format (zip or tar.gz)
- source_dir – Defines the directory whose contents should be archived
- source_file – Specifies a single file to be archived
- output_path – Determines the location where the archive file will be created
Note: You can use either source_dir or source_file, but not both in a single archive_file data block.
The archive_file data source supports different compression formats and allows you to specify input files, output paths, and file types.
The archive_file data source can be useful for:
- Deploying function code – AWS Lambda deployments often require function code and dependencies to be zipped before uploading.
- Packaging configuration files – This includes Kubernetes manifests, Helm charts, or other configuration files that need to be archived before deployment.
- Bundling multiple files – This is useful when packaging multiple files into a single archive before uploading to cloud storage services like AWS S3, Google Cloud Storage, or Azure Blob Storage.
Learn more: How to utilize Terraform data sources
Here's how to use the Terraform archive_file data source to create a ZIP archive from a single file:
data "archive_file" "example" {
type = "zip"
source_file = "example.txt"
output_path = "example.zip"
}
output "archive_checksum" {
value = data.archive_file.example.output_base64sha256
}
The output_base64sha256 attribute provides a SHA-256 checksum of the generated archive in Base64 encoding. This can be used to verify file integrity and detect changes between Terraform runs. However, Terraform does not automatically track changes inside the source file; to force updates, consider using filemd5() on the source file, as sketched below.
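For example, a minimal sketch (assuming example.txt sits next to the configuration) that hashes the source file itself, so a content change is visible at plan time:
output "source_file_hash" {
  # Hex-encoded MD5 of the source file, recomputed on every plan
  value = filemd5("${path.module}/example.txt")
}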
Example: Creating a ZIP archive for an AWS Lambda function
For this example, let's assume you have a Python script (lambda_function.py) you want to compress and deploy as an AWS Lambda function.
Start by defining the archive_file data source:
data "archive_file" "lambda_zip" {
type = "zip"
source_file = "lambda_function.py"
output_path = "lambda_function.zip"
}
Now, use the archive in an AWS Lambda deployment:
resource "aws_lambda_function" "my_lambda" {
function_name = "my_lambda_function"
role = aws_iam_role.lambda_role.arn
runtime = "python3.8"
handler = "lambda_function.lambda_handler"
filename = data.archive_file.lambda_zip.output_path
source_code_hash = data.archive_file.lambda_zip.output_base64sha256
}
Note: This example only works for a single, dependency-free script. If your Lambda function imports external libraries, you must zip the entire directory (code plus dependencies), not just the script.
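As a hedged sketch of that approach, assuming a hypothetical lambda_src/ directory that contains lambda_function.py together with its vendored dependencies, you could switch to source_dir:
data "archive_file" "lambda_bundle" {
  type        = "zip"
  source_dir  = "${path.module}/lambda_src" # hypothetical folder with code and dependencies
  output_path = "${path.module}/lambda_bundle.zip"
}
You would then point filename and source_code_hash at data.archive_file.lambda_bundle instead of the single-file archive.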
Terraform's archive_file does not support specifying multiple individual files directly: if you want to archive several specific files (but not an entire directory), you cannot list multiple source_file entries in a single block.
Because archive_file only supports either source_file (for a single file) or source_dir (for an entire directory), you need to first gather all the files into a temporary directory. This can be done manually or by using a terraform_data resource with a local-exec provisioner to automate the process.
Here is a workaround using terraform_data:
resource "terraform_data" "prepare_files" {
provisioner "local-exec" {
command = <<EOT
mkdir -p temp_folder
cp ${path.module}/file1.txt temp_folder/
cp ${path.module}/file2.txt temp_folder/
cp ${path.module}/file3.txt temp_folder/
EOT
}
}
data "archive_file" "multiple_files" {
type = "zip"
source_dir = "${path.module}/temp_folder"
output_path = "${path.module}/multiple_files.zip"
depends_on = [terraform_data.prepare_files]
}
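Keep in mind that local-exec provisioners run only when the terraform_data resource is created, so later edits to file1.txt, file2.txt, or file3.txt will not re-run the copy step on their own. A minimal sketch, assuming the same three files, that uses triggers_replace to force re-creation (and a fresh copy) whenever any of them changes:
resource "terraform_data" "prepare_files" {
  # Re-create the resource (and re-run the provisioner) when any source file changes
  triggers_replace = [
    filemd5("${path.module}/file1.txt"),
    filemd5("${path.module}/file2.txt"),
    filemd5("${path.module}/file3.txt"),
  ]

  provisioner "local-exec" {
    command = <<-EOT
      mkdir -p ${path.module}/temp_folder
      cp ${path.module}/file1.txt ${path.module}/temp_folder/
      cp ${path.module}/file2.txt ${path.module}/temp_folder/
      cp ${path.module}/file3.txt ${path.module}/temp_folder/
    EOT
  }
}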
To archive an entire directory, use source_dir:
data "archive_file" "example" {
type = "zip"
source_dir = "${path.module}/my_folder"
output_path = "${path.module}/example.zip"
}
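If some files in the directory should not end up in the archive, the data source also accepts an excludes argument, a set of paths relative to source_dir. A minimal sketch, assuming hypothetical .terraform and secrets.tfvars entries you want to leave out:
data "archive_file" "example_filtered" {
  type        = "zip"
  source_dir  = "${path.module}/my_folder"
  output_path = "${path.module}/example_filtered.zip"

  # Paths are relative to source_dir; these entries are placeholders
  excludes = [
    ".terraform",
    "secrets.tfvars",
  ]
}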
Example: Archiving a directory and uploading it to Azure Storage
In this example, we will create a ZIP archive of a directory and upload it to an Azure Storage Blob.
The Terraform configuration below compresses all files within a directory (my_app_folder) into a ZIP archive:
data "archive_file" "app_package" {
type = "zip"
source_dir = "${path.module}/my_app_folder"
output_path = "${path.module}/app_package.zip"
}
Now, we’ll create an Azure Storage Account and Storage Container:
resource "azurerm_storage_account" "example" {
name = "mystorageacct"
resource_group_name = "my-resource-group"
location = "East US"
account_tier = "Standard"
account_replication_type = "LRS"
}
resource "azurerm_storage_container" "example" {
name = "mycontainer"
storage_account_name = azurerm_storage_account.example.name
container_access_type = "private"
}
To upload the archive:
resource "azurerm_storage_blob" "example" {
name = "app_package.zip"
storage_account_name = azurerm_storage_account.example.name
storage_container_name = azurerm_storage_container.example.name
type = "Block"
source = data.archive_file.app_package.output_path
depends_on = [data.archive_file.app_package]
}
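An optional refinement, sketched here rather than required: the archive_file data source also exposes an output_md5 attribute, which you can feed into the blob's content_md5 argument so Terraform re-uploads the blob whenever the archive contents change. The same blob definition with that one extra argument:
resource "azurerm_storage_blob" "example" {
  name                   = "app_package.zip"
  storage_account_name   = azurerm_storage_account.example.name
  storage_container_name = azurerm_storage_container.example.name
  type                   = "Block"
  source                 = data.archive_file.app_package.output_path

  # A change in archive contents changes this value, prompting a re-upload
  content_md5 = data.archive_file.app_package.output_md5
}
This replaces the blob resource above; the rest of the configuration stays the same.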
Here are some examples of troubleshooting common issues with the archive_file data source in Terraform.
1. Incorrect source path
One of the most common issues when using the archive_file data source is specifying an incorrect file or directory path, which leads to errors like:
Error: failed to read source file: no such file or directory
To fix this:
- Verify the file or directory exists before running Terraform.
- Use an absolute path if necessary.
- If using relative paths, ensure they are correct by referencing ${path.module}.
- Check file permissions to ensure Terraform can read the file.
- Run terraform plan to confirm Terraform correctly identifies the file.
2. Missing required dependencies
The archive_file data source itself creates archives natively and does not call an external archiving tool. However, if you rely on local-exec provisioners or helper scripts that invoke the zip utility, that binary must be installed and available in the system's PATH, or those commands will fail.
If missing, install it via:
- Linux: sudo apt install zip (Debian/Ubuntu)
- macOS: brew install zip
- Windows: ensure zip.exe is in the system PATH.
3. Unchanged archive not triggering updates
If files inside source_dir change, Terraform may not detect updates in some cases. Use filemd5 to track file modifications:
output "archive_hash" {
value = filemd5(data.archive_file.example.output_path)
}
4. Incorrect file permissions
Permission errors may arise if Terraform creates an archive, but deployment tools (e.g., AWS Lambda, Docker containers, or remote servers) fail due to missing execution permissions. This commonly occurs when archiving shell scripts (.sh), executables, or other files that require specific permissions to run in a deployment environment.
When trying to deploy an AWS Lambda function or an executable script, you might encounter an error such as:
Error: permission denied
or
/bin/sh: ./script.sh: Permission denied
This happens because Terraform archives files with their existing permissions, and some deployment tools require explicit execution rights.
To avoid permission issues, modify the file permissions before Terraform creates the archive:
chmod +x my_script.sh
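Alternatively, the archive_file data source offers an output_file_mode argument that applies a single octal mode to every file placed in the archive. A minimal sketch, assuming a hypothetical scripts/ folder of shell scripts that must stay executable:
data "archive_file" "scripts" {
  type        = "zip"
  source_dir  = "${path.module}/scripts" # hypothetical folder of shell scripts
  output_path = "${path.module}/scripts.zip"

  # Octal mode applied to every file inside the archive
  output_file_mode = "0755"
}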
5. Incorrect path resolution inside a module
When using Terraform modules, file paths are resolved relative to the module's directory, not the root module. This can cause issues if the archive_file data source references a path incorrectly.
Error: failed to read source file: no such file or directory
This typically happens when source_dir or source_file references files assuming a root module path instead of the module's own directory.
To fix this issue, always use path.module instead of path.root to ensure Terraform correctly resolves paths relative to the module.
The correct way to define paths inside a module:
data "archive_file" "lambda_package" {
type = "zip"
source_dir = "${path.module}/app" # Ensures correct relative path
output_path = "${path.module}/lambda.zip"
}
The archive_file data source in Terraform helps create ZIP and TAR archives from single or multiple files. In this guide, we covered syntax, usage, and troubleshooting issues such as incorrect paths, permissions, and missing files.
We encourage you to explore how Spacelift makes it easy to work with Terraform. If you need help managing your Terraform infrastructure, building more complex workflows based on Terraform, or managing AWS credentials per run instead of using a static pair on your local machine, Spacelift is a fantastic tool for the job.
To learn more about Spacelift, create a free account today or book a demo with one of our engineers.
Note: New versions of Terraform are placed under the BUSL license, but everything created before version 1.5.x stays open-source. OpenTofu is an open-source version of Terraform that expands on Terraform’s existing concepts and offerings. It is a viable alternative to HashiCorp’s Terraform, being forked from Terraform version 1.5.6.
Manage Terraform better and faster
If you are struggling with Terraform automation and management, check out Spacelift. It helps you manage Terraform state, build more complex workflows, and adds several must-have capabilities for end-to-end infrastructure management.