Open Policy Agent is an open-source engine that provides a way of declaratively writing policies as code and then using those policies as part of a decision-making process. It uses a policy language called Rego, allowing you to write policies for different services using the same language.
OPA can be used for a number of purposes, including:
- Authorization of REST API endpoints.
- Allowing or denying Terraform changes based on compliance or safety rules.
- Integrating custom authorization logic into applications.
- Implementing Kubernetes Admission Controllers to validate API requests.
OPA was originally created by Styra, and is now part of the Cloud Native Computing Foundation (CNCF), alongside other CNCF technologies like Kubernetes and Prometheus.
In this article, I will give an overview of Open Policy Agent, explain why you would want to use it, and show how you can use OPA with your Spacelift account. Although OPA can serve many purposes, I’m going to focus on how it can be used alongside Infrastructure as Code.
OPA accepts a policy, input and query, and based on that, generates a response. The input can be any valid JSON document, allowing OPA to integrate with any tool that produces JSON output.
We can visualize how OPA works using the following diagram:
In the following sections, I’ll go into more specific examples of using OPA that should help make things clearer, but before I do, here’s a quick list of reasons why I’m interested in using OPA:
- Policies as code allow you to follow your standard development lifecycle with PRs, CI, etc., and provide you with a history of changes to your policies.
- OPA is designed to work with any kind of JSON input, meaning it can easily integrate with any tool that produces JSON output.
- Because OPA integrates with a number of different tools, it allows you to use a standard policy language across many parts of your system, rather than relying on multiple vendor-specific technologies.
- OPA supports unit testing, making it easier and faster to iterate on your policies with confidence that they won’t break.
Let’s try to make that a bit less theoretical by using a specific example: Terraform. Terraform can produce a plan in JSON format via the terraform show command. This means that we can define policies for our infrastructure, and use OPA to make a decision about whether a plan is safe to apply or not:
For example, say we have the following Terraform definition to create an EC2 instance:
provider "aws" {
  region = "eu-central-1"
}
resource "aws_instance" "web" {
  ami           = "ami-00003c1d"
  instance_type = "t3.micro"
}
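Now, say we want to ensure that our Terraform resources have a Name tag. We could enforce that by creating a file called plan.rego with the following content. Note that, as written, the rule succeeds as soon as any one resource change carries a Name tag, which is enough for our single-resource example: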
package spacelift
allow {
  resource_change := input.resource_changes[_]
  resource_change.change.after.tags["Name"]
}
In order to use OPA to evaluate our policy, we need to take the following steps:
- Generate a Terraform plan as JSON.
- Run opa eval to verify whether that plan passes our policy or not.
Step 1 – Generate our Terraform plan as JSON
To get a JSON representation of our plan, we need to output our plan to a file, and then use the terraform show command to output that plan as JSON:
terraform plan -out spacelift.plan
terraform show -json spacelift.plan > spacelift.json
Step 2 – Run opa eval
We can then use opa eval to evaluate our plan against our policy:
$ opa eval --data plan.rego --input spacelift.json "data.spacelift.allow"
{}
As you can see, we’re using the query data.spacelift.allow because our policy is loaded as a data file, and we defined our allow rule in the spacelift namespace. You can also see that opa eval produced an empty output ({}). This means that our allow rule didn’t evaluate to true, and so produced no output.
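To make the rule's logic a bit more concrete, here is a rough Python equivalent of what allow checks. This is only an illustration of the logic, not how OPA evaluates Rego, and the sample plan fragments are cut down to the fields the rule reads (I'm assuming tags comes through as null when no tags are set, as in the test input later in this post):

```python
def allow(plan):
    """Rough Python equivalent of the `allow` rule in plan.rego."""
    for resource_change in plan.get("resource_changes", []):
        # `after` and `tags` may be null in the plan, so fall back to {}.
        after = resource_change.get("change", {}).get("after") or {}
        tags = after.get("tags") or {}
        # In Rego, a missing key is undefined and `false` fails the rule.
        if "Name" in tags and tags["Name"] is not False:
            return True
    return False


# Hypothetical, trimmed-down plan fragments for illustration only.
untagged_plan = {
    "resource_changes": [
        {"address": "aws_instance.web", "change": {"after": {"tags": None}}}
    ]
}
tagged_plan = {
    "resource_changes": [
        {"address": "aws_instance.web",
         "change": {"after": {"tags": {"Name": "my-instance"}}}}
    ]
}

print(allow(untagged_plan))  # False
print(allow(tagged_plan))    # True
```

The first case corresponds to the empty {} result above: nothing in the plan satisfies the rule, so OPA has no value to report for allow.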
Let’s adjust our Terraform definition to include a Name tag:
resource "aws_instance" "web" {
  ami           = "ami-00003c1d"
  instance_type = "t3.micro"
  tags = {
    Name = "my-instance"
  }
}
Now if we generate our plan and evaluate the policy again, we should get a slightly different output:
$ terraform plan -out spacelift.plan && \
terraform show -json spacelift.plan > spacelift.json
... lots of Terraform output
$ opa eval --data plan.rego --input spacelift.json "data.spacelift.allow"
{
  "result": [
    {
      "expressions": [
        {
          "value": true,
          "text": "data.spacelift.allow",
          "location": {
            "row": 1,
            "col": 1
          }
        }
      ]
    }
  ]
}
This tells us that the allow rule has evaluated to true. We can get that in a slightly more concise way by using the pretty format option:
$ opa eval --data plan.rego --input spacelift.json --format pretty "data.spacelift.allow"
true
At this stage, you can probably imagine how you could integrate this into your CI/CD pipeline to enforce naming schemes, security rules, and various other organizational policies.
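As a minimal sketch of what such a pipeline step could look like, the Python below runs the same opa eval command shown above and turns its JSON output into a pass/fail decision. The function names are hypothetical, and I'm assuming opa is on the PATH and that plan.rego and spacelift.json sit in the working directory:

```python
import json
import subprocess


def decision_from_opa_output(raw):
    """Return True only if the opa eval JSON output contains a true value."""
    doc = json.loads(raw)
    # An undefined result is printed as {}, so "result" may be missing.
    for result in doc.get("result", []):
        for expression in result.get("expressions", []):
            if expression.get("value") is True:
                return True
    return False


def run_policy_check(policy="plan.rego", plan="spacelift.json"):
    """Run opa eval against the plan and return the allow decision."""
    completed = subprocess.run(
        ["opa", "eval", "--data", policy, "--input", plan,
         "data.spacelift.allow"],
        capture_output=True,
        text=True,
        check=True,  # raise if opa itself exits with an error
    )
    return decision_from_opa_output(completed.stdout)


# In a CI job you might finish with something like:
#   sys.exit(0 if run_policy_check() else 1)
```

Keeping the output parsing in its own function means it can be tested without OPA or Terraform installed.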
Note: New versions of Terraform are released under the BUSL license, but versions up to and including 1.5.x remain open-source. OpenTofu is an open-source fork of Terraform, created from Terraform version 1.5.6, that expands on Terraform’s existing concepts and offerings. It is a viable alternative to HashiCorp’s Terraform, retaining the features and functionality that made Terraform popular among developers while also introducing improvements and enhancements. OpenTofu is the future of the Terraform ecosystem, and having a truly open-source project to support all your IaC needs is the main priority.
The following examples illustrate some possible use-cases for OPA with Terraform. All of the examples have been taken from the Spacelift plan policy documentation, but have been altered to work with plain vanilla Terraform.
Example 1. Require human review when resources are deleted or updated
Adding new resources is usually a fairly safe operation, but updating or deleting existing resources can carry more risk. You could flag these for human review by using the following policy:
package spacelift
warn[sprintf(message, [action, resource.address])] {
  message := "action '%s' requires human review (%s)"
  review := {"update", "delete"}
  resource := input.resource_changes[_]
  action := resource.change.actions[_]
  review[action]
}
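If Rego's iteration syntax is new to you, the following Python sketch mirrors what this rule computes: one message for every (resource, action) pair where the action is in the review set. The function name and the sample plan are made up for illustration:

```python
REVIEW_ACTIONS = {"update", "delete"}


def review_warnings(plan):
    """Collect one warning per reviewed action on each resource change."""
    warnings = set()
    for resource in plan.get("resource_changes", []):
        for action in resource["change"]["actions"]:
            if action in REVIEW_ACTIONS:
                warnings.add(
                    f"action '{action}' requires human review "
                    f"({resource['address']})"
                )
    return warnings


# Hypothetical plan fragment: one create, one update, one replacement.
sample_plan = {
    "resource_changes": [
        {"address": "aws_instance.a", "change": {"actions": ["create"]}},
        {"address": "aws_instance.b", "change": {"actions": ["update"]}},
        {"address": "aws_instance.c",
         "change": {"actions": ["delete", "create"]}},
    ]
}

for message in sorted(review_warnings(sample_plan)):
    print(message)
```

A replacement shows up in the plan as a delete followed by a create, so it is flagged through its delete action.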
Example 2. Require commits to be reasonably sized
Once a PR goes over a certain size it becomes difficult to review without missing things. The same is true for Terraform plans. The following policy can be used to warn when the number of changes goes over a certain threshold:
package spacelift
warn[msg] {
  msg := too_many_changes[_]
}
too_many_changes[msg] {
  threshold := 50
  res := input.resource_changes
  ret := count([r | r := res[_]; r.change.actions != ["no-op"]])
  msg := sprintf("more than %d changes (%d)", [threshold, ret])
  ret > threshold
}
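The counting logic can be sketched in Python as follows. Note that the policy compares the whole actions list against ["no-op"], so only resources whose sole action is no-op are left out of the count. The function name is hypothetical:

```python
def too_many_changes(plan, threshold=50):
    """Return a warning when the number of real changes exceeds threshold."""
    changed = [
        r for r in plan.get("resource_changes", [])
        if r["change"]["actions"] != ["no-op"]
    ]
    if len(changed) > threshold:
        return [f"more than {threshold} changes ({len(changed)})"]
    return []


# 51 creates plus 10 unchanged resources: only the creates are counted.
big_plan = {
    "resource_changes": (
        [{"change": {"actions": ["create"]}} for _ in range(51)]
        + [{"change": {"actions": ["no-op"]}} for _ in range(10)]
    )
}
print(too_many_changes(big_plan))  # ['more than 50 changes (51)']
```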
Example 3. Blast radius
The following policy attempts to determine the risk of a particular plan by assigning different weightings to the change type (create, update, delete), along with the affected resource type (ECS cluster, EC2 instance, etc.). It takes the approach that an update or delete is riskier than a create because it affects an existing resource:
package spacelift
warn[msg] { msg := blast_radius_too_high[_] }
blast_radius_too_high[sprintf("change blast radius too high (%d/100)", [blast_radius])] {
  blast_radius := sum([blast |
    resource := input.resource_changes[_];
    blast := blast_radius_for_resource(resource)])
  blast_radius > 100
}
blast_radius_for_resource(resource) = ret {
  blasts_radii_by_action := { "delete": 10, "update": 5, "create": 1, "no-op": 0 }
  ret := sum([value | action := resource.change.actions[_]
    action_impact := blasts_radii_by_action[action]
    type_impact := blast_radius_for_type(resource.type)
    value := action_impact * type_impact])
}
# Let's give some types of resources special blast multipliers.
blasts_radii_by_type := { "aws_ecs_cluster": 20, "aws_ecs_user": 10, "aws_ecs_role": 5 }
# By default, blast radius has a value of 1.
blast_radius_for_type(type) = 1 {
  not blasts_radii_by_type[type]
}
blast_radius_for_type(type) = ret {
  blasts_radii_by_type[type] = ret
}
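The arithmetic is easier to follow with a worked example. The Python sketch below reuses the weights from the policy above and is only meant to illustrate the calculation; the function names and the sample plan are my own. Each resource contributes the sum of its action weights multiplied by its type weight, and a warning is raised when the total goes above 100:

```python
BLAST_BY_ACTION = {"delete": 10, "update": 5, "create": 1, "no-op": 0}
BLAST_BY_TYPE = {"aws_ecs_cluster": 20, "aws_ecs_user": 10, "aws_ecs_role": 5}


def blast_radius_for_resource(resource):
    """Sum of (action weight * type weight) over the resource's actions."""
    type_impact = BLAST_BY_TYPE.get(resource["type"], 1)  # default is 1
    return sum(
        BLAST_BY_ACTION[action] * type_impact
        for action in resource["change"]["actions"]
        if action in BLAST_BY_ACTION  # unknown actions contribute nothing
    )


def blast_radius_warnings(plan):
    """Warn when the total blast radius of the plan exceeds 100."""
    total = sum(
        blast_radius_for_resource(r)
        for r in plan.get("resource_changes", [])
    )
    if total > 100:
        return [f"change blast radius too high ({total}/100)"]
    return []


example_plan = {
    "resource_changes": [
        # update (5) * aws_ecs_cluster (20) = 100
        {"type": "aws_ecs_cluster", "change": {"actions": ["update"]}},
        # create (1) * default type weight (1) = 1
        {"type": "aws_instance", "change": {"actions": ["create"]}},
    ]
}
print(blast_radius_warnings(example_plan))
# ['change blast radius too high (101/100)']
```

Updating the ECS cluster on its own scores exactly 100, which does not trigger the warning because the policy uses a strict greater-than comparison.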
How can we make sure that our policies work as we expect them to and that they don’t break over time as we make changes to them? You guessed it: unit testing! Luckily for us, OPA has first-class support for testing via the opa test command.
To create tests for our policy, all we need to do is create another Rego file with a series of rules prefixed with test_. Each rule starting with that prefix defines a separate test.
Let’s go ahead and create a file called plan_test.rego, with the following contents:
package spacelift
test_allow_missing_name_tag {
  not allow with input as {
    "resource_changes": [
      {
        "change": {
          "after": {
            "tags": null,
          },
        }
      }
    ]
  }
}
As you can see, OPA makes it really easy to specify the policy input using the with <variable> as <value> syntax. This allows us to create very concise tests by only including values we care about in the policy input, rather than having to use the entire plan output.
Let’s go ahead and run opa test:
$ opa test .
data.spacelift.test_allow_missing_name_tag: PASS (188.503µs)
--------------------------------------------------------------------------------
PASS: 1/1
Unsurprisingly, our test passes. That’s not very exciting, so let’s add a new test to ensure our resources include an Environment tag:
test_allow_missing_environment_tag {
  not allow with input as {
    "resource_changes": [
      {
        "change": {
          "after": {
            "tags": { "Name": "my-instance" },
          },
        }
      }
    ]
  }
}
Running opa test again shows us a failure:
$ opa test .
data.spacelift.test_allow_missing_environment_tag: FAIL (118.078µs)
--------------------------------------------------------------------------------
PASS: 1/2
FAIL: 1/2
We can fix this by updating our policy:
allow {
  resource_change := input.resource_changes[_]
  resource_change.change.after.tags["Name"]
  resource_change.change.after.tags["Environment"]
}
At this stage, if you run the test command again, it should show 2 passes:
$ opa test .
PASS: 2/2
If you want to know more about OPA testing, the official docs are full of great examples and information about what you can do.
Read more about writing policies in Rego: OPA Rego Examples & Tutorial (Introduction to Open Policy Agent Rego Language)
At this stage, hopefully, you’ve got a pretty good idea of what OPA is as well as how it can be useful to you. It’s not too difficult to see how you could start integrating OPA into your development process or even use it as part of production systems.
Luckily for you, at Spacelift we’ve done all the heavy lifting, allowing you to reap the benefits of using OPA for Policy-as-Code without you having to implement everything from scratch yourself. In this section, I want to showcase some of the functionality that Spacelift provides as it relates to OPA.
1) Policy Types
Spacelift allows you to use OPA policies to manage various aspects of your Spacelift account, and not just during planning. For example, you can use policies to control who can log into your account, along with what they have access to. For more information, see Policy as Code.
2) PR Checks
Spacelift can automatically trigger planning runs whenever you push changes to your VCS provider. For example, here’s what you might see in GitHub after creating a PR:
If all goes well your check will succeed, but if a policy is violated, a failed check will be reported. In our case, everything went smoothly:
You have the option to see everything that will be added/changed/destroyed directly from your VCS provider, or you can go to your Spacelift account and see everything in detail:
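Now, let’s attach a policy that verifies that our instance has a t3.micro shape. Note that this particular policy reads Pulumi steps from the input (input.pulumi.steps) rather than Terraform resource changes. Here is the content of the policy: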
package spacelift
deny[sprintf(message, [instance, verified_type, instanceType])] {
  verified_type := "t3.micro"
  message := "Instance %s doesn't have a %s shape (%s)"
  resource := input.pulumi.steps[_]
  instanceType := resource.new.inputs.instanceType
  instance := resource.new.urn
  instanceType != verified_type
}
sample := true
Now, if we retrigger a run, we can easily see that it will fail, because our instance has a t2.micro shape:
3) Manual Approvals
Spacelift plan policies use a slightly different format from the one we used in our example policies earlier in this post. Not only do they allow you to specify a message to be displayed, but they also have the concept of deny and warn:
package spacelift
deny["you shall not pass"] {
  true
}
warn["hey, you look suspicious"] {
  true
}
The deny rule fails the run completely, while the warn rule just displays a warning in the logs.
warn rules take on another role when a run is going to deploy changes (vs just showing the planned changes against a PR). If any warnings are reported during a deployment, the run will wait for manual approval before applying any changes.
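The behaviour described above can be summarized as a small decision function. This is just my own sketch of the rules as stated in this post, not Spacelift's implementation, and the outcome labels are hypothetical names rather than real run states:

```python
def run_outcome(deny, warn, is_deployment):
    """Map plan policy results to an outcome, as described in this post.

    deny and warn are the lists of messages produced by the policy.
    The returned labels are illustrative, not Spacelift run states.
    """
    if deny:
        return "failed"  # any deny message fails the run completely
    if warn and is_deployment:
        return "awaiting_approval"  # warnings pause a deployment
    return "passed"  # warnings on a PR run only show up in the logs


print(run_outcome(["you shall not pass"], [], is_deployment=False))
# failed
print(run_outcome([], ["hey, you look suspicious"], is_deployment=False))
# passed
print(run_outcome([], ["hey, you look suspicious"], is_deployment=True))
# awaiting_approval
```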
Let’s use the following example policy, designed to warn if an EC2 instance doesn’t have the shape we expect. To do this, we only need to make a small change to the policy above (replace deny with warn):
package spacelift
warn[sprintf(message, [instance, verified_type, instanceType])] {
  verified_type := "t3.micro"
  message := "Instance %s doesn't have a %s shape (%s)"
  resource := input.pulumi.steps[_]
  instanceType := resource.new.inputs.instanceType
  instance := resource.new.urn
  instanceType != verified_type
}
sample := true
If we then attempt to add resources that don’t have that particular instance type, Spacelift will block before applying the changes and give us the chance to manually approve or deny the run:
This allows you to build complex workflows where certain changes are completely blocked, while others are allowed as long as the changes are reviewed first.
Check out our policy library (available inside your Spacelift account → Policies → Account Templates) for more ideas about what policy templates are available, import them, and use them inside your workflow.
Spacelift provides a Terraform provider for managing your Spacelift account. This means that you can manage the policies available within your account, along with the Stacks they apply to, in code.
For example, to add the policy we defined earlier to Spacelift we can use the spacelift_policy resource like this:
resource "spacelift_policy" "plan" {
  type = "PLAN"
  name = "Plan Policy"
  body = file("${path.module}/plan.rego")
}
We can then attach this policy to a Spacelift Stack using the spacelift_policy_attachment resource:
resource "spacelift_policy_attachment" "mystack-plan" {
  policy_id = spacelift_policy.plan.id
  stack_id  = spacelift_stack.mystack.id
}
Taking this a step further, we could create a custom module for defining Spacelift Stacks that ensured that all Stacks had a certain set of policies attached by default.
There are a number of exciting tools available that can help integrate OPA with your systems, for example Styra DAS. Styra provides a way of writing policies and deploying them across your infrastructure, as well as providing tools for unit testing, monitoring policy usage and more. You can try OPA at scale for free using Styra DAS Free.
I hope you’ve enjoyed this post, and can see the value that OPA brings to the table. If you’re interested in trying out Spacelift to see what we have to offer, why not sign up for a free trial? You can set up a Spacelift account in minutes, and get started on your Policy as Code journey today!