
Introduction to Open Policy Agent (OPA) Rego Language

Policy as Code has been a very hot topic recently. It allows you to codify your rules and decision-making to execute them in an automated way. This lets products expose a programmatic, composable language for policies instead of having to build very complex purpose-specific UIs with all options available. One of the most – if not the most – popular Policy as Code engines is Open Policy Agent, used in many projects, like Kubernetes and Envoy, and also extensively used in Spacelift. It’s open source and uses the Rego language for policy authoring.

Rego, however, is a language that works very differently than most and can be quite unintuitive at first glance. It’s actually more similar to SQL than to common imperative languages like Python. This means that the learning curve can be quite steep. Moreover, copy-paste development will very often not help you understand Rego better or author more complicated policies.

This is precisely why I wrote this article. It’s meant to guide you through some of the fundamental constructs and mechanisms of Rego, especially those that we’ve seen used a lot in the wild, so that you can get a better intuitive understanding of how it all fits together and how to author larger, more advanced policies. This is not aiming to be a complete (or even sizeable) reference, nor is it a Spacelift-specific guide. If you want to follow along and play with the examples, the Open Policy Agent playground is the best place to do so.

Decisions

When talking about Rego, I think it’s best to start with decisions. Decisions in Rego are used as the output of policies but also as temporary variables all over the place. They don’t have to be true/false either – which is one of the common misconceptions! Decisions can be strings, arrays, objects, sets, etc. You can have as many decisions in your policy as you want.

For example, you can have a simple boolean decision with a constant,

allow = true

or reference a different variable.

allow := is_admin

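So far, this is basically normal variable (decision) assignment.

Rego also allows you to use block notation for assignments. The examples in this article use the classic syntax; OPA 1.0 and later require the if keyword before the block (allow if { ... }) and the contains keyword for the set rules shown later. In the classic form, it looks like this: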

allow {
    accessible_by_admin := startswith(path, "/admin/")
    accessible_by_admin
    is_admin
}

where each line can be an assignment or true/false expression. The result of the whole block will be true only if the expression in each line evaluates to true, letting you build complex rules that use many other decisions and variables inside of them. We can check a bunch of predicates, and only if all checks succeed does the decision come out to be true.

You can also have a decision that’s an OR of two other decisions, like here:

is_admin {
    input.user.admin
}

is_endpoint_public {
    startswith(path, "/public/")
}

allow {
    is_admin
}
allow {
    is_endpoint_public
}

As you can see, a block for a decision can be repeated multiple times, and the decision will be true if any of the blocks evaluate to true.
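To make the AND-within-a-block and OR-across-blocks behaviour concrete, here is a minimal, self-contained sketch. It assumes the request path arrives as input.path (the snippets above use a bare path) and it uses import rego.v1, so each block is introduced with if; both choices are assumptions made for illustration.

```rego
package example.authz

import rego.v1 # under this import, rule bodies are introduced with `if`

# True only when the input marks the user as an admin.
is_admin if {
    input.user.admin
}

# True only when the (assumed) input.path field starts with /public/.
is_endpoint_public if {
    startswith(input.path, "/public/")
}

# Two blocks for the same decision: allow is true if either body succeeds.
allow if {
    is_admin
}

allow if {
    is_endpoint_public
}
```

With the input {"user": {"admin": false}, "path": "/public/docs"}, allow is true through the second block. With the same user and the path "/admin/users", neither block succeeds and allow is undefined.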

Some

The above block is very simple and linear, but we can also do something slightly more complicated. Let’s say we have a request path and an array allowed_path_prefixes and want to check if the path matches any prefix. In that case, we can specify an additional variable for the index:

allow {
    some i
    startswith(request.path, allowed_path_prefixes[i])
}

Now, the way this will intuitively work is that Rego will choose an i, move to the second line, and if it fails, go back, try another i, and so on until it finds an i that leads to all lines being true. If no such i is found, then allow will not be set at all: it is undefined, rather than false or null.

In human terms, you can read this as “set allow to true if, for some value of i, the request path starts with the i’th allowed path prefix,” or, in even more human terms, “set allow to true if the request path starts with any of the allowed path prefixes.”

If you have more such variables, you can add them separated by commas next to the i:

some i, j, k, l, m, n, o, p

Additionally, if this i is only needed in a single place, like in the above example, you can substitute it with an underscore instead:

allow {
    startswith(request.path, allowed_path_prefixes[_])
}
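As a rough, runnable version of the two forms above, the sketch below puts them side by side. The prefix list and the input.path field are hypothetical values chosen for illustration, and the module uses import rego.v1, so bodies are introduced with if.

```rego
package example.prefixes

import rego.v1 # under this import, rule bodies are introduced with `if`

# Hypothetical data; a real policy might read this from `data` or `input`.
allowed_path_prefixes := ["/public/", "/static/"]

# Explicit index variable: succeeds if some i makes the second line true.
allow_indexed if {
    some i
    startswith(input.path, allowed_path_prefixes[i])
}

# The same check with the anonymous variable instead of a named index.
allow_underscore if {
    startswith(input.path, allowed_path_prefixes[_])
}
```

For the input {"path": "/static/app.js"}, both rules are true because the second prefix matches. For {"path": "/admin/users"}, no index works and both rules are undefined.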

We can also take a look at an example that’s closer to Spacelift, where we might want to decide if a commit is worth a Terraform execution or whether it should just be ignored. We get a list of files changed by the commit, and we also have a couple of paths that are of interest to Terraform. Here we’ll check if any of the paths changed starts with one of the interesting paths:

interesting_path_prefixes := [
  "/src/terraform",
  "/src/modules"
]
track {
    startswith(input.affected_files[_], interesting_path_prefixes[_])
}

We’re using two underscores in a single line to mean “is there any pair of (affected file, interesting path prefix) such that the file starts with the prefix.”
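One way to see that each underscore is an independent variable is to spell the pair out with named indexes. The sketch below is an illustration under assumptions: track_explicit is a hypothetical name, and the module uses import rego.v1, so bodies are introduced with if.

```rego
package example.track

import rego.v1 # under this import, rule bodies are introduced with `if`

interesting_path_prefixes := ["/src/terraform", "/src/modules"]

# Each underscore iterates independently of the other.
track if {
    startswith(input.affected_files[_], interesting_path_prefixes[_])
}

# Equivalent form with named indexes: i walks the files, j walks the prefixes.
track_explicit if {
    some i, j
    startswith(input.affected_files[i], interesting_path_prefixes[j])
}
```

Given {"affected_files": ["/README.md", "/src/modules/vpc/main.tf"]}, both rules are true, because the second file paired with the second prefix satisfies the check. If only "/README.md" changed, both are undefined.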

But wait, there’s no allow rule here? Is that a valid policy? Yes, it is! Policies can have arbitrary sets of decisions, with those decisions being of arbitrary types. Allow-based policies are just one of the most obvious use cases; the power of the Rego language extends much further and can be used for all kinds of decisions, as exemplified by the rich selection of policies available in Spacelift.

Sets

Decisions can also be specified as sets:

allowed_users := ["papaya", "potato"]
allow["papaya"] {
    "papaya" == allowed_users[_]
}

which means “Papaya should be in the allow set if it belongs to the allowed_users list.”

Moreover, with this block notation, you can specify the element as a variable reference from the block itself:

allow[user] {
    user := input.user
    user == allowed_users[_]
}

Using just a single block, you can even specify multiple elements of the set, by having multiple evaluation paths that successfully reach the end of the block and each path having a different user variable.

allow[user] {
    user := input.users[_]
    user == allowed_users[_]
}
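To see several elements come out of a single block, here is a small sketch with a concrete input. It assumes input.users is a list of user names and uses import rego.v1, under which the set rule is written with contains and if rather than square brackets.

```rego
package example.sets

import rego.v1 # under this import, set rules use `contains` and bodies use `if`

allowed_users := ["papaya", "potato"]

# Every (input user, allowed user) pair that matches adds one element to the set.
allow contains user if {
    user := input.users[_]
    user == allowed_users[_]
}
```

With the input {"users": ["papaya", "tomato", "potato"]}, allow evaluates to the set {"papaya", "potato"}. With {"users": ["tomato"]}, it is the empty set rather than undefined.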

The value in the square brackets can actually be an arbitrary expression referencing the block’s variables, so if we have a policy whose decisions are warnings based on resources changed, we could do the following:

forbidden := {"expensive_resource", "expensive_resource2"}
warn[sprintf("You shall not use %s", [resource_name])] {
    resource_name := input.resources_changed[_].name
    forbidden[resource_name]
}

which checks for forbidden resources and displays a pretty warning message if one of them is changed. In this example, you can also see the usage of a set containment check in the last line of the block.
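An equivalent way to write this, sketched below, builds the message inside the block and names it in the rule head. The variable msg is a hypothetical name, and the module uses import rego.v1, so the rule is written with contains and if. Note that sprintf takes its values as an array, even when there is only one.

```rego
package example.warn

import rego.v1 # under this import, set rules use `contains` and bodies use `if`

forbidden := {"expensive_resource", "expensive_resource2"}

warn contains msg if {
    resource_name := input.resources_changed[_].name
    forbidden[resource_name] # set containment check
    msg := sprintf("You shall not use %s", [resource_name])
}
```

For the input {"resources_changed": [{"name": "expensive_resource"}, {"name": "cheap_resource"}]}, warn contains the single message "You shall not use expensive_resource".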

Functions

A more advanced feature of Rego is that it lets you define custom functions you can use as helpers in your policy. Writing functions is very similar to writing block decisions, but with some minor differences:

plus_custom(a, b) := c {
    c := a + b
}
out := plus_custom(42, 43)

You can see we’re specifying a list of arguments, the variable that should be used as the output, and then have a normal block body. The output of the function will be c as long as the function successfully reaches the end of its body.

However, instead of that output variable, we could also, again, have an arbitrary expression, a constant, for instance:

bucket_is_secure(bucket) := true {
    not bucket.public
    bucket.encrypted
}
bucket := {"public": false, "encrypted": false}
out := bucket_is_secure(bucket)

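In this case, the function will evaluate to true if the bucket is not public and is encrypted. The bucket above is not encrypted, so the function body fails and out ends up undefined rather than false.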
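The following sketch puts both functions in one module and calls bucket_is_secure on two buckets, to show the difference between a true result and an undefined one. The names secure_bucket, insecure_bucket, secure_result and insecure_result are hypothetical, and the module uses import rego.v1, so function bodies are introduced with if.

```rego
package example.functions

import rego.v1 # under this import, rule and function bodies are introduced with `if`

plus_custom(a, b) := c if {
    c := a + b
}

bucket_is_secure(bucket) := true if {
    not bucket.public
    bucket.encrypted
}

# Hypothetical sample data for the two cases.
secure_bucket := {"public": false, "encrypted": true}

insecure_bucket := {"public": false, "encrypted": false}

out := plus_custom(42, 43) # 85

secure_result := bucket_is_secure(secure_bucket) # true

# The function body fails for this bucket, so insecure_result is undefined.
insecure_result := bucket_is_secure(insecure_bucket)
```

Because insecure_result is undefined, it simply does not appear in the evaluated output; a rule that needs to react to that case can test it with not insecure_result.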

Summary

All of this has just been a quick overview of the parts of Rego we’ve seen most commonly used. With these building blocks, you should be much better equipped to begin authoring your own policies, whichever project you’re using them with. If you want to learn more, the whole Rego policy language is much bigger, and the best place to dive in is the Official Open Policy Agent Policy Language documentation.

If you think Rego is cool and would like to play with a product where it’s put front-and-center, we’d love you to take Spacelift for a spin!
