50+ Top Ansible Interview Questions & Answers for 2025

You can never be overprepared for an interview, especially when it comes to technical roles.

As a popular automation tool, Ansible is often a key topic in DevOps and IT infrastructure interviews. To help you get ready, we’ve compiled a list of the top 55 Ansible interview questions along with their answers. This list cuts across various difficulty levels, from beginner to advanced, ensuring you’re well-prepared for any question that comes your way.

Even if you know the answers to these questions, reviewing them can help reinforce your knowledge and boost your confidence.

We will start with the basics and progress to more complex topics as follows:

  1. Ansible foundations & ecosystem
  2. Inventory, connectivity & configuration
  3. Playbook design
  4. Orchestration & execution control
  5. Performance & observability
  6. Security & secrets
  7. Integrations, governance & delivery
  8. Event-Driven Ansible (EDA)

Ansible foundations & ecosystem

The questions in this section cover the fundamental concepts of Ansible, its architecture, and its ecosystem.

1. What is Ansible, and what are its primary use cases?

TL;DR: Ansible is an open-source automation tool used for configuration management, application deployment, and task automation. It is agentless and communicates over SSH (or WinRM for Windows).

Interviewers often start with this question to gauge your basic understanding of Ansible. Here’s a concise answer to form the foundation of your response:

Ansible is an open-source software tool that helps you manage computers and applications without needing to log into each system manually. It is designed to be simple to use, with a focus on ease of learning and minimal setup.

It operates in an agentless manner, meaning it does not require any software to be installed on the target machines; instead, it communicates over SSH for Linux systems and WinRM for Windows. This makes it practical for managing both kinds of systems with minimal setup.

At its core, Ansible is about consistency and repeatability. Regardless of whether you’re setting up a new server, deploying an application, or patching hundreds of machines, Ansible ensures the process remains the same every single time.

Ansible playbooks are written in YAML, a human-readable and easy-to-understand language, allowing users to define their automation tasks in a clear and structured manner.

As for its primary use cases, you can mention:

  • Configuration management: Keeping servers in a known state (e.g., making sure Nginx is installed and running everywhere)
  • Application deployment: Rolling out apps across environments without manual steps
  • Orchestration: Coordinating complex workflows like zero-downtime updates or multi-tier app rollouts
  • Provisioning: Bootstrapping new infrastructure, often in the cloud

Note: Ansible is open-source, but it also has a commercial version called Ansible Tower (now part of Red Hat Ansible Automation Platform), which provides additional features like a web-based interface, role-based access control, and scheduling.

2. How does Ansible’s approach differ from other configuration management tools?

In this question, the interviewer aims to assess your understanding of what sets Ansible apart from other tools, like Puppet, Chef, or SaltStack. All of them solve the same problem, but they take very different paths to get there.

Here’s a concise answer:

The key difference is that Ansible is agentless. This means you don’t need to install a background service (an “agent”) on every machine you want to manage. Instead, Ansible connects over SSH (for Linux) or WinRM (for Windows) to run tasks directly. This lowers the setup overhead and makes Ansible especially friendly for teams just getting started.

Another difference is simple language. Whereas tools like Chef and Puppet use Ruby-based DSLs (domain-specific languages), Ansible uses YAML format, which is relatively easy to pick up even if you’re not a developer.

Other distinguishing features include declarative configuration, extensibility and modularity, and a strong focus on idempotency.

3. What is Ansible Galaxy, and how does it relate to Collections?

Ansible Galaxy is a community hub for sharing Ansible roles. It allows users to find, download, and share reusable content that can accelerate the development of Ansible playbooks.

Instead of starting from scratch every time you need to automate something, you can find reusable building blocks created by the community or vendors.

Originally, Galaxy was mainly a place to share roles (structured sets of tasks, variables, and handlers for a specific use case). For example, you might find a role for installing MySQL or configuring Nginx.

But as Ansible grew, the ecosystem expanded beyond just roles. That’s where Collections came in.

A Collection is a distribution format introduced in Ansible 2.9 to package and distribute not only roles, but also:

  • Modules (the building blocks of Ansible tasks)
  • Plugins (extend Ansible’s functionality)
  • Playbooks and documentation

This makes Collections much more powerful and organized than roles alone.

Galaxy is the place where you publish, browse, and install content, whereas Collections are the format in which that content is packaged and distributed.

For example, you might install the official community.general Collection from Galaxy to get access to a wide range of community-supported modules and plugins.

4. What are Collections and FQCNs, and how do you pin versions/sign content in requirements.yml?

Collections are a way to package and distribute Ansible content, including roles, modules, and plugins.

FQCN is an abbreviation for Fully Qualified Collection Name. It’s a way to uniquely identify a module or plugin by specifying the collection it belongs to, along with the module name.

The format is:

namespace.collection.module_name

For example, community.general.git refers to the git module from the community.general Collection.

When you depend on Collections, you usually define them in a requirements.yml file. This ensures everyone on your team installs the same version, making automation more predictable.

Now, to pin versions of Collections in this file, you can specify the version number alongside the collection name.

Here is an example:

collections:
  - name: community.general
    version: 7.8.0
  - name: ansible.posix
    version: ">=1.3.0"

From the example above:

  • community.general is pinned to exactly version 7.8.0.
  • ansible.posix accepts any version 1.3.0 or newer.
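
To install everything defined in the file, run:

ansible-galaxy collection install -r requirements.yml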

For added security, Ansible supports signing Collections, so you can trust the source. Publishers can sign their Collections with PGP keys, and consumers can verify signatures before installing.

You’d typically configure this in your ansible.cfg or through the ansible-galaxy CLI. For example:

ansible-galaxy collection verify my_namespace.my_collection 

5. Explain the Ansible variable precedence ladder and common pitfalls

Ansible has a well-defined variable precedence hierarchy that determines which variable value takes precedence when the same variable is defined in multiple places.

For a complete list of all the levels of precedence, refer to the official documentation. However, a simplified order looks like this:

  1. Role defaults have the lowest priority
  2. Inventory variables such as group_vars and host_vars
  3. Playbook vars such as vars, vars_files, and vars_prompt
  4. Task-level vars, including set_fact
  5. Extra-vars (passed with -e) have the highest priority

The full precedence list is much longer (over 20 levels), but for most situations, it boils down to this principle: extra-vars always win, role defaults always lose.

Some common pitfalls to watch out for:

  • Unintended overrides: If you define a variable in multiple places, you can easily forget where the highest precedence definition originates, leading to unexpected behavior.
  • Using role defaults instead of vars: Role defaults are so weak that even inventory variables override them. Sometimes people put important values in defaults/main.yml and wonder why they get overwritten.
  • Forgetting about set_fact: The set_fact module creates host-level variables that persist throughout the entire play. If variable values seem to be “mysteriously” overridden, set_fact might be the cause.
  • Mixing host_vars and group_vars: Host-level variables always take precedence over group-level variables. If you define the same variable in both, only the host’s value is used.
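
To make the ladder concrete, here is a minimal sketch (file, variable, and value names are illustrative):

# group_vars/web.yml (inventory level)
app_port: 8080

# playbook
- hosts: web
  vars:
    app_port: 9090   # play vars beat inventory vars
  tasks:
    - ansible.builtin.debug:
        msg: "Port is {{ app_port }}"

This prints 9090; running it with ansible-playbook site.yml -e app_port=1234 prints 1234, because extra-vars beat everything else.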

Note: During an interview, if you’re asked this, don’t try to recite the entire precedence order (nobody expects that). Instead, explain the principle: “Ansible chooses the variable defined closest to execution. Role defaults are the weakest, and extra-vars are the strongest.”

6. What are Ansible facts and registered variables? How does fact gathering/caching work, and when should you disable it?

In Ansible, facts are pieces of information about the target system that Ansible automatically collects at the beginning of a playbook run. These facts include details like the operating system, IP addresses, memory, CPU, and more. They are gathered using the setup module and stored as variables that you can reference in your playbooks.

A registered variable is a way to capture the result of a task and use it later in your playbook.

For example:

- name: Check if a file exists
  ansible.builtin.stat:
    path: /etc/passwd
  register: file_info

- name: Print the result
  ansible.builtin.debug:
    msg: "File exists: {{ file_info.stat.exists }}"

From the playbook above, file_info holds all the output of the stat module, and you can use its values in conditionals or future tasks.

Because Ansible gathers facts at the start of every play, performance can be somewhat slow, especially in large environments.

You can control this with the gather_facts keyword:

- hosts: all
  gather_facts: no

If you disable it, Ansible won’t run the setup module unless you explicitly call it later.

Fact caching is a performance optimization. Instead of re-gathering facts every time, you can cache them in a backend (such as JSON files, Redis, or Memcached), so that facts persist between playbook runs, speeding things up.

You configure it in ansible.cfg, using something like this:

[defaults]
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible_facts
fact_caching_timeout = 7200

Reasons for disabling fact gathering include:

  • Speed: If your playbooks don’t need facts, disabling saves time.
  • Security: On sensitive systems, you might not want to expose all host details.
  • Control: Sometimes you only want specific facts, so you call the setup module with filters (e.g., setup: filter=ansible_distribution).

For a concise interview answer, you can say:

“Facts are system details that Ansible collects automatically, while registered variables capture task results at runtime. Since fact gathering can slow down large runs, you can disable it when not needed or use caching to speed things up.”

7. Push vs pull: when would you use the ansible-pull command?

Ansible primarily operates in a push model, where the control node (the machine where you run Ansible) pushes configurations and tasks to the managed nodes over SSH or WinRM. This is the most common way to use Ansible and is suitable for most scenarios.

However, in the pull model, which is implemented using the ansible-pull command, the managed nodes pull their configuration from a central repository (like a Git repository) and apply it locally.

You might consider using ansible-pull in the following scenarios:

  • Decentralized environments: If you have a large number of nodes that are not always reachable from the control node, using ansible-pull allows each node to independently fetch and apply its configuration when it can connect to the repository.
  • Self-healing systems: In environments where nodes need to ensure they are always in a desired state, ansible-pull can be set up as a cron job or systemd timer to periodically pull the latest configuration and apply it.
  • Edge devices or IoT: For devices that may not have a stable connection to a central control node, ansible-pull allows them to manage their own configuration.
  • Simplified management: Letting nodes apply their own state removes the need for a central control node to initiate every change.
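
A common pattern is a cron entry on each node; a minimal sketch (the repository URL is illustrative):

# Re-apply configuration every 30 minutes
*/30 * * * * ansible-pull -U https://github.com/example/ansible-config.git local.yml

By default, ansible-pull looks for a playbook named after the host's FQDN, then its hostname, and finally falls back to local.yml in the checked-out repository.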

8. Describe Ansible’s agentless architecture

Ansible operates using an agentless architecture, which means that it does not require any special software or agents to be installed on the target machines (managed nodes) that it controls. Instead, Ansible uses standard protocols such as SSH (for Unix/Linux systems) and WinRM (for Windows systems) to communicate with and manage these nodes.

For more details, refer to the official documentation.

9. What is an Ansible module, and how do you quickly look up its options?

A module in Ansible is a reusable, standalone script that performs a specific task on the managed nodes. Modules are the building blocks of Ansible playbooks and can handle a wide range of tasks, such as installing packages, managing files, configuring services, and more.

Because modules are at the heart of how Ansible works, it is important to be able to quickly check the options a module supports. The simplest way is by using the command line.

Running ansible-doc <module_name> shows you the module’s description, all the parameters it accepts, whether they’re required, and even examples of how to use them. For example, if you want to see how the copy module works, typing ansible-doc copy will pull up its full documentation right in your terminal.

You can also find module details in the official Ansible documentation, which lists every module grouped by collection. This is often more convenient if you prefer browsing or want to compare related modules side by side.

10. What’s the difference between ad-hoc commands and playbooks?

Ad-hoc commands are one-off commands you run directly from the command line to perform quick tasks on your managed nodes. They are useful for simple operations, such as checking connectivity, gathering facts, or making minor changes without writing a full playbook.

For example, you might use an ad-hoc command to ping all your servers:

ansible all -m ping

Playbooks, on the other hand, are YAML files that define a series of tasks to be executed in a specific order. They allow you to automate complex workflows, manage configurations, and deploy applications in a repeatable and structured way. Playbooks can include multiple plays, roles, variables, and conditionals, making them much more powerful than ad-hoc commands.

Inventory, connectivity & configuration

The questions in this section focus on how Ansible connects to and manages the systems in your infrastructure.

11. What are inventories in Ansible? Explain static vs dynamic inventories

An inventory in Ansible is a file or script that defines the hosts and groups of hosts that Ansible will manage. It serves as a source of truth for the systems you want to automate, allowing you to organize them into logical groups for easier management.

Inventories can be either static or dynamic. A static inventory is a simple text file (usually in INI or YAML format) that explicitly lists the hosts and groups. 

On the other hand, a dynamic inventory is generated on-the-fly by a script or plugin that queries an external source (like a cloud provider, database, or configuration management system) to retrieve the list of hosts and their details.

12. How do you define inventories and group hosts (e.g., group_vars, host_vars)?

You can define inventories in a file; the simplest options are hosts.ini or inventory.yml. Inside, you can group machines under labels like [web] or [db], allowing tasks to target them specifically.

Beyond grouping, Ansible lets you attach variables to groups or individual hosts. Files placed under group_vars/ apply to all hosts in a group, whereas files in host_vars/ set variables for a single machine. This is a clean way to organize configurations when you’re dealing with different environments or roles.
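
A minimal sketch of this layout (hostnames and values are illustrative):

# inventory.ini
[web]
web1.example.com
web2.example.com

[db]
db1.example.com

# group_vars/web.yml — applies to every host in [web]
nginx_worker_processes: 4

# host_vars/web1.example.com.yml — applies only to web1
ansible_host: 10.0.0.11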

13. How does Ansible connect to Linux vs. Windows hosts?

For Linux, Ansible uses SSH and relies on Python being present on the target machine. That’s why it feels so natural for Linux admins.

For Windows, it’s slightly different. Ansible uses WinRM, which is Microsoft’s protocol for remote management. The modules written for Windows are usually in PowerShell, but from your side, you still write tasks in YAML like usual. The connection method changes under the hood, but the Ansible experience stays consistent.
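
For Windows hosts, you typically set the connection details as inventory variables; a minimal sketch (values are illustrative):

# group_vars/windows.yml
ansible_connection: winrm
ansible_winrm_transport: ntlm
ansible_port: 5986
ansible_user: admin
ansible_password: "{{ vault_windows_password }}"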

14. What does ansible.cfg control, and where can it live?

ansible.cfg is the central configuration file that controls how Ansible behaves. It can set defaults for inventory paths, SSH settings, privilege escalation, retries, and more.

An interesting aspect is that Ansible looks for ansible.cfg in multiple locations: the current directory, the ANSIBLE_CONFIG environment variable, the user’s home directory (~/.ansible.cfg), and finally in /etc/ansible/ansible.cfg. The first one it finds is the one it uses. This allows both project-specific configs and global defaults to coexist.
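
A typical project-level config might look like this (values are illustrative):

# ansible.cfg
[defaults]
inventory = ./inventory.ini
remote_user = deploy
forks = 20

[privilege_escalation]
become = true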

15. How do you test connectivity and basic setup?

The go-to command is ansible all -m ping. This doesn’t literally ping the server with ICMP. It runs the Ansible “ping” module, which just confirms that it can connect, authenticate, and run Python (or PowerShell for Windows). If that works, you know the basics of your setup are correct. 

16. What are Execution Environments and ansible-navigator, and how do Automation Controller/AWX compare to the CLI?

Execution Environments are container images that package Ansible with the necessary collections, plugins, and dependencies. 

ansible-navigator is the tool that helps you interact with these environments in a smoother way, with a cleaner interface for running playbooks, exploring collections, or debugging.

On the other side, Automation Controller (the commercial version) and AWX (the open-source upstream) provide a web UI, API, and scheduling system on top of the CLI. Instead of just running commands locally, you can centralize automation, manage credentials, and provide teams with a platform for using Ansible.

Playbook design

In this section, we will explore how to design effective playbooks, including structuring tasks, using variables, and implementing best practices. 

17. What is an Ansible playbook?

Written in YAML, a playbook is where you define what you want Ansible to do. It describes which machines to target, what roles they should play, and the exact steps to bring them into the desired state.

A playbook consists of one or more plays, and each play can include multiple tasks. Each task uses a module to perform a specific action, such as installing a package, copying a file, or starting a service.
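
A minimal example with one play and two tasks:

- name: Configure web servers
  hosts: web
  become: true
  tasks:
    - name: Install nginx
      ansible.builtin.package:
        name: nginx
        state: present

    - name: Ensure nginx is running
      ansible.builtin.service:
        name: nginx
        state: started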

18. What does “idempotence” mean in Ansible, and how do you ensure it in tasks?

Idempotence means that running the same playbook multiple times should always lead to the same outcome. If a file is already present or a package is already installed, Ansible will not reapply the changes. This is what makes Ansible safe and predictable.

To ensure idempotence, use modules in the way they’re designed. For example, instead of using a raw command to install a package, use the package module, which checks if the package exists already before acting.
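
The difference in practice:

# Not idempotent: runs the command every time and always reports "changed"
- name: Install nginx the fragile way
  ansible.builtin.command: apt-get install -y nginx

# Idempotent: acts (and reports "changed") only if nginx is absent
- name: Install nginx the idempotent way
  ansible.builtin.package:
    name: nginx
    state: present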

19. What’s the difference between vars, vars_files, and vars_prompt?

All three are methods for defining variables in a playbook. vars lets you declare variables directly in the playbook itself. vars_files allows you to store variables in a separate YAML file and then reference it, keeping playbooks cleaner. vars_prompt asks the user for input when the playbook runs, which is handy for things like passwords or environment-specific values that you don’t want to be hardcoded.
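
All three in one play (names and values are illustrative):

- hosts: app
  vars:
    app_env: staging
  vars_files:
    - vars/common.yml
  vars_prompt:
    - name: admin_password
      prompt: "Enter the admin password"
      private: true
  tasks:
    - ansible.builtin.debug:
        msg: "Deploying to {{ app_env }}"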

20. When would you use a role vs. a simple task file?

A simple task file is fine when you only need a few steps and don’t expect to reuse them much. Roles are preferable when you want to organize things in a standard structure with tasks, variables, handlers, templates, and defaults all grouped together. 

Roles make your playbooks modular and reusable across projects, while task files are best for quick, lightweight automation.

21. What’s the difference between block, rescue, and always?

A block groups tasks together. If any task within the block fails, the rescue section runs, providing a way to handle errors gracefully. After that, the always section runs no matter what, similar to a “finally” in programming. This pattern is useful when you want to try something, have a fallback if it fails, and still perform cleanup in the end.
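
A minimal sketch (the script paths are illustrative):

- name: Attempt an upgrade with a fallback and cleanup
  block:
    - name: Run the upgrade
      ansible.builtin.command: /opt/app/upgrade.sh
  rescue:
    - name: Roll back on failure
      ansible.builtin.command: /opt/app/rollback.sh
  always:
    - name: Remove temporary files
      ansible.builtin.file:
        path: /tmp/app-upgrade
        state: absent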

22. Describe how to use the legacy with_items loop versus loop in newer versions

with_items was the old way to loop over a list in Ansible. It still works, but it is considered legacy. The newer loop keyword is cleaner, more consistent, and supports advanced features like loop control, indexing, and filters. For example, instead of with_items: [1, 2, 3], you’d now write loop: [1, 2, 3].

23. How do handlers and notify work together? Include advanced patterns like listen and ordering

Handlers are special tasks that only run when triggered. A regular task can send a signal to a handler using notify. For example, if you change a configuration file, you may need to notify a handler to restart the service.

Advanced patterns include using listen, so that multiple tasks can trigger the same handler, and carefully controlling the order when handlers depend on each other. This makes automation efficient by ensuring services only restart when needed.
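
For example, within a play, two tasks can notify one handler through a shared listen topic (file names are illustrative):

tasks:
  - name: Deploy nginx config
    ansible.builtin.template:
      src: nginx.conf.j2
      dest: /etc/nginx/nginx.conf
    notify: restart web stack

  - name: Deploy site config
    ansible.builtin.template:
      src: site.conf.j2
      dest: /etc/nginx/conf.d/site.conf
    notify: restart web stack

handlers:
  - name: Restart nginx
    ansible.builtin.service:
      name: nginx
      state: restarted
    listen: restart web stack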

24. Name three common strategies to keep playbooks DRY

DRY stands for “Don’t Repeat Yourself,” and keeping playbooks DRY is crucial for maintainability. The following are three common strategies:

  1. Roles: Encapsulate related tasks, variables, and handlers into reusable roles that can be included in multiple playbooks.
  2. Includes and imports: Use include_tasks or import_tasks to reuse task files across different playbooks.
  3. Variables and defaults: Define common values in group_vars, host_vars, or role defaults to avoid hardcoding the same values multiple times.

25. When do you favor include_tasks over import_tasks?

import_tasks pulls tasks into the playbook at parse time, meaning Ansible is aware of them before execution begins. This is beneficial for tasks that are always needed. 

include_tasks works at runtime, so tasks are only brought in when that part of the play runs, which makes it more flexible for conditionals or loops.
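
Side by side (file names are illustrative):

# Parsed up front, always visible to --list-tasks
- ansible.builtin.import_tasks: baseline.yml

# Evaluated at runtime, so the condition decides whether the file is loaded at all
- ansible.builtin.include_tasks: debian_setup.yml
  when: ansible_facts['os_family'] == 'Debian'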

26. How do you use changed_when and failed_when effectively?

These two let you control how Ansible interprets the result of a task. changed_when can tell Ansible that a task didn’t really make changes, even if the module thought it did. 

failed_when lets you mark a task as failed based on custom logic, even if the command technically succeeded. Both are useful for keeping reports accurate and making playbooks smarter about real-world conditions.
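
A typical pattern (the script path is illustrative):

- name: Check application health
  ansible.builtin.command: /usr/local/bin/healthcheck
  register: health
  changed_when: false                           # a read-only check never changes state
  failed_when: "'UNHEALTHY' in health.stdout"   # fail on output, not the exit code alone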

27. What are callback plugins, and when might you enable one?

Callback plugins enable you to change how Ansible displays output or sends results somewhere else. 

For example, you can use a callback plugin to log results to a file, send them to a chat app, or display cleaner summaries in the terminal. They’re helpful when you need better observability or want results integrated with external tools.

28. How do you template with Jinja2 (filters/tests like default, combine, to_nice_yaml)?

Templating in Ansible uses Jinja2, which lets you insert variables and apply filters within tasks or templates. Filters like default provide fallback values, combine merges dictionaries, and to_nice_yaml renders variables as readable YAML. This makes templates flexible and dynamic, so you can generate configuration files or commands that adapt to your variables and environments.
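
A few of these in action (variable names are illustrative):

# Fall back to 8080 if app_port is undefined
listen_port: "{{ app_port | default(8080) }}"

# Merge an environment-specific dict over a base dict
effective_config: "{{ base_config | combine(env_config) }}"

# Inside a template: render a dict as readable YAML
{{ app_settings | to_nice_yaml }}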

Orchestration & execution control

This section focuses on advanced execution control, orchestration techniques, and best practices for managing complex workflows.

29. How do you write advanced conditionals and loops in Ansible?

Conditionals let you control whether a task runs based on a variable or fact. For example, you can use when to check if a system is Ubuntu before installing a package.

Loops let you repeat a task for each item in a list. Advanced use occurs when you combine them, such as looping over users and only creating them if they don’t exist, or using filters inside the loop to transform the data. This keeps playbooks flexible and avoids hardcoding.
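
For example, combining loop and when (the app_users list is illustrative):

- name: Create application users, but only on Debian-family hosts
  ansible.builtin.user:
    name: "{{ item }}"
    state: present
  loop: "{{ app_users }}"
  when: ansible_facts['os_family'] == 'Debian'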

30. How do you control task execution across hosts?

By default, Ansible runs tasks on all targeted hosts in parallel. You can change this with keywords like serial, which runs tasks in batches, or with conditionals that skip certain tasks for specific hosts. 

You can also use delegate_to to run a task on a specific host even if it’s part of a larger group. These controls enable you to carefully orchestrate how and where tasks get applied.

31. How do you implement rolling updates with serial, and how does max_fail_percentage influence rollout safety?

Rolling updates are implemented with the serial keyword. Instead of updating all hosts at once, you can say serial: 2 to update two at a time. This distributes risk and avoids downtime. The max_fail_percentage setting makes things safer by stopping the rollout if too many hosts fail in a batch. 

Together, these settings strike a balance between speed and stability when updating production systems.
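
A minimal sketch (the package name is illustrative):

- hosts: web
  serial: 2                  # update two hosts per batch
  max_fail_percentage: 25    # abort the rollout if more than 25% of a batch fails
  tasks:
    - name: Update the application package
      ansible.builtin.package:
        name: myapp
        state: latest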

32. How do you run long-running tasks asynchronously using async and poll, and what trade-offs or best practices apply?

For tasks that take a long time, like waiting for a database migration, you can use the async parameter to let them run in the background. The poll value controls how often Ansible checks for completion. Setting poll: 0 tells Ansible to fire and forget, while higher values mean it waits for results.

The trade-off is that asynchronous tasks are more difficult to track, so you need to think about logging and error handling. A best practice is to use them only when blocking execution would slow down everything else.
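
A fire-and-forget task with a later status check (the script path is illustrative):

- name: Start a long migration in the background
  ansible.builtin.command: /opt/app/migrate.sh
  async: 3600          # allow up to an hour
  poll: 0              # don't wait; continue with the play
  register: migration

- name: Wait for the migration to finish
  ansible.builtin.async_status:
    jid: "{{ migration.ansible_job_id }}"
  register: job
  until: job.finished
  retries: 60
  delay: 30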

33. How do you design and use tags end-to-end to maintain tag hygiene in large playbooks?

Tags are labels you can attach to tasks or plays that let you run only the parts of a playbook you care about. For example, you might tag all database tasks as db and run only those when needed.

Good tag hygiene means using clear, consistent names, avoiding too many overlapping tags, and documenting them so your team knows how to use them. This makes playbooks more modular and easier to navigate in big projects.
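
For example:

- name: Deploy PostgreSQL configuration
  ansible.builtin.template:
    src: postgresql.conf.j2
    dest: /etc/postgresql/postgresql.conf
  tags: [db, config]

Running ansible-playbook site.yml --tags db then executes only the tasks tagged db.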

34. How can you limit the execution scope of a playbook (--limit, patterns, groups)?

Sometimes you don’t want to run a playbook against every host. The --limit flag lets you restrict execution to a single host, a group, or even a pattern of hosts. 

Combined with groups in your inventory, this provides a quick way to narrow the scope, for example, running a fix only on staging before pushing it to production.
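
Some common patterns (group names are illustrative):

ansible-playbook site.yml --limit staging              # one group
ansible-playbook site.yml --limit 'web:&staging'       # hosts in both web AND staging
ansible-playbook site.yml --limit 'all:!db'            # everything except the db group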

35. What are strategies (e.g., linear, free, host_pinned, debug) and when do you use each?

Strategies control how Ansible schedules tasks across hosts.

  • The default is linear, which runs tasks in order on all hosts.
  • free allows hosts to run tasks as quickly as possible without waiting for others, which speeds things up but can make output harder to follow.
  • host_pinned keeps each host tied to a single worker process, which is useful for debugging performance issues.
  • The debug strategy runs tasks interactively, letting you step through playbook execution one task at a time.

Choosing the right strategy depends on whether you want predictability, speed, or more control during troubleshooting.

Performance & observability

This section covers performance tuning, logging, and monitoring techniques to ensure efficient and reliable Ansible operations.

36. What logging and profiling options are available in Ansible?

Ansible logs can be enabled through ansible.cfg by setting a log path. Callback plugins are particularly useful for profiling: profile_tasks displays the time each task takes, and timer gives the total runtime. Both help you spot bottlenecks in playbooks.
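
A minimal configuration to enable both (assuming the ansible.posix collection, where these callbacks live, is installed):

# ansible.cfg
[defaults]
log_path = /var/log/ansible.log
callbacks_enabled = ansible.posix.profile_tasks, ansible.posix.timer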

37. How can performance be optimized (forks, pipelining, SSH ControlPersist, fact caching, connection reuse)?

Here are a few options:

  • Forks: Increase parallelism by running tasks on more hosts at once.
  • Pipelining: This cuts down on the number of SSH operations per task by reducing round-trips.
  • SSH ControlPersist: This option keeps SSH connections open, preventing Ansible from reconnecting every time.
  • Fact caching: This stores gathered facts so they don’t have to be fetched repeatedly.
  • Connection reuse: This keeps connections alive across tasks for faster runs.

Together, these tweaks make large-scale automation significantly faster and less resource-heavy.
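
Most of these are set in ansible.cfg; a sketch (values are illustrative):

[defaults]
forks = 50
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible_facts

[ssh_connection]
pipelining = True
ssh_args = -o ControlMaster=auto -o ControlPersist=60s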

38. What is diff_mode, and when is it helpful?

diff_mode displays the difference between the current state and the proposed changes. It’s especially helpful for config management (such as when updating a file or template) so you can see exactly what Ansible will change before it does. Think of it as a safe way to validate your playbook’s impact.

Security & secrets

This section focuses on security best practices, managing sensitive data, and ensuring safe operations in Ansible. 

39. How do you use privilege escalation safely?

Privilege escalation in Ansible refers to running specific tasks with elevated privileges, allowing them to modify system files or manage services. The safe way to do this is to avoid running an entire playbook as root and instead use become: yes at the task or play level only where needed, paired with become_method (usually sudo) and a controlled become_user (often root).

Keep privilege escalation scoped tightly: use least privilege for the shortest time, avoid embedding plaintext passwords in playbooks, and prefer using a secure vault or credential store for any escalation credentials. 

Also, enable and audit ansible.cfg settings that restrict privilege escalation behavior if your environment requires stricter controls.

40. How do you secure control node credentials, and when should you use no_log?

Your control node often holds the keys to everything, so protect it like a vault: minimize who can log into it, use an SSH agent rather than storing private keys in world-readable files, and keep credentials out of playbooks and source control.

For secrets that appear in task output (passwords, tokens), set no_log: true on the task so Ansible doesn’t record the sensitive values in logs or callback output. Use no_log sparingly and only for truly sensitive fields, because it also suppresses useful debugging information.

When you do need to record run metadata but redact secrets, use structured logging or callback plugins that support redaction instead of blanket no_log everywhere.

41. Using no_log and redaction for secrets in Ansible tasks and templates

no_log: true will hide a task’s input and output entirely from logs, which is useful for things like one-off credential generation or secret injections. If you want more control than “all or nothing,” prefer redaction via custom callback plugins or the built-in ansible.cfg log settings that support masking.

For templated files, never render secrets into templates that will be stored in plaintext on the control node. Instead, write templates that accept variables at runtime and pull secrets from a secure lookup or Vault at execution time.

Remember that no_log does not prevent secrets from being written to remote files by tasks; it only suppresses logging, so design playbooks so secrets are written in-memory or to secure destinations when possible.
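
For example (the module choice and vaulted variable are illustrative):

- name: Create the database user without leaking the password
  community.mysql.mysql_user:
    name: app
    password: "{{ vault_db_password }}"
  no_log: true   # suppresses the task's arguments and output in logs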

42. How do you handle external secret lookups (Vault/SSM), vault IDs, and multi-vault workflows?

Use Ansible Vault or a dedicated secrets backend for real secrets. Ansible Vault allows you to encrypt files and variables, and decrypt them at runtime. You can encrypt a whole group_vars file or a single variable, and pass vault passwords via environment variables or a vault password file (but keep that file protected).

For cloud-managed secrets, use lookup plugins such as aws_ssm or hashi_vault so the playbook fetches secrets at runtime rather than storing them.

When you need multiple vaults, for example, one for CI and another for production, use Vault IDs. Vault IDs let you map encrypted files to different keys and supply matching passwords or password files at runtime, enabling multi-vault workflows without mixing keys.
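
A sketch of both patterns (paths, labels, and the SSM parameter name are illustrative):

# Encrypt a file under a labeled vault ID, then run with multiple IDs
ansible-vault encrypt --vault-id prod@prompt group_vars/prod/vault.yml
ansible-playbook site.yml --vault-id dev@~/.vault_pass_dev --vault-id prod@prompt

# Fetch a secret at runtime instead of storing it (requires the amazon.aws collection)
db_password: "{{ lookup('amazon.aws.aws_ssm', '/prod/db/password') }}"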

In CI/CD, bake your execution environments or runner configurations to securely provide vault keys or IAM roles, so playbooks can fetch secrets without exposing vault credentials in logs or repos. (See: Using Ansible in CI/CD Workflows)

Integrations, governance & delivery

This section explores how Ansible integrates with other tools, manages governance, and fits into delivery pipelines. 

43. What is Ansible check_mode, and what are its caveats?

Check mode (--check) is Ansible’s dry run. It shows what would change without actually making any changes. It’s useful for validation, but not every module supports it, and some tasks may still behave differently than in a real run. Great for a preview, not a guarantee.

44. How can you handle version control for playbooks and roles?

Treat playbooks and roles like code. Keep them in Git, use branches for changes, and tag versions you want to reuse. Roles can even be published and versioned through Ansible Galaxy. This makes your automation reproducible and easy to roll back.

45. How do you integrate Ansible with CI/CD pipelines?

You plug Ansible into pipelines just like application code. For example, Jenkins, GitHub Actions, or GitLab CI can run playbooks after a commit. Pipelines handle testing, linting, and applying configs to staging before production. The idea is that infrastructure changes follow the same review-and-release process as app code.

46. What are dynamic inventory plugins for cloud providers (AWS, Azure, GCP)?

Instead of hardcoding hosts in a file, dynamic inventory queries the cloud directly. Ansible can pull current info from AWS, Azure, Google Cloud, or other sources, so your playbooks always target the right machines without manual updates.

47. How does Ansible interact with Kubernetes?

Ansible can communicate with Kubernetes clusters using modules and collections, such as kubernetes.core. You can deploy workloads, manage resources, or even bootstrap clusters. Think of it as bringing Ansible’s YAML-driven automation into the Kubernetes world.
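
For example, ensuring a namespace exists (requires the kubernetes.core collection; the namespace name is illustrative):

- name: Ensure the demo namespace exists
  kubernetes.core.k8s:
    state: present
    definition:
      apiVersion: v1
      kind: Namespace
      metadata:
        name: demo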

48. Describe a project where you used Ansible for automation

For a concrete example, imagine building a workflow where Ansible provisions cloud VMs, configures networking, installs application dependencies, and then deploys services.

A project like setting up a CI/CD environment is a classic case: Ansible provisions Jenkins nodes, installs required plugins, and manages updates automatically. The strength is in orchestrating multiple systems end-to-end, not just one piece of infrastructure.

49. How do you handle failures gracefully?

Ansible offers several methods to prevent crashes during a play. You can mark tasks as non-fatal with ignore_errors or failed_when: false if failure is expected but not critical. Use blocks with rescue and always to implement error handling: the rescue section can roll back or clean up when something fails, and always ensures cleanup runs regardless of success or failure.

For longer-term stability, break large playbooks into smaller units, validate inputs, and run critical steps in check mode first. This way, even if something goes wrong, you keep control of the workflow instead of leaving infrastructure in a half-configured state.

Event-Driven Ansible (EDA)

This section focuses on Event-Driven Ansible (EDA), its components, and how it integrates with other systems.

50. What is Event-Driven Ansible (EDA), and when is it a better fit than scheduled runs?

Event-Driven Ansible (EDA) reacts in real time. Instead of running playbooks on a schedule, it listens for events, such as an alert or a log entry, and triggers automation immediately. 

It’s preferable when speed matters, such as handling incidents or scaling services instantly, rather than waiting for a cron job.

51. What are the key parts of a rulebook (sources, rules, conditions, actions)?

A rulebook is how you define EDA logic. The source is where events come from. Rules check those events against conditions. If a condition matches, an action often runs an Ansible playbook. Together, they’re the “if this, then that” of Ansible automation.
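
A minimal rulebook sketch (the source, condition, and playbook name are illustrative):

- name: Restart nginx on alert
  hosts: all
  sources:
    - ansible.eda.webhook:
        host: 0.0.0.0
        port: 5000
  rules:
    - name: Restart when the alert fires
      condition: event.payload.alert == "nginx_down"
      action:
        run_playbook:
          name: playbooks/restart_nginx.yml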

52. Give two common event sources you might connect to EDA

Monitoring tools, such as Prometheus or a SIEM, can feed alerts into EDA. You can also use a message bus, such as Kafka or RabbitMQ. Anything that produces an event stream can become a trigger.

53. What is a Decision Environment, and how does it differ from an Execution Environment?

A Decision Environment is the runtime for EDA rulebooks. It handles the logic of listening, filtering, and deciding what to do. An Execution Environment is where playbooks actually run. Think of it as: decision first, then execution.

54. How would you integrate EDA to remediate an alert from monitoring or SIEM?

You connect your monitoring system as the event source. The alert becomes an event, a rule matches it, and the action could run a playbook to restart a service, scale pods, or notify a team. It closes the loop from detection to response automatically.

55. How do you observe and audit EDA (event tracing, idempotent actions)?

EDA provides event tracing, allowing you to view the events received, the rules that were fired, and the actions that were executed. Actions should also be idempotent, meaning they can run multiple times safely. That way, you know exactly what happened and can trust the automation history.

How can Spacelift help you with Ansible projects?

Spacelift’s vibrant ecosystem and excellent GitOps flow are helpful for managing and orchestrating Ansible. By introducing Spacelift on top of Ansible, you can easily create custom workflows based on pull requests and apply any necessary compliance checks for your organization.

Another advantage of using Spacelift is that you can manage infrastructure tools like Ansible, Terraform, Pulumi, AWS CloudFormation, and even Kubernetes from the same place and combine their stacks by building workflows across tools.

Our latest Ansible enhancements solve three of the biggest challenges engineers face when using Ansible:

  • Having a centralized place in which you can run your playbooks
  • Combining IaC with configuration management to create a single workflow
  • Getting insights into what ran and where

Provisioning, configuring, governing, and even orchestrating your containers can be performed with a single workflow, separating the elements into smaller chunks to identify issues more easily.

Would you like to see this in action, or just get a tl;dr? Check out this video showing you Spacelift’s Ansible functionality:

If you want to learn more about using Spacelift with Ansible, check our documentation, read our Ansible guide, or book a demo with one of our engineers.

Key points

This article provides you with a solid grounding in Ansible, so you should be well-prepared for your Ansible interview questions.

When using the guide, remember to focus on the principles behind each concept, as interviewers often value your reasoning and problem-solving approach over rote memorization.

Additionally, keep this guide and other tutorials from our blog nearby and refer to the official Ansible documentation for the most up-to-date information.
