Lesson 12 of 14

Ansible Best Practices

Part of the Ansible tutorial series.

The Golden Rules of Automation

Writing a playbook that "gets the job done" is easy. Writing a playbook that operates safely across 10,000 servers, is easily read by junior engineers, and integrates silently into CI/CD requires strict discipline.

Follow these 8 golden rules to elevate your configuration management from scripts to software engineering.


1. Always use Roles (Never giant playbooks)

Do not put complex logic in a playbook.yml file.

Playbooks should look like simple, high-level roadmaps that map Inventory groups to specific Roles.

❌ Bad: 500 lines of apt, copy, and service tasks in deploy.yml.

✅ Good:

---
- name: Deploy Web Infrastructure
  hosts: webservers
  roles:
    - common_baseline
    - datadog_agent
    - custom_nginx

If you utilize roles correctly, you can reuse common_baseline on every server you ever launch without rewriting code.


2. Name Every Single Task

Every task and play must have a name attribute. Without names, debugging failures in a terminal with 4,000 lines of output is impossible.

❌ Bad:

- copy:
    src: nginx.conf
    dest: /etc/nginx/nginx.conf

Console output when running: TASK [copy] *******

✅ Good:

- name: Deploy tailored Nginx configuration file
  copy:
    src: nginx.conf
    dest: /etc/nginx/nginx.conf

Console output when running: TASK [Deploy tailored Nginx configuration file] *******


3. Keep Things Idempotent (The Prime Directive)

A properly written playbook must never alter systems that are already in the correct state.

If you hit <Up Arrow> + <Enter> in your terminal to run the exact same ansible-playbook command twice in a row:

  1. The first run should show changed: X.
  2. The second run should show changed: 0.

❌ Bad: Using command or shell modules unnecessarily.

- name: Create backup directory
  command: mkdir -p /opt/backup
# Every time this runs, it shows as 'changed' (yellow), blinding you to actual changes.

✅ Good: Using native core modules whenever possible.

- name: Ensure backup directory exists
  file:
    path: /opt/backup
    state: directory
# Ansible checks if it exists. If it does, outputs 'ok' (green).
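
Sometimes no native module exists and command is unavoidable. A minimal sketch of keeping such tasks idempotent, assuming a hypothetical /opt/app/bin/init-db script that writes a marker file when it finishes (both paths are placeholders, not part of this lesson):

```yaml
# Hypothetical one-time setup: the script and marker paths are assumptions.
- name: Initialize application database (first run only)
  command: /opt/app/bin/init-db
  args:
    creates: /opt/app/.db_initialized   # task is skipped once this file exists

# Read-only commands can declare that they never change the system.
- name: Read the running kernel version
  command: uname -r
  register: kernel_version
  changed_when: false   # always reports 'ok' instead of 'changed'
```

With creates or changed_when in place, a second run can still end with changed: 0.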

4. Format YAML like a Pro

Ansible YAML should be highly legible. Always utilize native YAML dictionary formatting, not the legacy "inline string" format. Your linter (ansible-lint) will usually enforce this.

❌ Bad:

- name: Install Apache
  apt: name=apache2 state=present update_cache=yes force_apt_get=yes

✅ Good:

- name: Install Apache
  apt:
    name: apache2
    state: present
    update_cache: true
    force_apt_get: true

5. Clean up your Variables

Ansible's variable precedence is a highly complex 22-tier system. If you start defining variables everywhere (Inventory, Playbook vars, command line), you will go insane trying to figure out where a value came from.

  1. roles/<role_name>/defaults/main.yml: Use this for standard values. It sets a baseline. Role defaults have the lowest variable precedence, so anything else can override them.
  2. group_vars/ directory: Use this when you need an environment override (e.g., overriding a role default because Staging uses a different API Key than Production).
  3. Avoid specific host_vars: Configuration should be identical across a fleet. If you rely heavily on host_vars, your infrastructure is treating servers like "pets" rather than interchangeable "cattle."
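
The layering above might look like the following sketch. The variable names, URLs, and the "staging" group are hypothetical, and a non-secret value is used here because real API keys belong in Vault (rule 8):

```yaml
# roles/custom_nginx/defaults/main.yml  (baseline, lowest precedence)
---
nginx_worker_processes: auto
api_base_url: "https://api.example.com"

# group_vars/staging.yml  (applies only to hosts in the "staging" inventory group)
---
api_base_url: "https://api.staging.example.com"
```

Hosts in staging get the overridden URL while every other host keeps the role default, and each value has exactly one obvious place to look.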

6. Commit Your requirements.yml

Your automation logic lives in Git. If you rely on community roles or collections from Ansible Galaxy (installed with ansible-galaxy), those dependencies must be declared in a requirements.yml file that is version controlled alongside your code.

Always pin your versions.

❌ Bad:

roles:
  - name: geerlingguy.docker
# Downloads latest. Someday latest will have a breaking change and destroy your build.

✅ Good:

roles:
  - name: geerlingguy.docker
    version: "7.1.0"
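
The same file can also pin collections, which is where community modules ship. A hedged sketch: the community.general entry and its version number are only illustrative, so pin whichever release you have actually tested:

```yaml
# requirements.yml
---
roles:
  - name: geerlingguy.docker
    version: "7.1.0"

collections:
  - name: community.general
    version: "8.0.0"   # illustrative pin, not a recommendation

# Install with:
#   ansible-galaxy role install -r requirements.yml
#   ansible-galaxy collection install -r requirements.yml
```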

7. Use Handlers, Not Sequential Restarts

Services should only restart when configuration legitimately updates. If you add an unconditional service task with state: restarted to your playbook, that service will restart every single time you run Ansible.

Use the notify directive and place the restart logic inside a Handler. By default, the Handler runs once at the end of the play, and only if a task that notifies it reported changed.
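
A minimal sketch of the pattern, reusing the Nginx config task from rule 2. The play name and the handler name "Restart nginx" are illustrative choices:

```yaml
---
- name: Configure Nginx
  hosts: webservers
  become: true
  tasks:
    - name: Deploy tailored Nginx configuration file
      copy:
        src: nginx.conf
        dest: /etc/nginx/nginx.conf
      notify: Restart nginx   # queued only when this task reports 'changed'

  handlers:
    - name: Restart nginx
      service:
        name: nginx
        state: restarted
```

On a second run with an unchanged nginx.conf, the copy task reports ok, nothing is notified, and Nginx keeps serving traffic without a restart.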


8. Encrypt Secrets Immediately

Never commit a database password, AWS Access Key, or SSL private key to Git in plain text. Always use Ansible Vault.

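If you're using CI/CD pipelines, store the Vault password as a secret on your build runner and hand it to Ansible through --vault-password-file (or the ANSIBLE_VAULT_PASSWORD_FILE setting), which can point to a small script that prints the secret from an environment variable. Ansible can then decrypt the secrets unattended during deployment.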
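
As a rough sketch of how this fits together, assume a vars file at group_vars/production/vault.yml that defines a hypothetical vault_db_password variable. The file path, variable name, and /etc/myapp/db.env destination are all placeholders:

```yaml
# group_vars/production/vault.yml is assumed to be encrypted with:
#   ansible-vault encrypt group_vars/production/vault.yml
# and to define vault_db_password (hypothetical variable name).
---
- name: Configure application database access
  hosts: production
  tasks:
    - name: Write database credentials file
      copy:
        content: "DB_PASSWORD={{ vault_db_password }}\n"
        dest: /etc/myapp/db.env
        mode: "0600"
      no_log: true   # keep the secret out of task output and logs

# CI invocation (the password file path is an assumption):
#   ansible-playbook site.yml --vault-password-file ./vault_pass.sh
```

Only the encrypted file lives in Git. The password that unlocks it stays in the CI system's secret store.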