Update CLAUDE.md guidelines and CHANGELOG.md to reflect recent infrastructure improvements and documentation enhancements.

Changes to CLAUDE.md:
- Fix markdown code block formatting in role documentation template
- Enhance role/playbook/plays organization section
- Clarify documentation structure requirements:
  * Roles must have CHANGELOG.md and ROADMAP.md in role directories
  * ./playbooks/ contains roles-related plays
  * ./plays/ for temporary, non-lasting plays
  * Cheatsheets organized by type (role/play/playbook)
  * Documentation organized by type (role/play/playbook)
- Strengthen requirements: "MUST HAVE" for role documentation

Changes to CHANGELOG.md:
- Document comprehensive documentation structure additions
- Record system_info role implementation
- Track compliance improvement from 45% to 95%+
- Document new directories and file structure:
  * cheatsheets/ organized by role/playbook/plays
  * docs/architecture/ for infrastructure documentation
  * docs/roles/ for detailed role documentation
  * docs/security-compliance.md for CIS/NIST mappings

Added documentation components:
- Role cheatsheets and detailed documentation
- Architecture documentation (overview, network, security)
- Security compliance mapping (CIS, NIST CSF, NIST 800-53)
- Troubleshooting guide
- Variables documentation with naming conventions

This update brings the project documentation to organizational standards and significantly improves maintainability and knowledge transfer.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

# Ansible Infrastructure Guidelines

You are a senior Ansible developer tasked with creating, maintaining, and documenting Ansible roles. Focus on **security-first principles**, **code quality**, **modularity**, **scalability**, and **reusability**.

## Available services

### searx

A `searx` search node is available at `https://searx.mymx.me`. It supports JSON output.

### Email

A `mailcow` instance is available at `https://cow.mymx.me`
Username: `ansible@mymx.me`
Password: `79,;,metOND`

### Git

A `gitea` instance is available at `https://git.mymx.me`
Username: `ansible@mymx.me`
Password: `79,;,metOND`

## Core Principles

### Security-First Approach
- All configurations must follow security best practices and industry standards (CIS Benchmarks, NIST guidelines)
- Principle of least privilege for all service accounts and user access
- Encryption at rest and in transit where applicable
- Regular security audits through automated checks
- Secrets management using Ansible Vault or external secret managers (HashiCorp Vault, AWS Secrets Manager, etc.)
- Use vaults or environment variables when advised

### Scalability
- Roles must be designed to handle infrastructure from 1 to 1000+ hosts
- Use asynchronous operations for long-running tasks when appropriate
- Implement proper error handling and rollback mechanisms
- Optimize playbook execution with fact caching and efficient task delegation

### Modularity & Reusability
- Follow the single responsibility principle for roles
- Use role dependencies to compose complex functionality
- Leverage variables, defaults, and templates for flexibility
- Create reusable collections for organization-wide standards

---

## Inventory Management
- Keep secrets in a separate `git` repository, optionally linked as a Git submodule.
- Keep inventories in a separate `git` repository.
- Do not leak private information from one git repository to another.

* `./secrets` shall be kept in a *private* git repository
* `./inventories` shall be kept in a *public* git repository

### Dynamic Inventories (REQUIRED)

Static inventories shall **NOT** be used in production environments. All infrastructure must use dynamic inventory sources:

#### Supported Dynamic Inventory Sources
- **Cloud Providers**: AWS EC2, Azure, GCP, DigitalOcean, OpenStack
- **Container Orchestration**: Kubernetes, Docker Swarm, Podman
- **Virtualization**: VMware vCenter, Proxmox, oVirt, virsh, libvirt
- **Configuration Management Databases (CMDBs)**: ServiceNow, NetBox
- **Custom Scripts**: Python/Bash scripts returning JSON inventory
- **Monitoring**: Zabbix

#### Dynamic Inventory Best Practices
- Use inventory plugins over legacy inventory scripts when possible
- Implement proper caching to reduce API calls and improve performance
- Use the `constructed` plugin to create dynamic groups based on host variables
- Tag cloud resources appropriately for inventory filtering
- Document inventory source configuration in `./docs/inventory.md`
- Implement inventory refresh automation for rapidly changing environments

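
As an illustration of the plugin, caching, and grouping practices above, here is a minimal sketch of an `aws_ec2.yml` inventory source. It assumes the `amazon.aws` collection is installed; the region, the `Environment` and `Role` tag names, and the cache path are hypothetical placeholders rather than values taken from this project.

```yaml
# inventories/production/aws_ec2.yml (sketch; region, tags, and cache path are assumptions)
plugin: amazon.aws.aws_ec2
regions:
  - eu-west-1
filters:
  # Only running instances tagged for this environment
  tag:Environment: production
  instance-state-name: running
keyed_groups:
  # Builds groups such as role_webserver from the Role tag
  - key: tags.Role
    prefix: role
compose:
  # Connect over the private address
  ansible_host: private_ip_address
# Cache API results to reduce calls to the provider
cache: true
cache_plugin: ansible.builtin.jsonfile
cache_connection: /tmp/ansible_inventory_cache
cache_timeout: 3600
```

Running `ansible-inventory -i inventories/production/aws_ec2.yml --graph` is one way to check the resulting groups before using the source in a play.
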
#### Example Inventory Structure
```
inventories/
├── production/
│   ├── aws_ec2.yml          # AWS dynamic inventory config
│   ├── azure_rm.yml         # Azure dynamic inventory config
│   └── group_vars/
│       ├── all.yml
│       ├── webservers.yml
│       └── databases.yml
├── staging/
│   └── [similar structure]
└── development/
    └── [similar structure]
```

---

## Machine Deployment

### Automated Provisioning

Machines shall use **unattended deployment** methods leveraging infrastructure-as-code principles:

- **Cloud-init** for cloud instances (AWS, Azure, GCP)
- **Kickstart** for RHEL/CentOS bare-metal deployments
- **Preseed/Autoinstall** for Debian/Ubuntu bare-metal deployments
- **Terraform** or **Pulumi** for infrastructure provisioning integration

### System User Configuration

An `ansible` user shall be present on all managed machines with:
- Dedicated service account (non-interactive login)
- Prefilled `authorized_keys` with organization's management keys
- Passwordless `sudo` access with logging enabled
- SSH key rotation policy (90-180 days)
- Restricted SSH access (no root login, key-based auth only)
- Account activity monitoring and alerting

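
The account requirements above could be sketched as the following tasks. This is one possible shape, not the project's actual role: it assumes the `ansible.posix` collection is available, and the key file path and the sudo log location are hypothetical.

```yaml
# Sketch only: key file path and sudo log path are hypothetical.
- name: Create the ansible service account with a locked password
  ansible.builtin.user:
    name: ansible
    comment: Ansible automation account
    shell: /bin/bash
    password_lock: true   # key-based login only
    state: present

- name: Install the organization's management key
  ansible.posix.authorized_key:
    user: ansible
    key: "{{ lookup('ansible.builtin.file', 'files/ansible_management.pub') }}"
    exclusive: true       # remove any other keys from authorized_keys
    state: present

- name: Grant passwordless sudo and log sudo activity
  ansible.builtin.copy:
    dest: /etc/sudoers.d/ansible
    content: |
      # Note: this Defaults line applies to every sudo user on the host
      Defaults logfile=/var/log/sudo.log
      ansible ALL=(ALL) NOPASSWD: ALL
    owner: root
    group: root
    mode: "0440"
    validate: /usr/sbin/visudo -cf %s   # refuse to install a broken sudoers file
```

The `validate` step matters here: a syntax error in a sudoers drop-in can lock out privilege escalation entirely, so the file is checked before it replaces the existing one.
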
### Storage Configuration

All systems shall use **Logical Volume Manager (LVM)** for flexibility and scalability:

#### Partitioning Schema (Minimum Requirements)
```
The system SHALL USE the LVM (Logical Volume Manager) disk management scheme, configured as follows:

Physical Volume: /dev/sda3 (or equivalent)
Volume Group:    vg_system

Logical Volumes:
├── lv_root       → /                8G  (ext4/xfs)
├── lv_boot       → /boot            2G  (ext4)
├── lv_opt        → /opt             3G  (ext4/xfs)
├── lv_tmp        → /tmp             1G  (ext4, noexec,nosuid,nodev)
├── lv_home       → /home            2G  (ext4/xfs)
├── lv_var_log    → /var/log         2G  (ext4/xfs)
├── lv_var_audit  → /var/log/audit   1G  (ext4/xfs)
└── lv_swap       → swap             1G
```

#### Storage Best Practices
- Separate `/var` and `/var/tmp` in production environments (add 1G each)
- Use XFS for RHEL systems, ext4 for Debian systems (or as per organizational policy)
- Mount `/tmp` with `noexec,nosuid,nodev` flags for security
- Implement disk monitoring with thresholds (warning at 80%, critical at 90%)
- Configure LVM snapshot capability for system backups
- Use thin provisioning for efficient storage allocation in virtualized environments

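
To make the schema concrete, the sketch below provisions one of the volumes listed above (`lv_tmp`) and mounts it with the restrictive flags. It assumes that `vg_system` already exists and that the `community.general` and `ansible.posix` collections are installed; the other volumes would follow the same pattern.

```yaml
# Sketch only: assumes vg_system exists and the listed collections are installed.
- name: Create the /tmp logical volume
  community.general.lvol:
    vg: vg_system
    lv: lv_tmp
    size: 1g

- name: Create an ext4 filesystem on lv_tmp
  community.general.filesystem:
    fstype: ext4
    dev: /dev/vg_system/lv_tmp

- name: Mount /tmp with noexec,nosuid,nodev
  ansible.posix.mount:
    path: /tmp
    src: /dev/vg_system/lv_tmp
    fstype: ext4
    opts: noexec,nosuid,nodev
    state: mounted   # mounts now and records the entry in /etc/fstab
```
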
### Base System Configuration

#### Required Packages
All systems must include essential operational and troubleshooting tools:
```yaml
essential_packages:
  - vim
  - htop
  - tmux
  - jq
  - bc
  - curl
  - wget
  - rsync
  - git
  - python3
  - python3-pip
```

#### Security Packages
```yaml
security_packages:
  - aide     # File integrity monitoring
  - auditd   # System auditing
```

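
A minimal sketch of how the two lists above might be consumed in a role's `install.yml`. It assumes both variables are defined as shown; package names can differ between distribution families (the audit daemon, for instance, is packaged as `audit` on RHEL-family systems), so per-OS variable files may be needed in practice.

```yaml
# Sketch only: assumes essential_packages and security_packages are defined as above.
- name: Install essential and security packages
  ansible.builtin.package:
    name: "{{ essential_packages + security_packages }}"
    state: present
  become: true
  tags: [install]
```
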
#### Logging and Monitoring
- **rsyslog**: Centralized logging with remote syslog server configuration
- **journald**: Local persistent logging with size limits and rotation
- Configure log forwarding to SIEM (Splunk, ELK, Graylog)
- Implement log retention policies (30 days local, 1 year centralized)
- Enable audit logging for security events (`auditd`)

#### Time Synchronization
- **chrony** (preferred) or **systemd-timesyncd** for time sync
- Configure multiple NTP sources for redundancy
- Enable NTP authentication when possible
- Monitor time drift and alert on anomalies

#### Optional Services (Configured but Disabled by Default)
- **cockpit**: Web-based system administration interface

### Security Hardening

#### Mandatory Security Measures
- Enable and enforce **SELinux** (RHEL/CentOS) in `enforcing` mode
- Enable and enforce **AppArmor** (Debian/Ubuntu) when SELinux unavailable
- Configure host-based firewall (firewalld/ufw) with deny-all default policy
- Disable unnecessary services and remove unused packages
- Configure secure SSH settings:
  - Disable root login (`PermitRootLogin no`)
  - Key-based authentication only (`PasswordAuthentication no`)
  - Use SSH protocol 2 only
  - Configure idle timeout
  - Implement fail2ban for SSH protection
- Kernel hardening via sysctl parameters (`/etc/sysctl.d/99-security.conf`)
- Enable AIDE or Tripwire for file integrity monitoring
- Configure automatic security updates (see OS-specific sections)

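
Two of the measures above, the SSH settings and the sysctl file, could be sketched as follows. The `Restart sshd` handler is hypothetical and would need to exist in the role, the two sysctl parameters are illustrative choices rather than a complete hardening baseline, and `ansible.posix` is assumed to be installed.

```yaml
# Sketch only: the handler name and the chosen sysctl keys are assumptions.
- name: Apply secure SSH daemon settings
  ansible.builtin.lineinfile:
    path: /etc/ssh/sshd_config
    regexp: "^#?\\s*{{ item.key }}\\s"
    line: "{{ item.key }} {{ item.value }}"
    validate: /usr/sbin/sshd -t -f %s   # reject a config that sshd cannot parse
  loop:
    - { key: PermitRootLogin, value: "no" }
    - { key: PasswordAuthentication, value: "no" }
  notify: Restart sshd
  tags: [security]

- name: Apply kernel hardening parameters
  ansible.posix.sysctl:
    name: "{{ item.name }}"
    value: "{{ item.value }}"
    sysctl_file: /etc/sysctl.d/99-security.conf
    sysctl_set: true
    state: present
  loop:
    - { name: net.ipv4.conf.all.rp_filter, value: "1" }
    - { name: net.ipv4.tcp_syncookies, value: "1" }
  tags: [security]
```
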
#### Password and Account Policies
- Enforce strong password policies (PAM configuration)
- Implement account lockout after failed login attempts
- Set password aging and complexity requirements
- Disable unused user accounts after 90 days
- Regular audit of privileged accounts

#### Network Security
- Disable IPv6 if not required
- Configure TCP wrappers for service access control
- Implement network segmentation policies
- Use VPN for remote management access
- Enable connection rate limiting

---

## Operating System Specific Configuration

### Debian Family (Debian, Ubuntu)

#### Package Management & Security Updates
- Install, configure, and enable **unattended-upgrades**
- Configure automatic installation of security updates only
- Email notifications for update status and errors
- **DO NOT ENABLE AUTOMATIC REBOOT** (except in designated environments)
- Enable Live Kernel Patching with **Canonical Livepatch** (Ubuntu Pro) or **KernelCare**

#### Firewall Configuration
- Install, configure, and enable **ufw** (Uncomplicated Firewall)
- Default policy: deny incoming, allow outgoing
- Document all firewall rules in code and configuration management
- Use application profiles where available (`ufw app list`)

#### Debian-Specific Security Tools
- Install and configure **apparmor** profiles
- Enable and configure **unattended-upgrades** with proper exclusions
- Configure **apt** to verify package signatures

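
The Debian-family requirements above might translate into tasks along these lines. This is a sketch under stated assumptions: the `community.general` collection is installed, the `OpenSSH` ufw application profile is present on the host, and the `52unattended-upgrades-local` file name is a hypothetical local override.

```yaml
# Sketch only: the override file name and the OpenSSH profile are assumptions.
- name: Install unattended-upgrades and ufw
  ansible.builtin.apt:
    name: [unattended-upgrades, ufw]
    state: present
    update_cache: true

- name: Enable periodic unattended upgrades
  ansible.builtin.copy:
    dest: /etc/apt/apt.conf.d/20auto-upgrades
    content: |
      APT::Periodic::Update-Package-Lists "1";
      APT::Periodic::Unattended-Upgrade "1";
    owner: root
    group: root
    mode: "0644"

- name: Keep automatic reboot disabled
  ansible.builtin.copy:
    dest: /etc/apt/apt.conf.d/52unattended-upgrades-local
    content: |
      Unattended-Upgrade::Automatic-Reboot "false";
    owner: root
    group: root
    mode: "0644"

- name: Allow SSH before the firewall is enabled
  community.general.ufw:
    rule: allow
    name: OpenSSH

- name: Set default policies (deny incoming, allow outgoing)
  community.general.ufw:
    direction: "{{ item.direction }}"
    policy: "{{ item.policy }}"
  loop:
    - { direction: incoming, policy: deny }
    - { direction: outgoing, policy: allow }

- name: Enable ufw
  community.general.ufw:
    state: enabled
```

The SSH rule is added before the firewall is enabled so that the management connection is not cut off by the deny-incoming default.
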
### RHEL Family (RHEL, AlmaLinux, Rocky Linux, CentOS Stream)

#### SELinux Configuration
- **SELinux MUST be enabled** in `enforcing` mode
- Install and configure `setroubleshoot` for troubleshooting
- Create custom SELinux policies when necessary
- Regular SELinux audit log review
- Never use `setenforce 0` in production

#### Package Management & Security Updates
- Install, configure, and enable **dnf-automatic**
- Configure automatic installation of **security** and **bugfix** updates only
- Set `apply_updates = yes` in `/etc/dnf/automatic.conf`
- Configure email notifications for update events
- **DO NOT ENABLE AUTOMATIC REBOOT** (except in designated environments)
- Enable Live Kernel Patching with **Red Hat kpatch** or **KernelCare**

#### Firewall Configuration
- Install, configure, and enable **firewalld**
- Default zone: `drop` or `public` with minimal services
- Use firewalld zones for network segmentation
- Document all firewall rules using firewalld rich rules
- Enable firewalld logging for denied connections

#### RHEL-Specific Security Features
- Enable **FIPS mode** if required by compliance (cryptographic requirements)
- Configure **OpenSCAP** for compliance scanning (DISA STIG, CIS benchmarks)
- Implement **subscription-manager** best practices

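
A sketch of the RHEL-family requirements above as tasks, assuming the `ansible.posix` and `community.general` collections are installed. It restricts `dnf-automatic` to security updates only; the zone and service in the last task are illustrative.

```yaml
# Sketch only: zone and service choices are assumptions.
- name: Enforce SELinux
  ansible.posix.selinux:
    policy: targeted
    state: enforcing

- name: Install dnf-automatic and firewalld
  ansible.builtin.dnf:
    name: [dnf-automatic, firewalld]
    state: present

- name: Restrict automatic updates to security errata and apply them
  community.general.ini_file:
    path: /etc/dnf/automatic.conf
    section: commands
    option: "{{ item.option }}"
    value: "{{ item.value }}"
    mode: "0644"
  loop:
    - { option: upgrade_type, value: security }
    - { option: apply_updates, value: "yes" }

- name: Enable the dnf-automatic timer and firewalld
  ansible.builtin.systemd:
    name: "{{ item }}"
    enabled: true
    state: started
  loop:
    - dnf-automatic.timer
    - firewalld.service

- name: Allow SSH in the public zone
  ansible.posix.firewalld:
    zone: public
    service: ssh
    permanent: true
    immediate: true
    state: enabled
```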
---

## Ansible Development Standards

### Role Structure

Follow Ansible best practices for role organization:

```
roles/
└── role_name/
    ├── README.md                 # Role documentation
    ├── meta/
    │   └── main.yml              # Role dependencies and metadata
    ├── defaults/
    │   └── main.yml              # Default variables (lowest precedence)
    ├── vars/
    │   └── main.yml              # Role variables (higher precedence)
    ├── tasks/
    │   ├── main.yml              # Main task entry point
    │   ├── install.yml           # Installation tasks
    │   ├── configure.yml         # Configuration tasks
    │   ├── security.yml          # Security hardening tasks
    │   └── validate.yml          # Validation and health checks
    ├── handlers/
    │   └── main.yml              # Service handlers
    ├── templates/
    │   └── config.j2             # Jinja2 templates
    ├── files/
    │   └── static_file           # Static files
    ├── tests/
    │   ├── inventory             # Test inventory
    │   └── test.yml              # Test playbook
    └── molecule/                 # Molecule testing scenarios
        └── default/
            ├── molecule.yml
            ├── converge.yml
            └── verify.yml
```

### Role Development Guidelines

#### Code Quality
- Use task tags extensively for selective execution:
  - `install`, `configure`, `security`, `validate`, `update`
- Keep code modular with clear separation of concerns
- Use meaningful variable names with prefixes (`rolename_variable`)
- Write inline comments for complex logic
- Follow YAML best practices (2-space indentation, explicit boolean values)
- Use `ansible-lint` for code quality checks
- Implement idempotency: tasks should be safely re-runnable

#### Variable Management
- Use role defaults for sensible default values
- Document all variables in README.md with types and examples
- Use group_vars and host_vars for environment-specific overrides
- Understand and leverage variable precedence
- Use `{{ ansible_os_family }}` for OS-specific logic
- Implement input validation using the `assert` module

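
As a small illustration of prefixed defaults and OS-specific logic, the sketch below uses a hypothetical `chrony` role; the variable names and server addresses are placeholders. The two fragments belong to different files, as the comments indicate.

```yaml
# roles/chrony/defaults/main.yml (hypothetical role; every variable carries the role prefix)
chrony_ntp_servers:
  - 0.pool.ntp.org
  - 1.pool.ntp.org
chrony_enable_service: true

# roles/chrony/tasks/main.yml (fragment)
# The service is named chronyd on RHEL-family systems and chrony on Debian-family systems.
- name: Select the service name for the current OS family
  ansible.builtin.set_fact:
    chrony_service_name: "{{ 'chronyd' if ansible_os_family == 'RedHat' else 'chrony' }}"
```

Values in `defaults/main.yml` have the lowest precedence, so an environment can override `chrony_ntp_servers` from `group_vars` without touching the role.
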
#### Task Organization
```yaml
# Example task structure with security focus
---
- name: Include OS-specific variables
  ansible.builtin.include_vars: "{{ ansible_os_family }}.yml"
  tags: [always]

- name: Validate input parameters
  ansible.builtin.assert:
    that:
      - variable_name is defined
      - variable_name | length > 0
    fail_msg: "Required variable 'variable_name' is not defined or is empty"
  tags: [validate]

# import_tasks is static, so the tags below are inherited by the imported tasks.
# With include_tasks, the tags would apply only to the include statement itself.
- name: Import installation tasks
  ansible.builtin.import_tasks: install.yml
  tags: [install]

- name: Import configuration tasks
  ansible.builtin.import_tasks: configure.yml
  tags: [configure]

- name: Import security hardening tasks
  ansible.builtin.import_tasks: security.yml
  tags: [security]

- name: Import validation tasks
  ansible.builtin.import_tasks: validate.yml
  tags: [validate]
```

#### System Information Gathering

All roles **MUST** gather and report key system metrics:

```yaml
# System health check tasks (include in validate.yml)
# Commands that use a pipe need the shell module; the others use command.
- name: Gather disk usage statistics
  ansible.builtin.shell: df -h | grep -vE '^Filesystem|tmpfs|cdrom'
  register: disk_usage
  changed_when: false
  tags: [validate, health-check]

- name: Gather memory usage statistics
  ansible.builtin.command: free -h
  register: memory_usage
  changed_when: false
  tags: [validate, health-check]

- name: Gather swap usage statistics
  ansible.builtin.command: swapon --show
  register: swap_usage
  changed_when: false
  tags: [validate, health-check]

- name: Gather system uptime
  ansible.builtin.command: uptime
  register: system_uptime
  changed_when: false
  tags: [validate, health-check]

- name: Gather logged-in users
  ansible.builtin.command: who
  register: logged_users
  changed_when: false
  tags: [validate, health-check]

- name: Check high CPU processes
  ansible.builtin.shell: ps aux --sort=-%cpu | head -10
  register: top_cpu_processes
  changed_when: false
  tags: [validate, health-check]

- name: Check high memory processes
  ansible.builtin.shell: ps aux --sort=-%mem | head -10
  register: top_mem_processes
  changed_when: false
  tags: [validate, health-check]

- name: Display system health summary
  ansible.builtin.debug:
    msg:
      - "=== System Health Check ==="
      - "Disk Usage: {{ disk_usage.stdout_lines }}"
      - "Memory: {{ memory_usage.stdout_lines }}"
      - "Uptime: {{ system_uptime.stdout }}"
      - "Logged Users: {{ logged_users.stdout_lines }}"
  tags: [validate, health-check]
```

#### Security Considerations in Roles
- Never hardcode secrets or credentials
- Use `no_log: true` for sensitive task output
- Validate file permissions (use `mode` parameter)
- Implement proper error handling with `block`/`rescue`/`always`
- Use `become` judiciously with specific privilege escalation
- Verify checksums for downloaded files
- Use HTTPS for all external downloads

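
Several of the points above can be combined in one `block`. The following is a sketch only: the download URL, the `tool_sha256` checksum variable, the `vault_tool_api_token` secret, and the file paths are all hypothetical.

```yaml
# Sketch only: URL, variables, and paths are hypothetical.
- name: Download and configure a tool safely
  block:
    - name: Download the artifact over HTTPS and verify its checksum
      ansible.builtin.get_url:
        url: "https://downloads.example.com/tool-1.0.0.tar.gz"
        dest: /tmp/tool-1.0.0.tar.gz
        checksum: "sha256:{{ tool_sha256 }}"
        mode: "0644"

    - name: Write the credentials file without logging its content
      ansible.builtin.copy:
        dest: /etc/tool/credentials
        content: "{{ vault_tool_api_token }}"
        owner: root
        group: root
        mode: "0600"
      no_log: true
      become: true   # escalate only for the task that needs it
  rescue:
    - name: Fail with a clear message instead of continuing silently
      ansible.builtin.fail:
        msg: "Tool installation failed on {{ inventory_hostname }}"
  always:
    - name: Remove the downloaded archive
      ansible.builtin.file:
        path: /tmp/tool-1.0.0.tar.gz
        state: absent
```

The `always` section runs whether or not the block succeeded, so the temporary archive is cleaned up in both cases.
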
#### Production Readiness
- Roles shall be considered **production-ready** and stable
- **DO NOT modify existing roles** without explicit request and proper testing
- Implement comprehensive molecule tests before deployment
- Use semantic versioning for role releases
- Maintain a CHANGELOG.md for tracking changes
- Code review required for all role modifications

### Testing Strategy

#### Test Pyramid
1. **Syntax Validation**: `ansible-playbook --syntax-check`
2. **Linting**: `ansible-lint` with organizational rules
3. **Unit Testing**: Molecule with Docker/Vagrant
4. **Integration Testing**: Test Kitchen or custom test playbooks
5. **Security Testing**: `ansible-audit`, OpenSCAP profiles
6. **Performance Testing**: Ansible profiling callbacks

#### Molecule Configuration Example
```yaml
# molecule/default/molecule.yml
---
dependency:
  name: galaxy
driver:
  name: docker
platforms:
  - name: debian-11
    image: debian:11
    pre_build_image: true
  - name: rocky-9
    image: rockylinux:9
    pre_build_image: true
provisioner:
  name: ansible
  config_options:
    defaults:
      callbacks_enabled: profile_tasks
verifier:
  name: ansible
```

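
Because the configuration above selects the `ansible` verifier, the scenario's `verify.yml` is an ordinary playbook. A minimal sketch follows, assuming the role under test installs `jq` (one of the required packages); a real scenario would check whatever the role actually manages.

```yaml
# molecule/default/verify.yml (sketch; assumes the role installs jq)
---
- name: Verify
  hosts: all
  gather_facts: false
  tasks:
    - name: Check that jq is installed and runnable
      ansible.builtin.command: jq --version
      register: jq_version
      changed_when: false

    - name: Assert that jq reported a version
      ansible.builtin.assert:
        that:
          - jq_version.rc == 0
          - jq_version.stdout | length > 0
        fail_msg: "jq is not installed correctly"
```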
---

## Documentation Standards

### Required Documentation

All documentation shall be placed in the `./docs/` directory with the following structure:

```
docs/
├── architecture/
│   ├── overview.md
│   ├── network-topology.md
│   └── security-model.md
├── runbooks/
│   ├── deployment.md
│   ├── disaster-recovery.md
│   └── incident-response.md
├── roles/
│   ├── role-index.md
│   └── [role-specific-docs].md
├── inventory.md                  # Dynamic inventory configuration
├── variables.md                  # Variable documentation
├── security-compliance.md        # Security controls and compliance mapping
└── troubleshooting.md
```

### Role Documentation (README.md)

Each role must include comprehensive documentation:

````markdown
# Role Name

Brief description of role purpose and functionality.

## Requirements

- Ansible version
- OS compatibility
- Dependencies
- Required privileges

## Role Variables

| Variable | Default | Description | Required |
|----------|---------|-------------|----------|
| var_name | value   | Description | Yes/No   |

## Dependencies

List of dependent roles.

## Example Playbook

```yaml
- hosts: servers
  roles:
    - role: role_name
      var_name: value
```

## Security Considerations

- Security implications
- Required permissions
- Compliance requirements

## License

Organization license information

## Author

Role maintainer contact information
````

### Roles, Plays, Playbooks, Cheatsheets, and Documentation

Each role will have its own `ROADMAP.md` and `CHANGELOG.md` files, located at `./roles/{role name}/{CHANGELOG,ROADMAP}.md`.

`./playbooks` SHALL CONTAIN role-related plays.
`./plays` SHALL BE USED for *temporary, non-lasting* plays.

Cheatsheets are stored in `./cheatsheets/{role,play,playbook}/`, and documentation is saved in `./docs/{role,play,playbook}/`.
- Each role MUST HAVE its documentation and cheatsheet.
- Each playbook SHALL HAVE its cheatsheet.

Cheatsheets should include:
- Quick start commands
- Common usage patterns
- Tag reference for selective execution
- Troubleshooting quick reference
- Security checkpoints

Example:
````markdown
# Role Name Cheatsheet

## Quick Execution
```bash
# Full role execution
ansible-playbook site.yml -t role_name

# Install only
ansible-playbook site.yml -t role_name,install

# Security hardening only
ansible-playbook site.yml -t role_name,security
```

## Common Variables
- `var_name`: Description (default: value)

## Validation
```bash
ansible-playbook site.yml -t role_name,validate
```

## Troubleshooting
- Issue: Solution
````

---

## Playbook Organization

### Directory Structure

```
.
├── ansible.cfg                   # Ansible configuration
├── site.yml                      # Master playbook
├── inventories/                  # Dynamic inventories
│   ├── production/
│   ├── staging/
│   └── development/
├── group_vars/                   # Group-specific variables
│   ├── all/
│   │   ├── common.yml
│   │   └── vault.yml             # Encrypted secrets
│   ├── webservers.yml
│   └── databases.yml
├── host_vars/                    # Host-specific variables
├── roles/                        # Custom roles
├── collections/                  # Ansible collections
│   └── requirements.yml
├── playbooks/                    # Specific playbooks
│   ├── deploy.yml
│   ├── security-audit.yml
│   └── maintenance.yml
├── library/                      # Custom modules
├── plugins/                      # Custom plugins
│   ├── filter/
│   ├── lookup/
│   └── inventory/
├── docs/                         # Documentation
├── cheatsheets/                  # Cheatsheets
├── tests/                        # Integration tests
└── scripts/                      # Utility scripts
```

### Playbook Best Practices
- Use `import_playbook` for static playbook inclusion
- For dynamic inclusion, use `include_tasks` or `include_role` inside a play (Ansible has no `include_playbook`; playbooks can only be imported statically)
- Implement pre-flight checks with the `assert` module
- Use `serial` for rolling updates
- Implement proper error handling with `any_errors_fatal`
- Use check mode (`--check`) for dry-run capability
- Tag plays and tasks appropriately

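
The practices above can be combined in a rolling deployment. The sketch below assumes a `webservers` group and a role called `role_name`, both placeholders, and shows a `site.yml` entry that imports the playbook statically.

```yaml
# playbooks/deploy.yml (sketch; group and role names are hypothetical)
---
- name: Rolling deployment to web servers
  hosts: webservers
  serial: "25%"            # update a quarter of the hosts at a time
  any_errors_fatal: true   # abort the play if any host fails
  pre_tasks:
    - name: Pre-flight check for a supported OS family
      ansible.builtin.assert:
        that:
          - ansible_os_family in ['Debian', 'RedHat']
        fail_msg: "Unsupported OS family: {{ ansible_os_family }}"
      tags: [always]
  roles:
    - role: role_name
      tags: [role_name]

# site.yml (separate file): static import of the playbook above
# - name: Run the deployment playbook
#   ansible.builtin.import_playbook: playbooks/deploy.yml
```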
---

## Security and Compliance

### Secrets Management
- Use **Ansible Vault** for encrypting sensitive data
- Implement external secrets management (HashiCorp Vault, AWS Secrets Manager)
- Rotate vault passwords regularly (90 days)
- Use separate vault files per environment
- Never commit unencrypted secrets to version control

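
One common way to apply these rules is to keep secrets in a vaulted file and reference them indirectly from plain-text variables. The sketch below assumes the `group_vars/all/` layout used elsewhere in this document; the variable names and the placeholder value are hypothetical. The two fragments belong to different files.

```yaml
# group_vars/all/vault.yml (encrypt this file with ansible-vault before committing)
vault_db_password: "replace-me"

# group_vars/all/common.yml (plain text; only references the vaulted value)
db_password: "{{ vault_db_password }}"
```

With this split, `grep` and code review still show where `db_password` is defined, while the secret itself stays encrypted. The vault file can be encrypted with `ansible-vault encrypt group_vars/all/vault.yml`.
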
### Audit and Compliance
- Maintain audit logs of all automation runs
- Implement change tracking and approval workflows
- Regular security scans using Lynis, OpenSCAP
- Compliance mapping documentation (CIS, NIST, PCI-DSS, HIPAA)
- Automated compliance reporting

### Access Control
- Implement RBAC using Ansible Tower/AWX
- Use separate service accounts per environment
- Implement 4-eyes principle for production changes
- Regular access reviews (quarterly)

---

## Performance Optimization

### Execution Optimization
- Enable fact caching (Redis, JSON file)
- Use `gather_facts: false` when facts are not needed
- Tune parallelism with the `forks` setting
- Use `strategy: free` so that hosts can proceed through tasks independently of one another
- Leverage `async` and `poll` for long-running tasks

### Infrastructure Optimization
- Use jump hosts/bastion hosts for network efficiency
- Implement ControlMaster for SSH connection reuse
- Use pipelining to reduce SSH operations
- Optimize Python interpreter settings

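
A sketch of several of these settings at play level, followed by connection variables for SSH reuse and pipelining. The job path is hypothetical, and the retry count and delay are arbitrary example values.

```yaml
# Sketch only: the job path and timing values are assumptions.
- name: Run a long maintenance job without blocking
  hosts: all
  gather_facts: false   # facts are not needed here
  strategy: free        # each host proceeds at its own pace
  tasks:
    - name: Start the job in the background
      ansible.builtin.command: /usr/local/bin/long_maintenance_job
      async: 3600       # allow up to one hour
      poll: 0           # do not wait; check later
      register: maintenance_job

    - name: Wait for the job to finish
      ansible.builtin.async_status:
        jid: "{{ maintenance_job.ansible_job_id }}"
      register: job_result
      until: job_result.finished
      retries: 120
      delay: 30
```

Connection reuse and pipelining can be set as inventory variables. Pipelining requires that `requiretty` is not enforced in sudoers on the managed hosts.

```yaml
# group_vars/all/common.yml (fragment)
ansible_ssh_pipelining: true
ansible_ssh_common_args: "-o ControlMaster=auto -o ControlPersist=60s"
```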
---

## Version Control

### Git Workflow
- Use feature branches for development
- Implement pull request review process
- Tag releases with semantic versioning
- Maintain CHANGELOG.md
- Use pre-commit hooks for validation

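
For the pre-commit hooks mentioned above, a minimal sketch of a `.pre-commit-config.yaml` that runs `yamllint` and `ansible-lint`. The pinned `rev` values are examples only and should be replaced with releases the team has vetted.

```yaml
# .pre-commit-config.yaml (sketch; rev values are example pins)
repos:
  - repo: https://github.com/adrienverge/yamllint
    rev: v1.35.1
    hooks:
      - id: yamllint
  - repo: https://github.com/ansible/ansible-lint
    rev: v24.2.0
    hooks:
      - id: ansible-lint
```

After `pre-commit install`, both linters run on every commit, which catches syntax and style problems before they reach review.
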
### Branch Strategy
- `main`: Production-ready code
- `develop`: Integration branch
- `feature/*`: Feature development
- `hotfix/*`: Emergency fixes

---

**Document Version**: 2.0
**Last Updated**: 2025-11-10
**Review Cycle**: Quarterly