# Ansible Role: deploy_linux_vm

Deploy Linux virtual machines on KVM hypervisors with LVM storage configuration, security hardening, and cloud-init provisioning. This role supports multiple Linux distributions and implements CLAUDE.md security requirements including LVM partitioning and SSH hardening.

## Features

- **Multi-Distribution Support**: Debian, Ubuntu, RHEL, CentOS Stream, Rocky Linux, AlmaLinux, SLES, openSUSE
- **LVM Configuration**: Automatic LVM setup with meaningful volume groups and logical volumes per CLAUDE.md
- **Security Hardening**:
  - SSH hardening with GSSAPI disabled
  - SELinux enforcing (RHEL family)
  - AppArmor enabled (Debian family)
  - Firewall configuration (UFW/firewalld)
  - Automatic security updates
  - Audit daemon enabled
- **Cloud-Init**: Automated provisioning with distribution-specific configurations
- **Modular Design**: Tag-based execution for selective deployment stages
- **Production Ready**: Idempotent, well-tested, and CLAUDE.md compliant

## Requirements

### Hypervisor Requirements

- Ansible 2.12 or higher
- KVM/libvirt virtualization enabled
- Sufficient disk space in `/var/lib/libvirt/images`
- Network connectivity for cloud image downloads

### Supported Distributions (Guest VMs)

| Distribution | Versions | OS Family |
|--------------|----------|-----------|
| Debian | 11, 12 | debian |
| Ubuntu | 20.04 LTS, 22.04 LTS, 24.04 LTS | debian |
| RHEL | 8, 9 | rhel |
| CentOS Stream | 8, 9 | rhel |
| Rocky Linux | 8, 9 | rhel |
| AlmaLinux | 8, 9 | rhel |
| SLES | 15 | suse |
| openSUSE Leap | 15.5, 15.6 | suse |

## Role Variables

### Required Variables

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `deploy_linux_vm_os_distribution` | Yes | debian-12 | Distribution identifier (e.g., ubuntu-22.04, almalinux-9) |

### VM Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| `deploy_linux_vm_name` | linux-guest | VM name in libvirt |
| `deploy_linux_vm_hostname` | linux-vm | VM hostname |
| `deploy_linux_vm_domain` | localdomain | Domain name |
| `deploy_linux_vm_vcpus` | 2 | Number of vCPUs |
| `deploy_linux_vm_memory_mb` | 2048 | RAM in MB |
| `deploy_linux_vm_disk_size_gb` | 30 | Primary disk size in GB |

### LVM Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| `deploy_linux_vm_use_lvm` | true | Enable LVM configuration |
| `deploy_linux_vm_lvm_vg_name` | vg_system | Volume group name |
| `deploy_linux_vm_lvm_pv_device` | /dev/vdb | Physical volume device |
| `deploy_linux_vm_lvm_volumes` | (see defaults) | List of logical volumes per CLAUDE.md |

#### LVM Volumes (CLAUDE.md Compliance)

Default logical volumes created:

```yaml
deploy_linux_vm_lvm_volumes:
  - { name: lv_opt, size: 3G, mount: /opt, fstype: ext4 }
  # mount_options is quoted so the commas are not parsed as flow-mapping separators
  - { name: lv_tmp, size: 1G, mount: /tmp, fstype: ext4, mount_options: "noexec,nosuid,nodev" }
  - { name: lv_home, size: 2G, mount: /home, fstype: ext4 }
  - { name: lv_var, size: 5G, mount: /var, fstype: ext4 }
  - { name: lv_var_log, size: 2G, mount: /var/log, fstype: ext4 }
  - { name: lv_var_tmp, size: 5G, mount: /var/tmp, fstype: ext4, mount_options: "noexec,nosuid,nodev" }
  - { name: lv_var_audit, size: 1G, mount: /var/log/audit, fstype: ext4 }
  - { name: lv_swap, size: 2G, mount: none, fstype: swap }
```

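
The default sizes add up to 21G (3 + 1 + 2 + 5 + 2 + 5 + 1 + 2). When overriding the list, a pre-flight check along the following lines could catch a layout that no longer fits the LVM disk. This is a minimal sketch, not part of the role: it assumes every size is a whole number of gigabytes with a `G` suffix, and `lvm_disk_size_gb` is a hypothetical variable set here to the 30GB secondary disk size mentioned in this README.

```yaml
# Hypothetical pre-flight task; `lvm_disk_size_gb` and `lvm_total_gb` are
# illustrative names, not variables documented by this role.
- name: Check that the requested logical volumes fit on the LVM disk
  ansible.builtin.assert:
    that:
      # Strictly less than, to leave room for LVM metadata.
      - lvm_total_gb | int < lvm_disk_size_gb | int
    fail_msg: "Logical volumes need {{ lvm_total_gb }}G but the disk has {{ lvm_disk_size_gb }}G"
  vars:
    lvm_disk_size_gb: 30
    # Strip the trailing "G" from each size and add the numbers up.
    lvm_total_gb: >-
      {{ deploy_linux_vm_lvm_volumes
         | map(attribute='size')
         | map('regex_replace', 'G$', '')
         | map('int')
         | sum }}
```

Sizes written in other units (for example `512M`) would need extra handling before a check like this is reliable.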
### SSH Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| `deploy_linux_vm_ssh_permit_root_login` | no | Allow root SSH login |
| `deploy_linux_vm_ssh_password_authentication` | no | Allow password authentication |
| `deploy_linux_vm_ssh_gssapi_authentication` | no | **GSSAPI disabled per requirements** |
| `deploy_linux_vm_ssh_gssapi_cleanup_credentials` | no | GSSAPI cleanup |
| `deploy_linux_vm_ssh_max_auth_tries` | 3 | Maximum authentication attempts |
| `deploy_linux_vm_ssh_client_alive_interval` | 300 | SSH keepalive interval in seconds |

### Security Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| `deploy_linux_vm_enable_firewall` | true | Enable firewall (UFW/firewalld) |
| `deploy_linux_vm_enable_selinux` | true | Enable SELinux (RHEL family) |
| `deploy_linux_vm_enable_apparmor` | true | Enable AppArmor (Debian family) |
| `deploy_linux_vm_enable_auditd` | true | Enable audit daemon |
| `deploy_linux_vm_enable_automatic_updates` | true | Enable automatic security updates |
| `deploy_linux_vm_automatic_reboot` | false | Auto-reboot after updates |

### User Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| `deploy_linux_vm_ansible_user` | ansible | Service account username |
| `deploy_linux_vm_ansible_user_ssh_key` | (vault variable) | SSH public key for ansible user |
| `deploy_linux_vm_root_password` | (vault variable) | Root password (console access) |

**SECURITY NOTICE**: SSH keys and passwords should be stored in encrypted vault files, not in defaults.

## Security Best Practices

### Secrets Management

This role requires sensitive data (SSH keys, passwords) to be stored securely:

#### Option 1: Ansible Vault (Recommended for Small/Medium Deployments)

1. Create a vault file in your inventory:

```bash
# Create encrypted vault file
ansible-vault create inventories/production/group_vars/all/vault.yml
```

2. Add the required vault variables:

```yaml
---
# SSH public key for ansible user
vault_deploy_linux_vm_ansible_user_ssh_key: "ssh-ed25519 AAAAC3... ansible@automation"

# Root password for emergency console access
vault_deploy_linux_vm_root_password: "YourSecurePassword123!"
```

3. Reference vault variables in your playbook or group_vars:

```yaml
# inventories/production/group_vars/all/vars.yml
deploy_linux_vm_ansible_user_ssh_key: "{{ vault_deploy_linux_vm_ansible_user_ssh_key }}"
deploy_linux_vm_root_password: "{{ vault_deploy_linux_vm_root_password }}"
```

4. Run playbooks with vault password:

```bash
ansible-playbook site.yml --ask-vault-pass
# Or use a password file
ansible-playbook site.yml --vault-password-file ~/.vault_pass
```

#### Option 2: External Secret Managers (Recommended for Enterprise)

- **HashiCorp Vault**: Use `community.hashi_vault.vault_read` lookup plugin
- **AWS Secrets Manager**: Use `amazon.aws.aws_secret` lookup plugin
- **Azure Key Vault**: Use `azure.azcollection.azure_keyvault_secret` lookup plugin
- **CyberArk**: Use CyberArk Ansible plugins

Example with HashiCorp Vault:

```yaml
# vault_read returns the raw API response; for a KV v2 path (secret/data/...)
# the secret's fields are nested under data.data
deploy_linux_vm_ansible_user_ssh_key: "{{ lookup('community.hashi_vault.vault_read', 'secret/data/ansible/ssh_key').data.data.public_key }}"
```

#### Option 3: Environment Variables

```bash
export ANSIBLE_VAULT_PASSWORD_FILE=~/.vault_pass
export DEPLOY_VM_SSH_KEY="ssh-ed25519 AAAAC3..."
```

```yaml
deploy_linux_vm_ansible_user_ssh_key: "{{ lookup('env', 'DEPLOY_VM_SSH_KEY') }}"
```

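
Whichever option is used, it can help to fail before any VM is created if a secret did not resolve. The `env` lookup, for instance, returns an empty string when the variable is unset. The play below is a minimal sketch of such a guard, assuming the two secret variables documented in this README; the play name and the error message are illustrative.

```yaml
# Sketch of a play-level guard placed in front of the role.
- name: Deploy Linux VM with a secrets sanity check
  hosts: grokbox
  become: true
  pre_tasks:
    - name: Fail early if a secret is missing or empty
      ansible.builtin.assert:
        that:
          - deploy_linux_vm_ansible_user_ssh_key is defined
          - deploy_linux_vm_ansible_user_ssh_key | length > 0
          - deploy_linux_vm_root_password is defined
          - deploy_linux_vm_root_password | length > 0
        fail_msg: "SSH key or root password is not set; check your vault file or environment."
      # Keep the secret values out of the task output.
      no_log: true
  roles:
    - role: deploy_linux_vm
```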
### SSH Key Generation

Generate a dedicated SSH key pair for VM deployment:

```bash
# Generate ED25519 key (recommended)
ssh-keygen -t ed25519 -C "ansible-automation" -f ~/.ssh/ansible_deploy

# Or RSA 4096-bit key
ssh-keygen -t rsa -b 4096 -C "ansible-automation" -f ~/.ssh/ansible_deploy

# Use the public key in your vault file
cat ~/.ssh/ansible_deploy.pub
```

### Password Generation

Generate strong root passwords:

```bash
# Using OpenSSL
openssl rand -base64 32

# Using pwgen
pwgen -s 32 1

# Using /dev/urandom
tr -dc 'A-Za-z0-9!@#$%^&*' < /dev/urandom | head -c 32
```

### Security Checklist

- [ ] SSH keys stored in Ansible Vault or external secret manager
- [ ] Root passwords stored in Ansible Vault (different per environment)
- [ ] Vault password file has restricted permissions (0600)
- [ ] Vault password file is NOT committed to version control (in .gitignore)
- [ ] Different passwords used for dev/staging/production
- [ ] SSH keys rotated every 90-180 days
- [ ] Regular security audits performed

## Dependencies

None. This role is self-contained.

## Example Playbook

### Basic Deployment

```yaml
---
- name: Deploy Linux VM
  hosts: grokbox
  become: yes
  roles:
    - role: deploy_linux_vm
      vars:
        deploy_linux_vm_name: "web-server"
        deploy_linux_vm_os_distribution: "ubuntu-22.04"
```

### Advanced Deployment with Custom LVM

```yaml
---
- name: Deploy Database Server with Custom Resources
  hosts: grokbox
  become: yes
  roles:
    - role: deploy_linux_vm
      vars:
        deploy_linux_vm_name: "db-server"
        deploy_linux_vm_hostname: "postgres01"
        deploy_linux_vm_domain: "production.local"
        deploy_linux_vm_os_distribution: "almalinux-9"
        deploy_linux_vm_vcpus: 8
        deploy_linux_vm_memory_mb: 16384
        deploy_linux_vm_disk_size_gb: 100
        deploy_linux_vm_use_lvm: true
        deploy_linux_vm_lvm_vg_name: "vg_database"
```

### Multi-VM Deployment

```yaml
---
- name: Deploy Multiple VMs
  hosts: grokbox
  become: yes
  tasks:
    - name: Deploy web servers
      include_role:
        name: deploy_linux_vm
      vars:
        deploy_linux_vm_name: "{{ item.name }}"
        deploy_linux_vm_hostname: "{{ item.hostname }}"
        deploy_linux_vm_os_distribution: "{{ item.distro }}"
      loop:
        - { name: "web01", hostname: "web01", distro: "ubuntu-22.04" }
        - { name: "web02", hostname: "web02", distro: "ubuntu-22.04" }
        - { name: "db01", hostname: "db01", distro: "almalinux-9" }
```

## Tag-Based Execution

Execute specific deployment stages:

```bash
# Pre-flight validation only
ansible-playbook site.yml --tags validate,preflight

# Download cloud images only
ansible-playbook site.yml --tags download,verify

# Deploy VM without LVM configuration
ansible-playbook site.yml --skip-tags lvm

# Configure LVM only (post-deployment)
ansible-playbook site.yml --tags lvm,post-deploy

# Full deployment with all stages
ansible-playbook site.yml
```

### Available Tags

| Tag | Description |
|-----|-------------|
| `validate`, `preflight` | Pre-flight validation checks |
| `install` | Install required packages on hypervisor |
| `download`, `verify` | Download and verify cloud images |
| `storage` | Create VM disk storage |
| `cloud-init` | Generate cloud-init configuration |
| `deploy` | Deploy and start VM |
| `lvm`, `post-deploy` | Configure LVM on deployed VM |
| `cleanup` | Remove temporary files |

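
For readers extending the role, the sketch below shows one way tags like these can be attached to stages. It is not taken from the role's source: the task file names (`validate.yml`, `download.yml`, `lvm.yml`) are hypothetical, and only the tag names and `deploy_linux_vm_use_lvm` come from this README.

```yaml
# tasks/main.yml (hypothetical layout; file names are illustrative)
- name: Run pre-flight validation
  ansible.builtin.import_tasks: validate.yml
  tags: [validate, preflight]

- name: Download and verify cloud images
  ansible.builtin.import_tasks: download.yml
  tags: [download, verify]

- name: Configure LVM on the deployed VM
  ansible.builtin.import_tasks: lvm.yml
  when: deploy_linux_vm_use_lvm | bool
  tags: [lvm, post-deploy]
```

Tags set on `import_tasks` are inherited by every imported task, which is what makes `--tags` and `--skip-tags` select whole stages.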
## LVM Configuration Process

The role implements a comprehensive LVM setup:

1. **Physical Volume Creation**: Creates PV on `/dev/vdb` (30GB secondary disk)
2. **Volume Group Setup**: Creates `vg_system` volume group
3. **Logical Volume Creation**: Creates LVs per CLAUDE.md specifications
4. **Filesystem Creation**: Formats LVs with ext4/swap
5. **Data Migration**: Copies existing data from primary disk to LVM volumes
6. **Fstab Update**: Configures automatic mounting at boot
7. **Reboot Required**: VM must be rebooted to activate new mounts

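
Steps 1 to 4 and step 6 can be sketched with standard collection modules as shown below. This is an illustration under stated assumptions, not the role's actual task file: it assumes the `community.general` and `ansible.posix` collections are installed and that the tasks run against the guest VM. Data migration (step 5) is left out because the README does not describe how it is performed.

```yaml
# Sketch only; the role's real tasks may differ.
- name: Create the volume group on the secondary disk (steps 1-2)
  community.general.lvg:
    vg: "{{ deploy_linux_vm_lvm_vg_name }}"
    pvs: "{{ deploy_linux_vm_lvm_pv_device }}"

- name: Create each logical volume (step 3)
  community.general.lvol:
    vg: "{{ deploy_linux_vm_lvm_vg_name }}"
    lv: "{{ item.name }}"
    size: "{{ item.size }}"
  loop: "{{ deploy_linux_vm_lvm_volumes }}"

- name: Format each logical volume (step 4)
  community.general.filesystem:
    fstype: "{{ item.fstype }}"
    dev: "/dev/{{ deploy_linux_vm_lvm_vg_name }}/{{ item.name }}"
  loop: "{{ deploy_linux_vm_lvm_volumes }}"

# state: present writes the fstab entry without mounting, so the running
# system is untouched until the reboot in step 7. Swap is skipped here.
- name: Record data volumes in /etc/fstab (step 6)
  ansible.posix.mount:
    path: "{{ item.mount }}"
    src: "/dev/{{ deploy_linux_vm_lvm_vg_name }}/{{ item.name }}"
    fstype: "{{ item.fstype }}"
    opts: "{{ item.mount_options | default('defaults') }}"
    state: present
  loop: "{{ deploy_linux_vm_lvm_volumes | rejectattr('fstype', 'equalto', 'swap') | list }}"
```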
### LVM Post-Deployment

After role execution with LVM enabled:

```bash
# SSH to the VM
ssh ansible@<VM_IP>

# Verify LVM configuration
sudo vgs
sudo lvs
sudo pvs

# Check fstab entries
cat /etc/fstab

# Reboot to activate LVM mounts
sudo reboot

# After reboot, verify mounts
df -h
lsblk
```

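
The manual reboot and check above could also be automated from a playbook that targets the guest. The tasks below are a minimal sketch, assuming the default `lv_var_log` volume is in use; the timeout value and the registered variable name are illustrative.

```yaml
# Sketch only: run against the guest VM, not the hypervisor.
- name: Reboot the VM to activate the LVM mounts
  ansible.builtin.reboot:
    reboot_timeout: 300

- name: Confirm that /var/log is mounted from the logical volume
  ansible.builtin.command: findmnt --noheadings --output SOURCE /var/log
  register: var_log_mount
  changed_when: false
  # The device-mapper source path contains the logical volume name.
  failed_when: "'lv_var_log' not in var_log_mount.stdout"
```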
## SSH Hardening

The role implements comprehensive SSH hardening per requirements:

- **GSSAPI Authentication**: Disabled (`GSSAPIAuthentication no`)
- **GSSAPI Cleanup**: Disabled (`GSSAPICleanupCredentials no`)
- **Root Login**: Disabled via SSH (console access available)
- **Password Authentication**: Disabled (key-based only)
- **Connection Limits**: Max 3 auth tries, 10 sessions
- **Keepalive**: 300s interval with 2 max count
- **Additional Hardening**: Empty passwords rejected, X11 forwarding disabled

Configuration file: `/etc/ssh/sshd_config.d/99-security.conf`

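
Mapped onto `sshd_config` keywords, the settings listed above might look like the drop-in written by the task below. This is a sketch rather than the role's actual template: the keyword values follow the list above, while the file mode and the `validate` command are assumptions.

```yaml
# Sketch of a task that could write the drop-in; the role's own
# template may differ.
- name: Write SSH hardening drop-in
  ansible.builtin.copy:
    dest: /etc/ssh/sshd_config.d/99-security.conf
    owner: root
    group: root
    mode: "0600"
    # Ask sshd to syntax-check the file before it is moved into place.
    validate: /usr/sbin/sshd -t -f %s
    content: |
      GSSAPIAuthentication no
      GSSAPICleanupCredentials no
      PermitRootLogin no
      PasswordAuthentication no
      MaxAuthTries 3
      MaxSessions 10
      ClientAliveInterval 300
      ClientAliveCountMax 2
      PermitEmptyPasswords no
      X11Forwarding no
```

A drop-in like this only takes effect when the main `sshd_config` includes `/etc/ssh/sshd_config.d/*.conf`, and `sshd` must be reloaded after the file changes.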
## Security Features

### Debian/Ubuntu Systems

- **Firewall**: UFW enabled with SSH allowed
- **AppArmor**: Enabled and enforcing
- **Automatic Updates**: `unattended-upgrades` configured for security updates only
- **Audit**: `auditd` enabled
- **Time Sync**: `chrony` configured

### RHEL/AlmaLinux/Rocky Systems

- **Firewall**: `firewalld` enabled with SSH allowed
- **SELinux**: Enforcing mode enabled
- **Automatic Updates**: `dnf-automatic` configured for security updates
- **Audit**: `auditd` enabled
- **Time Sync**: `chronyd` configured

### Essential Packages (CLAUDE.md)

All VMs include:

- System tools: `vim`, `htop`, `tmux`, `jq`, `bc`
- Network tools: `curl`, `wget`, `rsync`
- Development: `git`, `python3`, `python3-pip`
- Security: `aide`, `auditd`, `chrony`
- Storage: `lvm2`, `parted`

## Validation

Post-deployment validation includes:

- VM running status check
- IP address assignment verification
- SSH connectivity test
- System information gathering
- LVM configuration verification (if enabled)

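
The first and third checks could be expressed roughly as follows when run from the hypervisor. This is a sketch, not the role's implementation: it assumes the `community.libvirt` collection is installed, and `vm_ip` is a hypothetical variable holding the address discovered during deployment.

```yaml
# Sketch only; `vm_ip` and `vm_state` are illustrative names.
- name: Check that the VM is running
  community.libvirt.virt:
    name: "{{ deploy_linux_vm_name }}"
    command: status
  register: vm_state
  failed_when: vm_state.status | default('') != 'running'

- name: Wait for SSH to accept connections
  ansible.builtin.wait_for:
    host: "{{ vm_ip }}"
    port: 22
    timeout: 300
```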
## Troubleshooting

### Cloud-Init Issues

```bash
# Check cloud-init status
ssh ansible@<VM_IP> "cloud-init status --wait"

# View cloud-init logs
ssh ansible@<VM_IP> "tail -f /var/log/cloud-init-output.log"
```

### LVM Issues

```bash
# Check LVM status on VM
ssh ansible@<VM_IP> "sudo vgs && sudo lvs && sudo pvs"

# Verify fstab
ssh ansible@<VM_IP> "cat /etc/fstab"

# Check disk layout
ssh ansible@<VM_IP> "lsblk"
```

### SSH Connection Issues

```bash
# Test SSH with ProxyJump
ssh -J grokbox ansible@<VM_IP>

# Verify SSH configuration
ssh ansible@<VM_IP> "sudo sshd -T | grep -i gssapi"
```

### Firewall Issues

```bash
# Debian/Ubuntu
ssh ansible@<VM_IP> "sudo ufw status verbose"

# RHEL/AlmaLinux
ssh ansible@<VM_IP> "sudo firewall-cmd --list-all"
```

## File Locations

On deployed VMs:

- SSH Security Config: `/etc/ssh/sshd_config.d/99-security.conf`
- Sudoers Config: `/etc/sudoers.d/ansible`
- Cloud-Init Log: `/var/log/cloud-init-output.log`
- Fstab: `/etc/fstab` (updated with LVM mounts)

On hypervisor:

- Cloud Images: `/var/lib/libvirt/images/*.qcow2`
- VM Disks: `/var/lib/libvirt/images/<vm_name>.qcow2`
- LVM Disk: `/var/lib/libvirt/images/<vm_name>-lvm.qcow2`
- Cloud-Init ISO: `/var/lib/libvirt/images/<vm_name>-cloud-init.iso`

## License

MIT

## Author

Infrastructure Team

## Support

- Documentation: `docs/linux-vm-deployment.md`
- Cheatsheet: `cheatsheets/deploy-linux-vm.md`
- Guidelines: `CLAUDE.md`