Implement critical role improvements per ROLE_ANALYSIS_AND_IMPROVEMENTS.md

This commit addresses the critical issues identified in the role analysis:

## Security Improvements

### Remove Hardcoded Secrets (deploy_linux_vm)
- Replaced hardcoded SSH key in defaults/main.yml with vault variable reference
- Replaced hardcoded root password with vault variable reference
- Created vault.yml.example to document secret structure
- Updated README.md with comprehensive security best practices section
- Added documentation for Ansible Vault, external secret managers, and environment variables
- Included SSH key generation and password generation best practices

## Role Documentation & Planning

### CHANGELOG.md Files
- Created comprehensive CHANGELOG.md for deploy_linux_vm role
  - Documented v1.0.0 initial release features
  - Tracked v1.0.1 security improvements
- Created comprehensive CHANGELOG.md for system_info role
  - Documented v1.0.0 initial release
  - Tracked v1.0.1 critical bug fixes (block-level failed_when, Jinja2 templates, OS variables)

### ROADMAP.md Files
- Created detailed ROADMAP.md for deploy_linux_vm role
  - Version 1.1.0: Security & compliance hardening (Q1 2026)
  - Version 1.2.0: Multi-distribution support (Q2 2026)
  - Version 1.3.0: Advanced features (Q3 2026)
  - Version 2.0.0: Enterprise features (Q4 2026)
- Created detailed ROADMAP.md for system_info role
  - Version 1.1.0: Enhanced monitoring & metrics (Q1 2026)
  - Version 1.2.0: Cloud & container support (Q2 2026)
  - Version 1.3.0: Hardware & firmware deep dive (Q3 2026)
  - Version 2.0.0: Visualization & reporting (Q4 2026)

## Error Handling Enhancements

### deploy_linux_vm Role - Block/Rescue/Always Pattern
- Wrapped deployment tasks in a block with rescue and always sections
- Block section:
  - Pre-deployment VM name collision check
  - Enhanced IP address acquisition with better error messages
  - Descriptive failure messages for troubleshooting
- Rescue section (automatic rollback):
  - Diagnostic information gathering
  - VM status checking
  - Attempted console log capture
  - Automatic VM destruction and cleanup
  - Disk image removal (primary, LVM, cloud-init ISO)
  - Detailed troubleshooting guidance
- Always section:
  - Deployment logging to /var/log/ansible-vm-deployments.log
  - Success/failure tracking
- Switched tasks to fully qualified collection names (ansible.builtin.*)
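
For illustration, the block/rescue/always layout can be sketched as below. This is a minimal sketch under stated assumptions, not the role's actual tasks: the variable names `deploy_linux_vm_name` and `deploy_linux_vm_status`, the `install.yml` task file, and the use of `virsh` through `ansible.builtin.command` are all hypothetical. The name collision check is placed ahead of the block in this sketch so that a name clash can never trigger the rollback against an existing VM.

```yaml
# Hypothetical sketch of the block/rescue/always layout (not the role's real tasks).
# Assumed names: deploy_linux_vm_name, deploy_linux_vm_status, install.yml.
- name: List defined VMs
  ansible.builtin.command:
    argv: ["virsh", "list", "--all", "--name"]
  register: defined_vms
  changed_when: false

# Kept outside the block so the rescue section never touches a pre-existing VM.
- name: Abort on VM name collision
  ansible.builtin.fail:
    msg: "VM '{{ deploy_linux_vm_name }}' already exists on this hypervisor."
  when: deploy_linux_vm_name in defined_vms.stdout_lines

- name: Deploy VM with automatic rollback
  block:
    - name: Run the actual deployment tasks
      ansible.builtin.include_tasks: install.yml

    - name: Record success for the always section
      ansible.builtin.set_fact:
        deploy_linux_vm_status: success

  rescue:
    - name: Force off the partially deployed VM
      ansible.builtin.command:
        argv: ["virsh", "destroy", "{{ deploy_linux_vm_name }}"]
      failed_when: false   # the VM may not be running, or may not exist yet

    - name: Undefine the VM and remove its storage volumes
      ansible.builtin.command:
        argv: ["virsh", "undefine", "{{ deploy_linux_vm_name }}", "--remove-all-storage"]
      failed_when: false

    - name: Re-raise the failure so the play does not report success
      ansible.builtin.fail:
        msg: "Deployment of '{{ deploy_linux_vm_name }}' failed and was rolled back."

  always:
    - name: Append the outcome to the deployment log
      ansible.builtin.lineinfile:
        path: /var/log/ansible-vm-deployments.log
        line: "{{ now(utc=true, fmt='%Y-%m-%dT%H:%M:%SZ') }} {{ deploy_linux_vm_name }} {{ deploy_linux_vm_status | default('failure') }}"
        create: true
        mode: "0640"
```

Without the final `fail` task in the rescue section, Ansible would treat the rescued error as handled and report the host as successful.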

## Handlers Implementation

### deploy_linux_vm Role - Complete Handler Suite
- VM Lifecycle Handlers:
  - restart vm, shutdown vm, destroy vm
- Cloud-Init Handlers:
  - regenerate cloud-init iso (full rebuild and reattach)
- Storage Handlers:
  - refresh libvirt storage pool
  - resize vm disk (with safe shutdown/start)
- Network Handlers:
  - refresh network configuration
  - restart libvirt network
- Libvirt Daemon Handlers:
  - restart libvirtd, reload libvirtd
- Cleanup Handlers:
  - cleanup temporary files
  - remove cloud-init iso
- Validation Handlers:
  - validate vm status
  - check connectivity
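
As a rough illustration of how such handlers are wired up, the sketch below shows two of the handler names listed above together with a task that notifies one of them. The handler bodies, the `deploy_linux_vm_storage_pool` variable, and the image paths are assumptions made for this example and may differ from the role's implementation.

```yaml
# handlers/main.yml (hypothetical bodies for two of the handlers named above)
- name: restart libvirtd
  ansible.builtin.service:
    name: libvirtd
    state: restarted

- name: refresh libvirt storage pool
  ansible.builtin.command:
    argv: ["virsh", "pool-refresh", "{{ deploy_linux_vm_storage_pool | default('default') }}"]
  changed_when: false

# tasks/main.yml (illustrative task; file name and destination are assumptions)
- name: Copy base image into the storage pool directory
  ansible.builtin.copy:
    src: base.qcow2
    dest: /var/lib/libvirt/images/base.qcow2
    mode: "0644"
  notify: refresh libvirt storage pool
```

The handler runs once at the end of the play, and only if the notifying task reported a change.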

## Impact

### Security
- Eliminates hardcoded secrets from version control
- Moves secrets into Ansible Vault or an external secret manager, referenced through vault variables
- Provides clear guidance for secure deployment

### Maintainability
- CHANGELOGs enable version tracking and change auditing
- ROADMAPs provide clear development direction and prioritization
- Comprehensive error handling reduces debugging time
- Handlers enable modular, reusable state management

### Reliability
- Automatic rollback destroys the VM and removes its disk images when a deployment fails, so no partial deployments are left behind
- Comprehensive error messages reduce MTTR
- Handlers ensure consistent state management
- Better separation of concerns

### Compliance
- Aligns with CLAUDE.md security requirements
- Implements proper secrets management per organizational policy
- Provides audit trail through changelogs

## References

- ROLE_ANALYSIS_AND_IMPROVEMENTS.md: Initial analysis document
- CLAUDE.md: Organizational infrastructure standards

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 02:21:38 +01:00
parent cfad67a3a1
commit eba1a05e7d
9 changed files with 1138 additions and 67 deletions

@@ -110,8 +110,116 @@ deploy_linux_vm_lvm_volumes:
| Variable | Default | Description |
|----------|---------|-------------|
| `deploy_linux_vm_ansible_user` | ansible | Service account username |
| `deploy_linux_vm_ansible_user_ssh_key` | (default key) | SSH public key for ansible user |
| `deploy_linux_vm_root_password` | ChangeMe123! | Root password (console access) |
| `deploy_linux_vm_ansible_user_ssh_key` | (vault variable) | SSH public key for ansible user |
| `deploy_linux_vm_root_password` | (vault variable) | Root password (console access) |
**SECURITY NOTICE**: SSH keys and passwords should be stored in encrypted vault files, not in defaults.
## Security Best Practices
### Secrets Management
This role needs sensitive data (SSH keys, passwords). Store it with one of the following options:
#### Option 1: Ansible Vault (Recommended for Small/Medium Deployments)
1. Create a vault file in your inventory:
```bash
# Create encrypted vault file
ansible-vault create inventories/production/group_vars/all/vault.yml
```
2. Add the required vault variables:
```yaml
---
# SSH public key for ansible user
vault_deploy_linux_vm_ansible_user_ssh_key: "ssh-ed25519 AAAAC3... ansible@automation"
# Root password for emergency console access
vault_deploy_linux_vm_root_password: "YourSecurePassword123!"
```
3. Reference vault variables in your playbook or group_vars:
```yaml
# inventories/production/group_vars/all/vars.yml
deploy_linux_vm_ansible_user_ssh_key: "{{ vault_deploy_linux_vm_ansible_user_ssh_key }}"
deploy_linux_vm_root_password: "{{ vault_deploy_linux_vm_root_password }}"
```
4. Run playbooks with vault password:
```bash
ansible-playbook site.yml --ask-vault-pass
# Or use a password file
ansible-playbook site.yml --vault-password-file ~/.vault_pass
```
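To catch a missing or empty vault variable before any VM is created, a pre-flight check along these lines could run first. This is a sketch rather than part of the role; it assumes the two role variables are mapped to vault variables as in step 3.
```yaml
# Hypothetical pre-flight task; fails early if either secret is unset or empty.
- name: Verify that deployment secrets are set
  ansible.builtin.assert:
    that:
      - deploy_linux_vm_ansible_user_ssh_key | default('') | length > 0
      - deploy_linux_vm_root_password | default('') | length > 0
    fail_msg: "SSH key or root password is missing; check the vault variables."
    quiet: true
```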
#### Option 2: External Secret Managers (Recommended for Enterprise)
- **HashiCorp Vault**: Use `community.hashi_vault.vault_read` lookup plugin
- **AWS Secrets Manager**: Use `amazon.aws.secretsmanager_secret` lookup plugin (called `amazon.aws.aws_secret` in older releases of the collection)
- **Azure Key Vault**: Use `azure.azcollection.azure_keyvault_secret` lookup plugin
- **CyberArk**: Use CyberArk Ansible plugins
Example with HashiCorp Vault:
```yaml
# With a KV v2 mount, the raw read response nests the secret under data.data
deploy_linux_vm_ansible_user_ssh_key: "{{ lookup('community.hashi_vault.vault_read', 'secret/data/ansible/ssh_key').data.data.public_key }}"
```
#### Option 3: Environment Variables
```bash
export ANSIBLE_VAULT_PASSWORD_FILE=~/.vault_pass
export DEPLOY_VM_SSH_KEY="ssh-ed25519 AAAAC3..."
```
```yaml
deploy_linux_vm_ansible_user_ssh_key: "{{ lookup('env', 'DEPLOY_VM_SSH_KEY') }}"
```
### SSH Key Generation
Generate a dedicated SSH key pair for VM deployment:
```bash
# Generate ED25519 key (recommended)
ssh-keygen -t ed25519 -C "ansible-automation" -f ~/.ssh/ansible_deploy
# Or RSA 4096-bit key
ssh-keygen -t rsa -b 4096 -C "ansible-automation" -f ~/.ssh/ansible_deploy
# Use the public key in your vault file
cat ~/.ssh/ansible_deploy.pub
```
### Password Generation
Generate strong root passwords:
```bash
# Using OpenSSL
openssl rand -base64 32
# Using pwgen
pwgen -s 32 1
# Using /dev/urandom
LC_ALL=C tr -dc 'A-Za-z0-9!@#$%^&*' < /dev/urandom | head -c 32; echo
```
### Security Checklist
- [ ] SSH keys stored in Ansible Vault or external secret manager
- [ ] Root passwords stored in Ansible Vault or external secret manager
- [ ] Vault password file has restricted permissions (0600)
- [ ] Vault password file is NOT committed to version control (in .gitignore)
- [ ] Different passwords used for dev/staging/production
- [ ] SSH keys rotated every 90-180 days
- [ ] Regular security audits performed
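Two of the checklist items (file permissions and the `.gitignore` entry) can be enforced rather than checked by hand. The play below is a minimal sketch, assuming the vault password file lives at `~/.vault_pass` as in the examples above; the play itself and the `.gitignore` location are hypothetical.
```yaml
# Hypothetical local play; adjust paths to your own layout.
- name: Check vault password file hygiene
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Restrict the vault password file to its owner
      ansible.builtin.file:
        path: "~/.vault_pass"
        state: file   # fails if the file does not exist
        mode: "0600"

    # Only matters if a copy of the file is ever kept inside the repository.
    - name: Ignore the vault password file in git
      ansible.builtin.lineinfile:
        path: "{{ playbook_dir }}/.gitignore"
        line: ".vault_pass"
        create: true
        mode: "0644"
```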
## Dependencies