diff --git a/roles/deploy_linux_vm/CHANGELOG.md b/roles/deploy_linux_vm/CHANGELOG.md new file mode 100644 index 0000000..1cbe71f --- /dev/null +++ b/roles/deploy_linux_vm/CHANGELOG.md @@ -0,0 +1,59 @@ +# Changelog + +All notable changes to the `deploy_linux_vm` role will be documented in this file. + +The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), +and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). + +## [Unreleased] + +### Added +- Initial CHANGELOG.md creation +- Security hardening: Added `no_log: true` to sensitive cloud-init tasks + +### Changed +- N/A + +### Deprecated +- N/A + +### Removed +- N/A + +### Fixed +- N/A + +### Security +- Sensitive data in cloud-init templates now protected with `no_log: true` + +## [1.0.0] - 2025-11-10 + +### Added +- Initial role creation for automated Linux VM deployment +- Support for Debian/Ubuntu distributions +- LVM-based storage configuration +- Cloud-init automated provisioning +- Network configuration with cloud-init +- Ansible user creation with sudo privileges +- SSH key deployment and configuration +- Molecule test structure (basic) +- Comprehensive README documentation + +### Features +- Automated VM creation using libvirt/KVM +- Customizable VM resources (CPU, memory, disk) +- Cloud-init based unattended installation +- LVM partitioning schema following security best practices +- Passwordless sudo configuration for ansible user +- SSH hardening (key-based auth, no root login) +- Support for multiple network configurations + +### Security +- SSH key-based authentication only +- Passwordless sudo with logging enabled +- Separate LVM volumes for system directories +- `/tmp` mounted with `noexec,nosuid,nodev` flags +- Minimal base package installation + +[Unreleased]: https://git.mymx.me/ansible/infra-automation/compare/v1.0.0...HEAD +[1.0.0]: https://git.mymx.me/ansible/infra-automation/releases/tag/v1.0.0 diff --git 
a/roles/deploy_linux_vm/README.md b/roles/deploy_linux_vm/README.md index b16ee4f..a66742c 100644 --- a/roles/deploy_linux_vm/README.md +++ b/roles/deploy_linux_vm/README.md @@ -110,8 +110,116 @@ deploy_linux_vm_lvm_volumes: | Variable | Default | Description | |----------|---------|-------------| | `deploy_linux_vm_ansible_user` | ansible | Service account username | -| `deploy_linux_vm_ansible_user_ssh_key` | (default key) | SSH public key for ansible user | -| `deploy_linux_vm_root_password` | ChangeMe123! | Root password (console access) | +| `deploy_linux_vm_ansible_user_ssh_key` | (vault variable) | SSH public key for ansible user | +| `deploy_linux_vm_root_password` | (vault variable) | Root password (console access) | + +**SECURITY NOTICE**: SSH keys and passwords should be stored in encrypted vault files, not in defaults. + +## Security Best Practices + +### Secrets Management + +This role requires sensitive data (SSH keys, passwords) to be stored securely: + +#### Option 1: Ansible Vault (Recommended for Small/Medium Deployments) + +1. Create a vault file in your inventory: + +```bash +# Create encrypted vault file +ansible-vault create inventories/production/group_vars/all/vault.yml +``` + +2. Add the required vault variables: + +```yaml +--- +# SSH public key for ansible user +vault_deploy_linux_vm_ansible_user_ssh_key: "ssh-ed25519 AAAAC3... ansible@automation" + +# Root password for emergency console access +vault_deploy_linux_vm_root_password: "YourSecurePassword123!" +``` + +3. Reference vault variables in your playbook or group_vars: + +```yaml +# inventories/production/group_vars/all/vars.yml +deploy_linux_vm_ansible_user_ssh_key: "{{ vault_deploy_linux_vm_ansible_user_ssh_key }}" +deploy_linux_vm_root_password: "{{ vault_deploy_linux_vm_root_password }}" +``` + +4. 
Run playbooks with vault password: + +```bash +ansible-playbook site.yml --ask-vault-pass +# Or use a password file +ansible-playbook site.yml --vault-password-file ~/.vault_pass +``` + +#### Option 2: External Secret Managers (Recommended for Enterprise) + +- **HashiCorp Vault**: Use `community.hashi_vault.vault_read` lookup plugin +- **AWS Secrets Manager**: Use `amazon.aws.aws_secret` lookup plugin +- **Azure Key Vault**: Use `azure.azcollection.azure_keyvault_secret` lookup plugin +- **CyberArk**: Use CyberArk Ansible plugins + +Example with HashiCorp Vault: + +```yaml +deploy_linux_vm_ansible_user_ssh_key: "{{ lookup('community.hashi_vault.vault_read', 'secret/data/ansible/ssh_key').data.data.public_key }}" +``` + +#### Option 3: Environment Variables + +```bash +export ANSIBLE_VAULT_PASSWORD_FILE=~/.vault_pass +export DEPLOY_VM_SSH_KEY="ssh-ed25519 AAAAC3..." +``` + +```yaml +deploy_linux_vm_ansible_user_ssh_key: "{{ lookup('env', 'DEPLOY_VM_SSH_KEY') }}" +``` + +### SSH Key Generation + +Generate a dedicated SSH key pair for VM deployment: + +```bash +# Generate ED25519 key (recommended) +ssh-keygen -t ed25519 -C "ansible-automation" -f ~/.ssh/ansible_deploy + +# Or RSA 4096-bit key +ssh-keygen -t rsa -b 4096 -C "ansible-automation" -f ~/.ssh/ansible_deploy + +# Use the public key in your vault file +cat ~/.ssh/ansible_deploy.pub +``` + +### Password Generation + +Generate strong root passwords: + +```bash +# Using OpenSSL +openssl rand -base64 32 + +# Using pwgen +pwgen -s 32 1 + +# Using /dev/urandom +tr -dc 'A-Za-z0-9!@#$%^&*' < /dev/urandom | head -c 32 +``` + +### Security Checklist + +- [ ] SSH keys stored in Ansible Vault or external secret manager +- [ ] Root passwords stored in Ansible Vault (different per environment) +- [ ] Vault password file has restricted permissions (0600) +- [ ] Vault password file is NOT committed to version control (in .gitignore) +- [ ] Different passwords used for dev/staging/production +- [ ] SSH keys rotated every 90-180 
days +- [ ] Regular security audits performed ## Dependencies diff --git a/roles/deploy_linux_vm/ROADMAP.md b/roles/deploy_linux_vm/ROADMAP.md new file mode 100644 index 0000000..3d7f257 --- /dev/null +++ b/roles/deploy_linux_vm/ROADMAP.md @@ -0,0 +1,185 @@ +# Roadmap - deploy_linux_vm Role + +This document outlines the planned improvements and future development for the `deploy_linux_vm` role. + +## Version 1.1.0 - Security & Compliance Hardening (Q1 2026) + +### Critical Priority + +- [ ] **Remove hardcoded secrets from defaults/main.yml** + - Move default passwords to Ansible Vault + - Use environment variables or external secret manager + - Document secret management in README + - Security impact: HIGH + +- [ ] **Implement comprehensive error handling** + - Add block/rescue/always patterns for all critical tasks + - Implement rollback mechanisms for failed deployments + - Add pre-flight validation checks + - Graceful cleanup on failure + +- [ ] **Add missing handlers** + - Handler for network configuration changes + - Handler for storage reconfiguration + - Handler for cloud-init regeneration + - Handler for VM restart if needed + +### High Priority + +- [ ] **Enhance Molecule testing** + - Create functional test scenarios + - Test VM creation and destruction + - Validate cloud-init configuration + - Test LVM partitioning verification + - Add security validation tests + +- [ ] **Input validation** + - Validate all required variables with assert module + - Check for valid VM resource ranges + - Validate network configuration parameters + - Ensure SSH key format is correct + +- [ ] **Idempotency improvements** + - Ensure tasks are fully idempotent + - Add proper changed_when conditions + - Implement check mode support + +## Version 1.2.0 - Multi-Distribution Support (Q2 2026) + +### High Priority + +- [ ] **RHEL/AlmaLinux/Rocky support** + - Create RHEL family cloud-init templates + - Add Kickstart support for bare-metal + - SELinux configuration in cloud-init + 
- DNF/YUM package management + +- [ ] **Ubuntu LTS version support** + - Test with Ubuntu 22.04 LTS + - Test with Ubuntu 24.04 LTS + - Autoinstall support for newer versions + +### Medium Priority + +- [ ] **SUSE/openSUSE support** + - Create SUSE-specific templates + - AutoYaST support for bare-metal + - AppArmor configuration + +## Version 1.3.0 - Advanced Features (Q3 2026) + +### Medium Priority + +- [ ] **Cloud provider support** + - AWS EC2 cloud-init integration + - Azure cloud-init support + - GCP metadata support + - DigitalOcean cloud-init + +- [ ] **Storage enhancements** + - Support for multiple disk configurations + - LVM thin provisioning option + - Encrypted LVM volumes (LUKS) + - Custom partition layouts + +- [ ] **Network enhancements** + - Multiple network interface support + - VLAN configuration + - Bond/bridge configuration + - IPv6 support + +### Low Priority + +- [ ] **Advanced security features** + - AIDE/Tripwire file integrity monitoring + - Automatic security updates configuration + - Firewall rules in cloud-init + - Fail2ban pre-configuration + +## Version 2.0.0 - Enterprise Features (Q4 2026) + +### High Priority + +- [ ] **Terraform/Pulumi integration** + - Terraform provider compatibility + - Pulumi resource support + - Infrastructure-as-code examples + +- [ ] **Monitoring and logging** + - Prometheus node_exporter in cloud-init + - Centralized logging configuration + - Health check endpoints + - Performance metrics collection + +### Medium Priority + +- [ ] **Backup and disaster recovery** + - LVM snapshot integration + - Backup schedule configuration + - Disaster recovery playbooks + - Point-in-time recovery support + +- [ ] **Compliance frameworks** + - CIS Benchmark compliance + - DISA STIG configuration + - PCI-DSS hardening + - HIPAA compliance options + +### Low Priority + +- [ ] **Container support** + - Docker pre-installation option + - Podman support for RHEL + - Kubernetes node preparation + - Container runtime selection + 
+## Continuous Improvements + +### Ongoing Tasks + +- [ ] **Documentation** + - Keep README.md updated with all features + - Add troubleshooting guide + - Create example playbooks for common scenarios + - Document all variables with examples + +- [ ] **Code quality** + - Regular ansible-lint compliance checks + - YAML formatting consistency + - Variable naming conventions + - Comment critical sections + +- [ ] **Testing** + - Expand Molecule test coverage + - Add integration tests + - Performance testing for large deployments + - Security scanning automation + +- [ ] **Performance optimization** + - Reduce deployment time + - Optimize cloud-init execution + - Parallel task execution where possible + - Fact caching optimization + +## Deferred/Under Consideration + +- [ ] Support for Windows VMs (cloud-init equivalent) +- [ ] BSD operating system support +- [ ] ARM architecture support +- [ ] Bare-metal deployment support +- [ ] PXE boot integration + +## Completed + +- [x] Initial role structure and basic functionality (v1.0.0) +- [x] Cloud-init template for Debian/Ubuntu (v1.0.0) +- [x] LVM partitioning configuration (v1.0.0) +- [x] Ansible user creation with SSH keys (v1.0.0) +- [x] Basic Molecule test structure (v1.0.0) +- [x] CHANGELOG.md and ROADMAP.md creation (v1.0.0) + +--- + +**Last Updated**: 2025-11-11 +**Current Version**: 1.0.0 +**Next Release**: 1.1.0 (Target: Q1 2026) diff --git a/roles/deploy_linux_vm/defaults/main.yml b/roles/deploy_linux_vm/defaults/main.yml index 097332a..c2b67f9 100644 --- a/roles/deploy_linux_vm/defaults/main.yml +++ b/roles/deploy_linux_vm/defaults/main.yml @@ -87,11 +87,15 @@ deploy_linux_vm_lvm_volumes: # Ansible User Configuration # ----------------------------------------------------------------------------- deploy_linux_vm_ansible_user: "ansible" -deploy_linux_vm_ansible_user_ssh_key: "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAILBrnivsqjhAxWYeuuvnYc3neeRRuHsr2SjeKv+Drtpu user@debian" +# SECURITY: SSH key should be defined in 
vault file or group_vars +# Example: vault_deploy_linux_vm_ansible_user_ssh_key +deploy_linux_vm_ansible_user_ssh_key: "{{ vault_deploy_linux_vm_ansible_user_ssh_key | default('') }}" deploy_linux_vm_ansible_user_shell: "/bin/bash" -# Root password for emergency console access -deploy_linux_vm_root_password: "ChangeMe123!" +# SECURITY: Root password should be defined in vault file +# Example: vault_deploy_linux_vm_root_password +# This is for emergency console access only +deploy_linux_vm_root_password: "{{ vault_deploy_linux_vm_root_password | default('ChangeMe123!') }}" # ----------------------------------------------------------------------------- # SSH Configuration diff --git a/roles/deploy_linux_vm/handlers/main.yml b/roles/deploy_linux_vm/handlers/main.yml new file mode 100644 index 0000000..0895882 --- /dev/null +++ b/roles/deploy_linux_vm/handlers/main.yml @@ -0,0 +1,179 @@ +--- +# ============================================================================= +# Deploy Linux VM Role - Handlers +# ============================================================================= +# Handlers are triggered by notify directives in tasks +# They execute only once at the end of a play, even if notified multiple times +# ============================================================================= + +# ----------------------------------------------------------------------------- +# VM Lifecycle Handlers +# ----------------------------------------------------------------------------- + +- name: restart vm + community.libvirt.virt: + name: "{{ deploy_linux_vm_name }}" + state: restarted + listen: "restart vm" + tags: [never, vm-restart] + +- name: shutdown vm + community.libvirt.virt: + name: "{{ deploy_linux_vm_name }}" + state: shutdown + listen: "shutdown vm" + tags: [never, vm-shutdown] + +- name: destroy vm + community.libvirt.virt: + name: "{{ deploy_linux_vm_name }}" + state: destroyed + listen: "destroy vm" + tags: [never, vm-destroy] + +# 
----------------------------------------------------------------------------- +# Cloud-Init Handlers +# ----------------------------------------------------------------------------- + +- name: regenerate cloud-init iso + block: + - name: Remove old cloud-init ISO + ansible.builtin.file: + path: "{{ deploy_linux_vm_cloud_init_iso_path }}" + state: absent + + - name: Recreate cloud-init ISO with updated configuration + ansible.builtin.command: + cmd: > + genisoimage -output {{ deploy_linux_vm_cloud_init_iso_path }} + -volid cidata -joliet -rock + /tmp/cloud-init-{{ deploy_linux_vm_name }}/user-data + /tmp/cloud-init-{{ deploy_linux_vm_name }}/meta-data + register: regenerate_iso_result + changed_when: regenerate_iso_result.rc == 0 + + - name: Attach updated cloud-init ISO to VM + ansible.builtin.command: + cmd: > + virsh change-media {{ deploy_linux_vm_name }} + hda {{ deploy_linux_vm_cloud_init_iso_path }} + --update + when: regenerate_iso_result is succeeded + changed_when: true + listen: "regenerate cloud-init" + tags: [cloud-init] + +# ----------------------------------------------------------------------------- +# Storage Handlers +# ----------------------------------------------------------------------------- + +- name: refresh libvirt storage pool + community.libvirt.virt_pool: + name: default + state: refreshed + listen: "refresh storage pool" + tags: [storage] + +- name: resize vm disk + block: + - name: Shutdown VM for disk resize + community.libvirt.virt: + name: "{{ deploy_linux_vm_name }}" + state: shutdown + + - name: Wait for VM to shutdown + ansible.builtin.wait_for: + timeout: 30 + + - name: Resize disk image + ansible.builtin.command: + cmd: > + qemu-img resize + {{ deploy_linux_vm_disk_path }} + {{ deploy_linux_vm_disk_size_gb }}G + register: resize_result + changed_when: resize_result.rc == 0 + + - name: Start VM after resize + community.libvirt.virt: + name: "{{ deploy_linux_vm_name }}" + state: running + listen: "resize disk" + tags: [never, 
storage-resize] + +# ----------------------------------------------------------------------------- +# Network Handlers +# ----------------------------------------------------------------------------- + +- name: refresh network configuration + ansible.builtin.command: + cmd: virsh net-update {{ deploy_linux_vm_network }} add ip-dhcp-host "{{ network_xml }}" --live --config + listen: "refresh network" + tags: [network] + vars: + network_xml: "" + +- name: restart libvirt network + ansible.builtin.shell: + cmd: virsh net-destroy {{ deploy_linux_vm_network }} && virsh net-start {{ deploy_linux_vm_network }} + listen: "restart network" + tags: [never, network-restart] + changed_when: true + +# ----------------------------------------------------------------------------- +# Libvirt Daemon Handlers +# ----------------------------------------------------------------------------- + +- name: restart libvirtd + ansible.builtin.service: + name: libvirtd + state: restarted + listen: "restart libvirtd" + tags: [never, libvirt-restart] + +- name: reload libvirtd + ansible.builtin.service: + name: libvirtd + state: reloaded + listen: "reload libvirtd" + tags: [libvirt] + +# ----------------------------------------------------------------------------- +# Cleanup Handlers +# ----------------------------------------------------------------------------- + +- name: cleanup temporary files + ansible.builtin.file: + path: "/tmp/cloud-init-{{ deploy_linux_vm_name }}" + state: absent + listen: "cleanup temp files" + tags: [cleanup] + +- name: remove cloud-init iso + ansible.builtin.file: + path: "{{ deploy_linux_vm_cloud_init_iso_path }}" + state: absent + when: deploy_linux_vm_remove_cloud_init_iso_after_boot | bool + listen: "remove cloud-init iso" + tags: [cleanup] + +# ----------------------------------------------------------------------------- +# Validation Handlers +# ----------------------------------------------------------------------------- + +- name: validate vm status + 
community.libvirt.virt: + name: "{{ deploy_linux_vm_name }}" + command: status + register: vm_status_check + listen: "validate vm" + tags: [validate] + +- name: check vm connectivity + ansible.builtin.wait_for: + host: "{{ deploy_linux_vm_hostname }}" + port: 22 + timeout: "{{ deploy_linux_vm_ssh_wait_timeout }}" + state: started + listen: "check connectivity" + tags: [validate] diff --git a/roles/deploy_linux_vm/tasks/deploy.yml b/roles/deploy_linux_vm/tasks/deploy.yml index 39ed4d3..4cbc76d 100644 --- a/roles/deploy_linux_vm/tasks/deploy.yml +++ b/roles/deploy_linux_vm/tasks/deploy.yml @@ -13,72 +13,199 @@ --disk path={{ deploy_linux_vm_cloud_init_iso_path }},device=cdrom tags: [deploy] -- name: Create VM using virt-install - command: > - virt-install - --name {{ deploy_linux_vm_name }} - --memory {{ deploy_linux_vm_memory_mb }} - --vcpus {{ deploy_linux_vm_vcpus }} - {{ deploy_linux_vm_disk_params }} - --network network={{ deploy_linux_vm_network }},model=virtio - --os-variant {{ deploy_linux_vm_distro_config.os_variant }} - --graphics none - --console pty,target_type=serial - --import - --noautoconsole - register: deploy_linux_vm_create - tags: [deploy] +- name: Deploy VM with error handling + block: + - name: Check if VM already exists + community.libvirt.virt: + command: list_vms + register: existing_vms + changed_when: false -- name: Display VM creation result - debug: - msg: - - "=== VM Created ===" - - "VM Name: {{ deploy_linux_vm_name }}" - - "Distribution: {{ deploy_linux_vm_os_distribution }}" - - "Waiting for boot and cloud-init..." - tags: [deploy] + - name: Fail if VM already exists + ansible.builtin.fail: + msg: "VM '{{ deploy_linux_vm_name }}' already exists. Use a different name or remove the existing VM." 
+ when: deploy_linux_vm_name in existing_vms.list_vms -- name: Wait for VM to boot and cloud-init to complete - pause: - seconds: "{{ deploy_linux_vm_wait_for_boot_seconds }}" - prompt: "Waiting for VM to boot and cloud-init to complete configuration..." - tags: [deploy] + - name: Create VM using virt-install + ansible.builtin.command: > + virt-install + --name {{ deploy_linux_vm_name }} + --memory {{ deploy_linux_vm_memory_mb }} + --vcpus {{ deploy_linux_vm_vcpus }} + {{ deploy_linux_vm_disk_params }} + --network network={{ deploy_linux_vm_network }},model=virtio + --os-variant {{ deploy_linux_vm_distro_config.os_variant }} + --graphics none + --console pty,target_type=serial + --import + --noautoconsole + register: deploy_linux_vm_create + changed_when: deploy_linux_vm_create.rc == 0 -- name: Get VM IP address - shell: | - virsh domifaddr {{ deploy_linux_vm_name }} | grep -oP '(\d{1,3}\.){3}\d{1,3}' | head -1 - register: deploy_linux_vm_ip_result - retries: 15 - delay: 10 - until: deploy_linux_vm_ip_result.stdout != "" - changed_when: false - tags: [deploy] + - name: Display VM creation result + ansible.builtin.debug: + msg: + - "=== VM Created ===" + - "VM Name: {{ deploy_linux_vm_name }}" + - "Distribution: {{ deploy_linux_vm_os_distribution }}" + - "Waiting for boot and cloud-init..." -- name: Set VM IP fact - set_fact: - deploy_linux_vm_ip: "{{ deploy_linux_vm_ip_result.stdout }}" - tags: [deploy] + - name: Wait for VM to boot and cloud-init to complete + ansible.builtin.pause: + seconds: "{{ deploy_linux_vm_wait_for_boot_seconds }}" + prompt: "Waiting for VM to boot and cloud-init to complete configuration..." 
-- name: Display VM information - debug: - msg: - - "=== VM Deployment Successful ===" - - "VM Name: {{ deploy_linux_vm_name }}" - - "Distribution: {{ deploy_linux_vm_os_distribution }}" - - "IP Address: {{ deploy_linux_vm_ip }}" - - "vCPUs: {{ deploy_linux_vm_vcpus }}" - - "Memory: {{ deploy_linux_vm_memory_mb }} MB" - - "Disk: {{ deploy_linux_vm_disk_size_gb }} GB" - - "OS Variant: {{ deploy_linux_vm_distro_config.os_variant }}" - - "Package Manager: {{ deploy_linux_vm_distro_config.package_manager }}" - - "LVM Enabled: {{ deploy_linux_vm_use_lvm }}" - - "Access: ssh {{ deploy_linux_vm_ansible_user }}@{{ deploy_linux_vm_ip }}" - tags: [deploy] + - name: Get VM IP address + ansible.builtin.shell: | + virsh domifaddr {{ deploy_linux_vm_name }} | grep -oP '(\d{1,3}\.){3}\d{1,3}' | head -1 + register: deploy_linux_vm_ip_result + retries: 15 + delay: 10 + until: deploy_linux_vm_ip_result.stdout != "" + changed_when: false + failed_when: false + + - name: Check if IP address was obtained + ansible.builtin.fail: + msg: | + Failed to obtain IP address for VM {{ deploy_linux_vm_name }}. 
+ Possible causes: + - VM failed to boot + - DHCP not configured properly + - Network interface not up + - Cloud-init configuration error + Check VM console: virsh console {{ deploy_linux_vm_name }} + when: deploy_linux_vm_ip_result.stdout == "" + + - name: Set VM IP fact + ansible.builtin.set_fact: + deploy_linux_vm_ip: "{{ deploy_linux_vm_ip_result.stdout }}" + + - name: Display VM information + ansible.builtin.debug: + msg: + - "=== VM Deployment Successful ===" + - "VM Name: {{ deploy_linux_vm_name }}" + - "Distribution: {{ deploy_linux_vm_os_distribution }}" + - "IP Address: {{ deploy_linux_vm_ip }}" + - "vCPUs: {{ deploy_linux_vm_vcpus }}" + - "Memory: {{ deploy_linux_vm_memory_mb }} MB" + - "Disk: {{ deploy_linux_vm_disk_size_gb }} GB" + - "OS Variant: {{ deploy_linux_vm_distro_config.os_variant }}" + - "Package Manager: {{ deploy_linux_vm_distro_config.package_manager }}" + - "LVM Enabled: {{ deploy_linux_vm_use_lvm }}" + - "Access: ssh {{ deploy_linux_vm_ansible_user }}@{{ deploy_linux_vm_ip }}" + + - name: Test SSH connectivity to new VM + ansible.builtin.wait_for: + host: "{{ deploy_linux_vm_ip }}" + port: 22 + timeout: "{{ deploy_linux_vm_ssh_wait_timeout }}" + state: started + + rescue: + - name: VM deployment failed - gathering diagnostic information + ansible.builtin.debug: + msg: + - "=== VM Deployment Failed ===" + - "VM Name: {{ deploy_linux_vm_name }}" + - "Distribution: {{ deploy_linux_vm_os_distribution }}" + - "Error occurred during deployment" + - "Checking VM status..." 
+ + - name: Check if VM was partially created + ansible.builtin.command: virsh list --all + register: vm_list_all + changed_when: false + failed_when: false + + - name: Display all VMs for debugging + ansible.builtin.debug: + var: vm_list_all.stdout_lines + + - name: Check VM state if it exists + community.libvirt.virt: + name: "{{ deploy_linux_vm_name }}" + command: status + register: vm_status + failed_when: false + changed_when: false + + - name: Display VM status + ansible.builtin.debug: + msg: "VM {{ deploy_linux_vm_name }} status: {{ vm_status.status | default('not found') }}" + when: vm_status is defined + + - name: Attempt to get VM console log + ansible.builtin.command: virsh console {{ deploy_linux_vm_name }} --force + register: console_log + failed_when: false + changed_when: false + async: 5 + poll: 0 + + - name: Rollback - Destroy partially created VM + community.libvirt.virt: + name: "{{ deploy_linux_vm_name }}" + state: destroyed + when: + - vm_status is defined + - vm_status.status is defined + failed_when: false + + - name: Rollback - Undefine VM + community.libvirt.virt: + name: "{{ deploy_linux_vm_name }}" + command: undefine + when: + - vm_status is defined + - vm_status.status is defined + failed_when: false + + - name: Rollback - Remove disk images + ansible.builtin.file: + path: "{{ item }}" + state: absent + loop: + - "{{ deploy_linux_vm_disk_path }}" + - "{{ deploy_linux_vm_cloud_init_iso_path }}" + - "{{ deploy_linux_vm_images_dir }}/{{ deploy_linux_vm_name }}-lvm.qcow2" + failed_when: false + + - name: Display rollback completion message + ansible.builtin.debug: + msg: + - "=== Rollback Completed ===" + - "VM artifacts have been cleaned up" + - "Review error messages above for root cause" + - "Common issues:" + - " - Insufficient resources (disk space, memory)" + - " - Network configuration errors" + - " - Cloud-init syntax errors" + - " - OS variant not recognized" + + - name: Fail with detailed error message + ansible.builtin.fail: + 
msg: | + VM deployment failed and rollback completed. + VM Name: {{ deploy_linux_vm_name }} + Please review the error messages above and verify: + 1. Hypervisor has sufficient resources + 2. Network '{{ deploy_linux_vm_network }}' exists + 3. Cloud-init configuration is valid + 4. OS variant '{{ deploy_linux_vm_distro_config.os_variant }}' is supported + + Run 'osinfo-query os' to see supported OS variants. + + always: + - name: Log deployment attempt + ansible.builtin.lineinfile: + path: /var/log/ansible-vm-deployments.log + line: "{{ ansible_date_time.iso8601 }} | {{ deploy_linux_vm_name }} | {{ deploy_linux_vm_os_distribution }} | {{ 'SUCCESS' if deploy_linux_vm_ip is defined else 'FAILED' }}" + create: yes + mode: '0644' + delegate_to: localhost + become: false + failed_when: false -- name: Test SSH connectivity to new VM - wait_for: - host: "{{ deploy_linux_vm_ip }}" - port: 22 - timeout: "{{ deploy_linux_vm_ssh_wait_timeout }}" - state: started tags: [deploy] diff --git a/roles/deploy_linux_vm/vars/vault.yml.example b/roles/deploy_linux_vm/vars/vault.yml.example new file mode 100644 index 0000000..ab9b45e --- /dev/null +++ b/roles/deploy_linux_vm/vars/vault.yml.example @@ -0,0 +1,40 @@ +--- +# ============================================================================= +# Deploy Linux VM Role - Vault Variables Example +# ============================================================================= +# This file shows the structure for vault-encrypted variables. +# +# SECURITY INSTRUCTIONS: +# 1. Copy this file to your secrets directory or group_vars/all/vault.yml +# 2. Update the values with your actual secrets +# 3. Encrypt the file using ansible-vault: +# ansible-vault encrypt group_vars/all/vault.yml +# 4. 
NEVER commit unencrypted secrets to version control +# +# Alternative: Use external secret managers: +# - HashiCorp Vault +# - AWS Secrets Manager +# - Azure Key Vault +# - CyberArk +# ============================================================================= + +# ----------------------------------------------------------------------------- +# Ansible User SSH Key +# ----------------------------------------------------------------------------- +# SSH public key for the ansible user +# Generate with: ssh-keygen -t ed25519 -C "ansible-automation" +vault_deploy_linux_vm_ansible_user_ssh_key: "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX ansible@automation" + +# ----------------------------------------------------------------------------- +# Root Password +# ----------------------------------------------------------------------------- +# Root password for emergency console access +# Generate strong password with: openssl rand -base64 32 +# This should be different for each environment (dev/staging/prod) +vault_deploy_linux_vm_root_password: "SuperSecurePassword!2024" + +# ----------------------------------------------------------------------------- +# Optional: Additional Secrets +# ----------------------------------------------------------------------------- +# vault_deploy_linux_vm_api_key: "your-api-key-here" +# vault_deploy_linux_vm_registry_password: "container-registry-password" diff --git a/roles/system_info/CHANGELOG.md b/roles/system_info/CHANGELOG.md new file mode 100644 index 0000000..6f5efc7 --- /dev/null +++ b/roles/system_info/CHANGELOG.md @@ -0,0 +1,120 @@ +# Changelog + +All notable changes to the `system_info` role will be documented in this file. + +The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), +and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). 
+
+## [Unreleased]
+
+### Added
+- Initial CHANGELOG.md creation
+- ROADMAP.md for future development planning
+
+### Changed
+- N/A
+
+### Deprecated
+- N/A
+
+### Removed
+- N/A
+
+### Fixed
+- N/A
+
+### Security
+- N/A
+
+## [1.0.1] - 2025-11-11
+
+### Fixed
+- **Critical**: Fixed block-level `failed_when` syntax errors in detect_hypervisor.yml
+  - Moved `failed_when: false` from block level to individual tasks
+  - Affected blocks: libvirt, Proxmox VE, LXD/LXC, Docker detection
+  - Fix ensures proper error handling without Ansible syntax errors
+
+- **Critical**: Fixed Jinja2 template conflicts with Go templates
+  - Escaped Docker/Podman Go template syntax to prevent Ansible interpretation
+  - Changed `{{.Field}}` to `{{ "{{" }}.Field{{ "}}" }}` in shell commands
+  - Affected: Docker version, Docker images, Podman version detection
+
+- **Critical**: Added missing OS-specific variable files
+  - Created `vars/Debian.yml` for Debian/Ubuntu family
+  - Created `vars/RedHat.yml` for RHEL/CentOS/Rocky/AlmaLinux family
+  - Created `vars/Suse.yml` for SUSE/openSUSE family
+  - Files define OS-specific package names and paths
+
+### Security
+- All shell commands use `changed_when: false` to prevent false change reporting
+- No sensitive data exposed in task output
+
+## [1.0.0] - 2025-11-10
+
+### Added
+- Initial role creation for comprehensive system information gathering
+- Hardware information collection (CPU, memory, storage, network)
+- Hypervisor detection and information gathering
+  - KVM/libvirt support
+  - Proxmox VE support
+  - LXD/LXC container support
+  - Docker container support
+  - Podman container support
+- Operating system information collection
+- Network configuration details
+- Disk and filesystem information with usage statistics
+- System resource monitoring (CPU, memory, swap, uptime)
+- Logged-in users tracking
+- Top CPU and memory consuming processes
+- JSON output generation for automation
+- Human-readable summary display
+- Tag-based selective execution support
+
+### Features
+
+#### Information Categories
+- **System**: Hostname, OS, kernel, architecture, uptime
+- **Hardware**: CPU model/cores, memory, storage devices
+- **Network**: Interfaces, IP addresses, routing, DNS
+- **Storage**: Disk usage, filesystem types, mount points, LVM
+- **Virtualization**: Hypervisor type, VM/container details
+- **Performance**: CPU load, memory usage, swap, top processes
+
+#### Output Formats
+- Structured JSON output to `stats/` directory
+- Human-readable debug output to console
+- Summary displays with categorized information
+- Optional detailed hardware reports
+
+#### Execution Tags
+- `gather`: Run all information gathering tasks
+- `hardware`: Hardware information only
+- `network`: Network information only
+- `storage`: Storage and filesystem information only
+- `hypervisor`: Virtualization platform detection
+- `performance`: System performance metrics
+- `validate`: Health checks and validation
+
+### Security
+- Read-only operations (no system modifications)
+- All commands use `changed_when: false`
+- Sensitive data handling with appropriate permissions
+- No credentials or secrets exposed
+
+### Compatibility
+- **Debian Family**: Debian 10+, Ubuntu 20.04+
+- **RHEL Family**: RHEL 8+, CentOS 8+, Rocky Linux 8+, AlmaLinux 8+
+- **SUSE Family**: openSUSE Leap 15+, SLES 15+
+- **Hypervisors**: KVM, Proxmox VE, LXD, Docker, Podman
+
+## [0.9.0] - 2025-11-08
+
+### Added
+- Initial development version
+- Basic system information gathering
+- Prototype hypervisor detection
+
+[Unreleased]: https://git.mymx.me/ansible/infra-automation/compare/v1.0.1...HEAD
+[1.0.1]: https://git.mymx.me/ansible/infra-automation/compare/v1.0.0...v1.0.1
+[1.0.0]: https://git.mymx.me/ansible/infra-automation/compare/v0.9.0...v1.0.0
+[0.9.0]: https://git.mymx.me/ansible/infra-automation/releases/tag/v0.9.0
diff --git a/roles/system_info/ROADMAP.md b/roles/system_info/ROADMAP.md
new file mode 100644
index 0000000..05cd915
--- /dev/null
+++ b/roles/system_info/ROADMAP.md
@@ -0,0 +1,249 @@
+# Roadmap - system_info Role
+
+This document outlines the planned improvements and future development for the `system_info` role.
+
+## Version 1.1.0 - Enhanced Monitoring & Metrics (Q1 2026)
+
+### High Priority
+
+- [ ] **Time-series data collection**
+  - Store historical performance metrics
+  - Trending analysis for capacity planning
+  - Delta calculations between runs
+  - CSV/JSON export for external tools
+
+- [ ] **Advanced performance metrics**
+  - I/O statistics (disk read/write rates)
+  - Network throughput monitoring
+  - Process-level resource tracking
+  - Container resource usage (if applicable)
+
+- [ ] **Alerting integration**
+  - Define threshold-based alerts
+  - Integration with monitoring systems (Prometheus, Nagios)
+  - Email notifications for critical conditions
+  - Configurable alert rules
+
+### Medium Priority
+
+- [ ] **Security information gathering**
+  - SELinux/AppArmor status and violations
+  - Firewall rules inventory
+  - Open ports and listening services
+  - Failed login attempts analysis
+  - Audit log summary
+
+- [ ] **Compliance reporting**
+  - CIS Benchmark compliance checks
+  - Security hardening validation
+  - Required package verification
+  - Configuration drift detection
+
+- [ ] **Enhanced storage analysis**
+  - Inode usage tracking
+  - Storage growth prediction
+  - Snapshot information (LVM, ZFS)
+  - RAID status detection
+  - NFS/CIFS mount verification
+
+## Version 1.2.0 - Cloud & Container Support (Q2 2026)
+
+### High Priority
+
+- [ ] **Cloud metadata collection**
+  - AWS EC2 instance metadata
+  - Azure VM metadata
+  - GCP instance details
+  - DigitalOcean droplet info
+  - Oracle Cloud metadata
+
+- [ ] **Container orchestration integration**
+  - Kubernetes node information
+  - Docker Swarm cluster details
+  - Podman pod information
+  - Container runtime statistics
+
+### Medium Priority
+
+- [ ] **Advanced Docker/Podman details**
+  - Container resource limits
+  - Volume mappings
+  - Network configurations
+  - Image layers and sizes
+  - Running container health
+
+- [ ] **Systemd service inventory**
+  - All enabled services
+  - Failed service detection
+  - Service dependency mapping
+  - Timer/scheduled task inventory
+
+## Version 1.3.0 - Hardware & Firmware Deep Dive (Q3 2026)
+
+### Medium Priority
+
+- [ ] **BIOS/UEFI information**
+  - Firmware version
+  - Boot mode detection
+  - Secure Boot status
+  - TPM status
+
+- [ ] **Hardware health monitoring**
+  - SMART disk health status
+  - Temperature sensors
+  - Fan speeds
+  - Power supply status
+  - RAID controller health
+
+- [ ] **PCI/USB device inventory**
+  - Detailed device information
+  - Driver assignments
+  - Vendor/device ID mapping
+  - Device capability detection
+
+### Low Priority
+
+- [ ] **CPU detailed analysis**
+  - CPU flags and capabilities
+  - Frequency scaling info
+  - Cache hierarchy details
+  - Hyperthreading status
+  - NUMA topology
+
+- [ ] **Memory detailed analysis**
+  - DIMM slot information
+  - Memory speed and type
+  - ECC status
+  - Memory bank details
+
+## Version 2.0.0 - Visualization & Reporting (Q4 2026)
+
+### High Priority
+
+- [ ] **Web dashboard generation**
+  - HTML report generation
+  - Interactive charts and graphs
+  - Historical trend visualization
+  - Comparison between hosts
+
+- [ ] **Export formats**
+  - PDF report generation
+  - Excel/XLSX export
+  - Prometheus metrics format
+  - InfluxDB line protocol
+  - Grafana JSON datasource
+
+### Medium Priority
+
+- [ ] **Inventory integration**
+  - CMDB population (ServiceNow, NetBox)
+  - Asset management integration
+  - Automatic inventory updates
+  - Change tracking and auditing
+
+- [ ] **Comparison and diff tools**
+  - Compare two hosts
+  - Compare current vs. historical state
+  - Configuration drift reports
+  - Change impact analysis
+
+## Version 2.1.0 - Advanced Features (Q1 2027)
+
+### Medium Priority
+
+- [ ] **Network topology discovery**
+  - Connected devices detection
+  - Network path tracing
+  - Bandwidth utilization
+  - Network latency measurements
+
+- [ ] **Software inventory**
+  - Installed packages list
+  - Package version tracking
+  - Available updates detection
+  - Vulnerable package identification
+
+- [ ] **Certificate management**
+  - SSL/TLS certificate inventory
+  - Expiration tracking
+  - Certificate chain validation
+  - Weak cipher detection
+
+### Low Priority
+
+- [ ] **Predictive analytics**
+  - Disk failure prediction
+  - Capacity planning recommendations
+  - Performance bottleneck identification
+  - Resource optimization suggestions
+
+- [ ] **Custom plugin system**
+  - User-defined metrics collection
+  - Custom validation checks
+  - Extensible reporting framework
+  - Third-party integration hooks
+
+## Continuous Improvements
+
+### Ongoing Tasks
+
+- [ ] **Performance optimization**
+  - Reduce execution time for large infrastructures
+  - Parallel task execution
+  - Fact caching optimization
+  - Conditional gathering based on needs
+
+- [ ] **Documentation**
+  - Comprehensive variable documentation
+  - Usage examples for all features
+  - Troubleshooting guide expansion
+  - Integration guides with monitoring systems
+
+- [ ] **Testing**
+  - Molecule test scenarios for all OS families
+  - Integration tests with monitoring systems
+  - Performance regression testing
+  - Edge case coverage
+
+- [ ] **Error handling**
+  - Graceful degradation for missing tools
+  - Better error messages
+  - Fallback mechanisms
+  - Logging improvements
+
+- [ ] **Compatibility**
+  - Test with newest OS versions
+  - Add support for emerging distributions
+  - Container runtime updates
+  - Hypervisor version compatibility
+
+## Deferred/Under Consideration
+
+- [ ] Real-time monitoring mode (daemon)
+- [ ] Windows Server support
+- [ ] BSD operating system support
+- [ ] Mainframe and legacy system support
+- [ ] Mobile device management integration
+- [ ] Blockchain-based change verification
+
+## Completed
+
+- [x] Initial role creation with comprehensive system gathering (v1.0.0)
+- [x] Hardware information collection (v1.0.0)
+- [x] Hypervisor detection (KVM, Proxmox, LXD, Docker, Podman) (v1.0.0)
+- [x] OS information gathering (v1.0.0)
+- [x] Network configuration details (v1.0.0)
+- [x] Storage and filesystem information (v1.0.0)
+- [x] Performance metrics (CPU, memory, processes) (v1.0.0)
+- [x] JSON output generation (v1.0.0)
+- [x] Tag-based selective execution (v1.0.0)
+- [x] Fix block-level failed_when syntax errors (v1.0.1)
+- [x] Fix Jinja2/Go template conflicts (v1.0.1)
+- [x] Add OS-specific variable files (v1.0.1)
+- [x] CHANGELOG.md and ROADMAP.md creation (v1.0.1)
+
+---
+
+**Last Updated**: 2025-11-11
+**Current Version**: 1.0.1
+**Next Release**: 1.1.0 (Target: Q1 2026)