Add comprehensive documentation
- Add linux-vm-deployment.md with complete deployment guide
- Architecture overview and security model
- Supported distributions matrix
- LVM partitioning specifications
- Distribution-specific configurations
- Troubleshooting procedures
- Performance tuning guidelines
728
docs/debian12-vm-deployment.md
Normal file
@@ -0,0 +1,728 @@
|
||||
# Debian 12 VM Deployment Documentation
|
||||
|
||||
## Overview
|
||||
|
||||
This document describes the automated deployment process for Debian 12 virtual machines on the grokbox KVM/libvirt hypervisor. The deployment uses cloud-init for unattended configuration and follows the security-first principles outlined in CLAUDE.md.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [Architecture](#architecture)
|
||||
2. [Prerequisites](#prerequisites)
|
||||
3. [Deployment Process](#deployment-process)
|
||||
4. [Configuration](#configuration)
|
||||
5. [Security Features](#security-features)
|
||||
6. [Post-Deployment](#post-deployment)
|
||||
7. [Troubleshooting](#troubleshooting)
|
||||
8. [Maintenance](#maintenance)
|
||||
|
||||
## Architecture
|
||||
|
||||
### Infrastructure Components
|
||||
|
||||
```
┌─────────────────────────────────────────────┐
│         grokbox (KVM Hypervisor)            │
│                                             │
│  ┌──────────────────────────────────────┐   │
│  │             libvirt/QEMU             │   │
│  │                                      │   │
│  │  ┌────────────────────────────────┐  │   │
│  │  │       Debian 12 Guest VM       │  │   │
│  │  │                                │  │   │
│  │  │  - 2 vCPUs / 2GB RAM           │  │   │
│  │  │  - 20GB qcow2 disk             │  │   │
│  │  │  - cloud-init configured       │  │   │
│  │  │  - ansible user ready          │  │   │
│  │  │  - Security hardened           │  │   │
│  │  └────────────────────────────────┘  │   │
│  │                                      │   │
│  │  Network: virbr0 (192.168.122.0/24)  │   │
│  └──────────────────────────────────────┘   │
└─────────────────────────────────────────────┘
```
|
||||
|
||||
### Deployment Workflow
|
||||
|
||||
```
[Ansible Control Node]
    │
    │ 1. SSH to grokbox
    ▼
[grokbox hypervisor]
    │
    │ 2. Download Debian cloud image
    ├─ 3. Verify checksums
    ├─ 4. Create VM disk (qcow2)
    ├─ 5. Generate cloud-init ISO
    ├─ 6. Create VM with virt-install
    │
    │ 7. VM boots with cloud-init
    ▼
[Debian 12 VM]
    │
    ├─ 8. Create ansible user
    ├─ 9. Configure SSH
    ├─ 10. Install packages
    ├─ 11. Security hardening
    └─ 12. System ready
```
|
||||
|
||||
## Prerequisites
|
||||
|
||||
### Hypervisor Requirements
|
||||
|
||||
On **grokbox**, ensure the following are present:
|
||||
|
||||
1. **Virtualization Support**
|
||||
```bash
|
||||
# Verify CPU virtualization
|
||||
grep -Ec '(vmx|svm)' /proc/cpuinfo # Should be > 0
|
||||
|
||||
# Verify KVM module loaded
|
||||
lsmod | grep kvm
|
||||
```
|
||||
|
||||
2. **Required Packages**
|
||||
- libvirt-daemon-system
|
||||
- libvirt-clients
|
||||
- virtinst
|
||||
- qemu-kvm
|
||||
- qemu-utils
|
||||
- cloud-image-utils
|
||||
- genisoimage
|
||||
- python3-libvirt
|
||||
|
||||
3. **Sufficient Resources**
|
||||
- Storage: ~25GB available in `/var/lib/libvirt/images/`
|
||||
- Memory: Enough free RAM for VM allocation
|
||||
- Network: libvirt default network configured
|
||||
|
||||
4. **libvirtd Service Running**
|
||||
```bash
|
||||
systemctl status libvirtd
|
||||
```
|
||||
|
||||
### Ansible Control Node Requirements
|
||||
|
||||
1. Ansible 2.9 or newer
|
||||
2. SSH access to grokbox hypervisor
|
||||
3. SSH key configured for grok user
|
||||
4. Python 3.x installed
|
||||
|
||||
### Network Requirements
|
||||
|
||||
- Connectivity to Debian cloud image repository
|
||||
- DNS resolution working
|
||||
- Default libvirt network (virbr0) configured and active
|
||||
|
||||
## Deployment Process
|
||||
|
||||
### Step 1: Pre-flight Checks
|
||||
|
||||
The playbook performs the following validations:
|
||||
|
||||
- **VM Name Uniqueness**: Ensures no VM with the same name exists
|
||||
- **Virtualization Support**: Validates QEMU/KVM capabilities
|
||||
- **Package Installation**: Installs required tools if missing
|
||||
- **Service Status**: Verifies libvirtd is running
|
||||
|
||||
### Step 2: Image Management
|
||||
|
||||
#### Download Debian Cloud Image
|
||||
|
||||
- **Source**: https://cloud.debian.org/images/cloud/bookworm/latest/
|
||||
- **Image**: debian-12-generic-amd64.qcow2
|
||||
- **Cache Location**: `/var/lib/libvirt/images/debian-12-generic-amd64.qcow2`
|
||||
- **Checksum Verification**: SHA512SUMS validated
|
||||
|
||||
The base image is downloaded once and cached for subsequent deployments.
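The checksum step can be sketched in Python as follows. This is a minimal illustration rather than the playbook's actual task: the function name `verify_sha512` is hypothetical, and it assumes the `SHA512SUMS` file uses the common `<hex digest>  <filename>` line layout.

```python
import hashlib
from pathlib import Path


def verify_sha512(image_path, sums_text, image_name):
    """Return True if image_path matches the digest listed for image_name.

    sums_text is the content of a SHA512SUMS file, assumed to hold
    lines of the form "<hex digest>  <filename>".
    """
    expected = None
    for line in sums_text.splitlines():
        parts = line.split()
        # A leading "*" marks binary mode in some checksum files.
        if len(parts) == 2 and parts[1].lstrip("*") == image_name:
            expected = parts[0].lower()
            break
    if expected is None:
        raise ValueError(f"{image_name} not listed in checksum file")

    digest = hashlib.sha512()
    with Path(image_path).open("rb") as handle:
        # Read in 1 MiB chunks so a large image is never fully in memory.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected
```

A caller would pass the cached image path, the downloaded `SHA512SUMS` text, and `debian-12-generic-amd64.qcow2` as the name, and abort the deployment when the function returns `False`.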
|
||||
|
||||
#### Create VM Disk
|
||||
|
||||
A new copy-on-write (CoW) disk is created using qemu-img:
|
||||
|
||||
```bash
|
||||
qemu-img create -f qcow2 \
|
||||
-F qcow2 \
|
||||
-b /var/lib/libvirt/images/debian-12-generic-amd64.qcow2 \
|
||||
/var/lib/libvirt/images/debian12-guest.qcow2 \
|
||||
20G
|
||||
```
|
||||
|
||||
This creates a thin-provisioned disk backed by the cloud image.
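To confirm that a disk really is an overlay on the cloud image, the JSON output of `qemu-img info --output=json <disk>` can be inspected. The sketch below only parses that output; the helper name `describe_overlay` and the choice of summary fields are assumptions made for illustration.

```python
import json


def describe_overlay(qemu_img_info_json):
    """Summarise `qemu-img info --output=json` output for one disk image."""
    info = json.loads(qemu_img_info_json)
    return {
        "format": info.get("format"),
        # Size the guest sees, converted from bytes to GiB.
        "virtual_size_gib": info["virtual-size"] / 1024**3,
        # Space actually used on the host; small for a fresh overlay.
        "actual_size_mib": info.get("actual-size", 0) / 1024**2,
        # Missing when the image has no backing file.
        "backing_file": info.get("backing-filename"),
    }
```

A `backing_file` of `None` would indicate a standalone image rather than a copy-on-write overlay.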
|
||||
|
||||
### Step 3: Cloud-Init Configuration
|
||||
|
||||
Two configuration files are generated:
|
||||
|
||||
#### meta-data
|
||||
```yaml
|
||||
instance-id: debian12-guest
|
||||
local-hostname: debian12
|
||||
```
|
||||
|
||||
#### user-data
|
||||
Comprehensive cloud-init configuration including:
|
||||
|
||||
- **User Management**: Creates ansible user with SSH keys
|
||||
- **Security Configuration**: SSH hardening, firewall setup
|
||||
- **Package Installation**: Essential and security packages
|
||||
- **System Configuration**: Time sync, locale, timezone
|
||||
- **Automatic Updates**: Unattended security upgrades
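The playbook's real `user-data` template is not reproduced in this document. As a rough sketch of the kind of file involved, the Python below renders a minimal `#cloud-config` document covering only user creation, SSH password login, and package installation. The function name `render_user_data` and the selection of keys are assumptions; the body is written as JSON, relying on JSON being accepted as YAML, to avoid a third-party YAML dependency.

```python
import json


def render_user_data(ssh_public_key, packages):
    """Render a minimal #cloud-config document as text."""
    config = {
        "users": [
            {
                "name": "ansible",
                "groups": "sudo",
                "shell": "/bin/bash",
                "sudo": "ALL=(ALL) NOPASSWD:ALL",
                "ssh_authorized_keys": [ssh_public_key],
            }
        ],
        # Disable SSH password logins; key authentication only.
        "ssh_pwauth": False,
        "disable_root": True,
        "package_update": True,
        "packages": list(packages),
    }
    # cloud-init identifies the format from the first line.
    return "#cloud-config\n" + json.dumps(config, indent=2) + "\n"
```

The resulting text would be written to a file named `user-data` next to `meta-data` before the ISO is built.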
|
||||
|
||||
#### ISO Generation
|
||||
|
||||
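The configuration files are packaged into a NoCloud seed ISO. cloud-init locates the seed by its volume label, `cidata`; the image does not need to be bootable: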
|
||||
|
||||
```bash
|
||||
genisoimage -output debian12-guest-cloud-init.iso \
|
||||
-volid cidata -joliet -rock \
|
||||
user-data meta-data
|
||||
```
|
||||
|
||||
### Step 4: VM Creation
|
||||
|
||||
The VM is created using `virt-install`:
|
||||
|
||||
```bash
|
||||
virt-install \
|
||||
--name debian12-guest \
|
||||
--memory 2048 \
|
||||
--vcpus 2 \
|
||||
--disk path=/var/lib/libvirt/images/debian12-guest.qcow2,format=qcow2,bus=virtio \
|
||||
--disk path=/var/lib/libvirt/images/debian12-guest-cloud-init.iso,device=cdrom \
|
||||
--network network=default,model=virtio \
|
||||
--os-variant debian11 \
|
||||
--graphics none \
|
||||
--console pty,target_type=serial \
|
||||
--import \
|
||||
--noautoconsole
|
||||
```
|
||||
|
||||
### Step 5: Boot and Initialization
|
||||
|
||||
1. **VM Boots**: Starts from the qcow2 disk
|
||||
2. **Cloud-Init Runs**: Reads configuration from ISO
|
||||
3. **System Configuration**: Applies all settings
|
||||
4. **Network Configuration**: Obtains IP via DHCP
|
||||
5. **Package Updates**: Downloads and installs updates
|
||||
6. **Service Initialization**: Starts all configured services
|
||||
|
||||
**Typical boot time**: 60-90 seconds
|
||||
|
||||
### Step 6: Validation
|
||||
|
||||
The playbook validates:
|
||||
|
||||
- VM is running and accessible
|
||||
- IP address assigned
|
||||
- SSH port (22) accepting connections
|
||||
- cloud-init completed successfully
|
||||
- System resources available
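The SSH port check above can be approximated with a small polling loop. This is a sketch under stated assumptions: `wait_for_port` is a hypothetical helper, not the playbook's implementation, and a successful TCP connection only shows that something is listening, not that cloud-init has finished.

```python
import socket
import time


def wait_for_port(host, port, timeout=120.0, interval=2.0):
    """Poll until a TCP connection to host:port succeeds or timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # The connect attempt itself is bounded by `interval` seconds.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the VM in this guide the call would look like `wait_for_port("192.168.122.X", 22)` with the real address substituted, run from a host that can reach the `virbr0` network.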
|
||||
|
||||
### Step 7: Post-Deployment Configuration
|
||||
|
||||
Optional second play that:
|
||||
|
||||
- Waits for cloud-init completion
|
||||
- Gathers system facts
|
||||
- Displays system information
|
||||
- Validates disk and memory usage
|
||||
|
||||
## Configuration
|
||||
|
||||
### Default Configuration
|
||||
|
||||
```yaml
|
||||
# VM Specifications
|
||||
vm_name: "debian12-guest"
|
||||
vm_hostname: "debian12"
|
||||
vm_domain: "localdomain"
|
||||
vm_vcpus: 2
|
||||
vm_memory_mb: 2048
|
||||
vm_disk_size_gb: 20
|
||||
|
||||
# Network
|
||||
vm_network: "default"
|
||||
vm_bridge: "virbr0"
|
||||
|
||||
# Storage
|
||||
vm_disk_path: "/var/lib/libvirt/images/{{ vm_name }}.qcow2"
|
||||
cloud_init_iso_path: "/var/lib/libvirt/images/{{ vm_name }}-cloud-init.iso"
|
||||
```
|
||||
|
||||
### Customization Examples
|
||||
|
||||
#### High-Performance VM
|
||||
|
||||
```bash
|
||||
ansible-playbook plays/deploy-debian12-vm.yml \
|
||||
-e "vm_name=app-server" \
|
||||
-e "vm_vcpus=8" \
|
||||
-e "vm_memory_mb=16384" \
|
||||
-e "vm_disk_size_gb=100"
|
||||
```
|
||||
|
||||
#### Development VM
|
||||
|
||||
```bash
|
||||
ansible-playbook plays/deploy-debian12-vm.yml \
|
||||
-e "vm_name=dev-workstation" \
|
||||
-e "vm_vcpus=4" \
|
||||
-e "vm_memory_mb=8192" \
|
||||
-e "vm_disk_size_gb=50" \
|
||||
-e "vm_hostname=devbox" \
|
||||
-e "vm_domain=dev.local"
|
||||
```
|
||||
|
||||
#### Custom SSH Key
|
||||
|
||||
```bash
|
||||
ansible-playbook plays/deploy-debian12-vm.yml \
|
||||
-e "vm_name=secure-vm" \
|
||||
-e "ansible_user_ssh_key='ssh-ed25519 AAAA...'"
|
||||
```
|
||||
|
||||
### Variable Precedence
|
||||
|
||||
Variables can be set at several levels, listed here from highest to lowest precedence:
|
||||
|
||||
1. **Command-line** (`-e` flag) - Highest
|
||||
2. **Playbook vars section**
|
||||
3. **Inventory host_vars**
|
||||
4. **Inventory group_vars**
|
||||
5. **Defaults in playbook** - Lowest
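The effect of this ordering can be modelled as a layered lookup in which the first layer that defines a variable wins. The sketch below is a simplified model of the five levels listed above, not Ansible's full precedence rules, and `resolve_vars` is a hypothetical name.

```python
from collections import ChainMap


def resolve_vars(extra_vars, play_vars, host_vars, group_vars, defaults):
    """Merge variable layers; earlier arguments override later ones."""
    return dict(ChainMap(extra_vars, play_vars, host_vars, group_vars, defaults))


# Example: a command-line override of vm_vcpus beats the default value,
# while vm_memory_mb falls through to the default.
resolved = resolve_vars(
    {"vm_vcpus": 8},
    {},
    {},
    {},
    {"vm_vcpus": 2, "vm_memory_mb": 2048},
)
```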
|
||||
|
||||
## Security Features
|
||||
|
||||
### User Management
|
||||
|
||||
- **ansible user**: Non-root service account
|
||||
- Passwordless sudo access
|
||||
- SSH key authentication only
|
||||
- Member of sudo group
|
||||
- Home directory: `/home/ansible`
|
||||
|
||||
- **root user**: Console access only
|
||||
- SSH login disabled
|
||||
- Password set for emergency console access
|
||||
- Remote access blocked
|
||||
|
||||
### SSH Hardening
|
||||
|
||||
Configuration in `/etc/ssh/sshd_config.d/99-security.conf`:
|
||||
|
||||
```
|
||||
PermitRootLogin no
|
||||
PasswordAuthentication no
|
||||
PubkeyAuthentication yes
|
||||
MaxAuthTries 3
|
||||
MaxSessions 10
|
||||
ClientAliveInterval 300
|
||||
ClientAliveCountMax 2
|
||||
```
|
||||
|
||||
### Firewall Configuration
|
||||
|
||||
- **UFW (Uncomplicated Firewall)** enabled by default
|
||||
- Default policy: deny incoming, allow outgoing
|
||||
- SSH (port 22) allowed
|
||||
- Additional rules can be added post-deployment
|
||||
|
||||
### Automatic Security Updates
|
||||
|
||||
Unattended-upgrades configured for:
|
||||
|
||||
- Automatic installation of security updates
|
||||
- Daily update checks
|
||||
- Automatic cleanup of old kernels
|
||||
- Email notifications (if configured)
|
||||
- **No automatic reboot** (requires manual intervention)
|
||||
|
||||
### Audit and Monitoring
|
||||
|
||||
- **auditd**: System call auditing enabled
|
||||
- **aide**: File integrity monitoring installed
|
||||
- **chrony**: Time synchronization configured
|
||||
- **Logging**: All cloud-init output logged
|
||||
|
||||
### Compliance Features
|
||||
|
||||
Aligned with CLAUDE.md security requirements:
|
||||
|
||||
- ✅ Principle of least privilege
|
||||
- ✅ Encryption in transit (SSH)
|
||||
- ✅ Key-based authentication
|
||||
- ✅ Automated security updates
|
||||
- ✅ System auditing enabled
|
||||
- ✅ Time synchronization
|
||||
- ✅ Firewall enabled by default
|
||||
|
||||
## Post-Deployment
|
||||
|
||||
### Adding to Inventory
|
||||
|
||||
Update your Ansible inventory:
|
||||
|
||||
```yaml
# inventories/development/hosts.yml
kvm_guests:
  children:
    application_servers:
      hosts:
        debian12-guest:
          ansible_host: 192.168.122.X
          ansible_user: ansible
          ansible_ssh_common_args: '-o ProxyJump=grokbox -o StrictHostKeyChecking=accept-new'
          ansible_python_interpreter: /usr/bin/python3
          host_description: "Application Server - Debian 12"
          host_role: application
          host_type: virtual_machine
          hypervisor: grokbox
          vm_vcpus: 2
          vm_memory_mb: 2048
          autostart: true
```
|
||||
|
||||
### Initial Access
|
||||
|
||||
```bash
|
||||
# Get VM IP address
|
||||
ssh grokbox "virsh domifaddr debian12-guest"
|
||||
|
||||
# SSH to VM via ProxyJump
|
||||
ssh -J grokbox ansible@192.168.122.X
|
||||
|
||||
# Or add to ~/.ssh/config
|
||||
Host debian12-guest
|
||||
HostName 192.168.122.X
|
||||
User ansible
|
||||
ProxyJump grokbox
|
||||
StrictHostKeyChecking accept-new
|
||||
```
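The address shown as `192.168.122.X` has to be read from the `virsh domifaddr` output. A minimal parsing sketch follows; the helper name `first_ipv4` is hypothetical, and the sample text in the comment only illustrates the usual column layout (the MAC address is made up).

```python
import re

# Matches an IPv4 address followed by a prefix length, e.g. 192.168.122.12/24.
IPV4_RE = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3})/\d{1,2}\b")


def first_ipv4(domifaddr_output):
    """Return the first IPv4 address in `virsh domifaddr` output, or None."""
    match = IPV4_RE.search(domifaddr_output)
    return match.group(1) if match else None


# Illustrative output, as captured from: ssh grokbox "virsh domifaddr debian12-guest"
SAMPLE = """\
 Name       MAC address          Protocol     Address
-------------------------------------------------------------------------------
 vnet0      52:54:00:aa:bb:cc    ipv4         192.168.122.12/24
"""
```

Returning `None` when no lease is present lets the caller retry instead of building an SSH command with an empty address.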
|
||||
|
||||
### Configuration Management
|
||||
|
||||
Run additional roles or playbooks:
|
||||
|
||||
```bash
|
||||
# Example: Configure web server
|
||||
ansible-playbook -i inventories/development/hosts.yml \
|
||||
playbooks/configure-webserver.yml \
|
||||
-l debian12-guest
|
||||
|
||||
# Example: Security hardening
|
||||
ansible-playbook -i inventories/development/hosts.yml \
|
||||
playbooks/security-hardening.yml \
|
||||
-l debian12-guest
|
||||
```
|
||||
|
||||
### VM Management Commands
|
||||
|
||||
```bash
|
||||
# Start VM
|
||||
virsh start debian12-guest
|
||||
|
||||
# Shutdown VM gracefully
|
||||
virsh shutdown debian12-guest
|
||||
|
||||
# Force shutdown
|
||||
virsh destroy debian12-guest
|
||||
|
||||
# Reboot VM
|
||||
virsh reboot debian12-guest
|
||||
|
||||
# Enable autostart
|
||||
virsh autostart debian12-guest
|
||||
|
||||
# Disable autostart
|
||||
virsh autostart debian12-guest --disable
|
||||
|
||||
# VM status
|
||||
virsh dominfo debian12-guest
|
||||
|
||||
# VM resource usage
|
||||
virsh domstats debian12-guest
|
||||
|
||||
# Console access
|
||||
virsh console debian12-guest
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Common Issues
|
||||
|
||||
#### 1. VM Already Exists
|
||||
|
||||
**Error**: VM with name already exists
|
||||
|
||||
**Solution**:
|
||||
```bash
|
||||
# Check existing VMs
|
||||
virsh list --all
|
||||
|
||||
# Remove existing VM
|
||||
virsh destroy debian12-guest # if running
|
||||
virsh undefine debian12-guest --remove-all-storage
|
||||
```
|
||||
|
||||
#### 2. Image Download Fails
|
||||
|
||||
**Error**: Failed to download cloud image
|
||||
|
||||
**Causes**:
|
||||
- Network connectivity issues
|
||||
- Proxy configuration
|
||||
- DNS resolution problems
|
||||
|
||||
**Solution**:
|
||||
```bash
|
||||
# Test connectivity
|
||||
curl -I https://cloud.debian.org
|
||||
|
||||
# Manual download
|
||||
cd /var/lib/libvirt/images
|
||||
wget https://cloud.debian.org/images/cloud/bookworm/latest/debian-12-generic-amd64.qcow2
|
||||
|
||||
# Re-run playbook
|
||||
ansible-playbook plays/deploy-debian12-vm.yml -t deploy
|
||||
```
|
||||
|
||||
#### 3. VM Won't Get IP Address
|
||||
|
||||
**Error**: IP address not assigned after 10 retries
|
||||
|
||||
**Causes**:
|
||||
- DHCP server not running
|
||||
- Network misconfiguration
|
||||
- VM network interface issues
|
||||
|
||||
**Solution**:
|
||||
```bash
|
||||
# Check libvirt network
|
||||
virsh net-list --all
|
||||
virsh net-info default
|
||||
virsh net-start default # if not started
|
||||
|
||||
# Check VM network interface
|
||||
virsh domiflist debian12-guest
|
||||
|
||||
# Check DHCP leases
|
||||
virsh net-dhcp-leases default
|
||||
|
||||
# Access console to troubleshoot
|
||||
virsh console debian12-guest
|
||||
# Check: ip addr, systemctl status networking
|
||||
```
|
||||
|
||||
#### 4. SSH Connection Failed
|
||||
|
||||
**Error**: SSH connection timeout or refused
|
||||
|
||||
**Causes**:
|
||||
- SSH service not started
|
||||
- Firewall blocking
|
||||
- Wrong IP address
|
||||
- cloud-init not completed
|
||||
|
||||
**Solution**:
|
||||
```bash
|
||||
# Verify VM is running
|
||||
virsh list
|
||||
|
||||
# Check cloud-init status via console
|
||||
virsh console debian12-guest
|
||||
# Run: cloud-init status --wait
|
||||
|
||||
# Check SSH service
|
||||
# Via console: systemctl status ssh
|
||||
|
||||
# Check firewall
|
||||
# Via console: ufw status
|
||||
|
||||
# Verify SSH key
|
||||
ssh-add -l
|
||||
```
|
||||
|
||||
#### 5. Insufficient Resources
|
||||
|
||||
**Error**: Failed to allocate memory or storage
|
||||
|
||||
**Solution**:
|
||||
```bash
|
||||
# Check available resources
|
||||
free -h
|
||||
df -h /var/lib/libvirt/images/
|
||||
|
||||
# Adjust VM resources
|
||||
ansible-playbook plays/deploy-debian12-vm.yml \
|
||||
-e "vm_memory_mb=1024" \
|
||||
-e "vm_disk_size_gb=10"
|
||||
```
|
||||
|
||||
### Debug Mode
|
||||
|
||||
Enable verbose logging:
|
||||
|
||||
```bash
|
||||
# Ansible verbose mode
|
||||
ansible-playbook plays/deploy-debian12-vm.yml -vvv
|
||||
|
||||
# Check cloud-init logs on VM
|
||||
virsh console debian12-guest
|
||||
# Then: tail -f /var/log/cloud-init-output.log
|
||||
|
||||
# Check libvirt logs
|
||||
journalctl -u libvirtd -f
|
||||
```
|
||||
|
||||
### Health Checks
|
||||
|
||||
```bash
|
||||
# Verify VM health
|
||||
virsh dominfo debian12-guest
|
||||
virsh domstats debian12-guest
|
||||
|
||||
# Network connectivity
|
||||
ping $(virsh domifaddr debian12-guest | grep -oP '(\d{1,3}\.){3}\d{1,3}' | head -1)
|
||||
|
||||
# SSH connectivity
|
||||
ssh -J grokbox ansible@$(virsh domifaddr debian12-guest | grep -oP '(\d{1,3}\.){3}\d{1,3}' | head -1) "echo 'VM is accessible'"
|
||||
```
|
||||
|
||||
## Maintenance
|
||||
|
||||
### Updating the Base Image
|
||||
|
||||
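Periodically update the cached Debian cloud image. VM disks created as copy-on-write overlays reference this file as their backing image, so remove it only when no existing VM still depends on it: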
|
||||
|
||||
```bash
|
||||
# Remove old image
|
||||
ssh grokbox "rm /var/lib/libvirt/images/debian-12-generic-amd64.qcow2"
|
||||
|
||||
# Download latest
|
||||
ansible-playbook plays/deploy-debian12-vm.yml -t download,verify
|
||||
```
|
||||
|
||||
### VM Snapshots
|
||||
|
||||
Create snapshots before major changes:
|
||||
|
||||
```bash
|
||||
# Create snapshot
|
||||
virsh snapshot-create-as debian12-guest \
|
||||
snapshot1 \
|
||||
"Before application deployment"
|
||||
|
||||
# List snapshots
|
||||
virsh snapshot-list debian12-guest
|
||||
|
||||
# Revert to snapshot
|
||||
virsh snapshot-revert debian12-guest snapshot1
|
||||
|
||||
# Delete snapshot
|
||||
virsh snapshot-delete debian12-guest snapshot1
|
||||
```
|
||||
|
||||
### Backup and Restore
|
||||
|
||||
#### Backup VM
|
||||
|
||||
```bash
|
||||
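# Stop VM (shutdown is asynchronous; wait for "shut off" before copying)
virsh shutdown debian12-guest
virsh domstate debian12-guest

# Backup disk (a copy-on-write overlay; restoring it also requires the backing image)
cp /var/lib/libvirt/images/debian12-guest.qcow2 \
  /backup/debian12-guest-$(date +%Y%m%d).qcow2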
|
||||
|
||||
# Backup XML config
|
||||
virsh dumpxml debian12-guest > /backup/debian12-guest.xml
|
||||
|
||||
# Start VM
|
||||
virsh start debian12-guest
|
||||
```
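Because `virsh shutdown` returns before the guest has powered off, a backup script needs to wait before copying the disk. The following is a hedged sketch of such a wait: `wait_until_shut_off` and `virsh_domstate` are hypothetical helper names, and the state source is injectable so the loop can be exercised without a hypervisor.

```python
import subprocess
import time


def virsh_domstate(domain):
    """Return the state printed by `virsh domstate`, e.g. "shut off"."""
    result = subprocess.run(
        ["virsh", "domstate", domain],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def wait_until_shut_off(domain, get_state=virsh_domstate, timeout=120.0, interval=2.0):
    """Poll the domain state until it is "shut off"; return False on timeout."""
    deadline = time.monotonic() + timeout
    while get_state(domain) != "shut off":
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True
```

A backup wrapper would call `wait_until_shut_off("debian12-guest")` between the shutdown and copy steps and skip the copy if it returns `False`.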
|
||||
|
||||
#### Restore VM
|
||||
|
||||
```bash
|
||||
# Copy disk back
|
||||
cp /backup/debian12-guest-20241110.qcow2 \
|
||||
/var/lib/libvirt/images/debian12-guest.qcow2
|
||||
|
||||
# Define VM from XML
|
||||
virsh define /backup/debian12-guest.xml
|
||||
|
||||
# Start VM
|
||||
virsh start debian12-guest
|
||||
```
|
||||
|
||||
### Resize VM Disk
|
||||
|
||||
```bash
|
||||
# Shut down VM (asynchronous; confirm "shut off" with virsh domstate before resizing)
|
||||
virsh shutdown debian12-guest
|
||||
|
||||
# Resize disk
|
||||
qemu-img resize /var/lib/libvirt/images/debian12-guest.qcow2 +10G
|
||||
|
||||
# Start VM
|
||||
virsh start debian12-guest
|
||||
|
||||
# On VM: resize partition and filesystem
|
||||
growpart /dev/vda 1
|
||||
resize2fs /dev/vda1
|
||||
```
|
||||
|
||||
### Resource Adjustment
|
||||
|
||||
Modify VM resources:
|
||||
|
||||
```bash
|
||||
# Set maximum memory (value in KiB: 4194304 KiB = 4 GiB; requires shutdown)
|
||||
virsh setmaxmem debian12-guest 4194304 --config
|
||||
|
||||
# Set current memory (can be done live)
|
||||
virsh setmem debian12-guest 4194304
|
||||
|
||||
# Set vCPUs (requires shutdown)
|
||||
virsh setvcpus debian12-guest 4 --config --maximum
|
||||
virsh setvcpus debian12-guest 4 --config
|
||||
```
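The memory values above are given in KiB, the default unit for `virsh setmem` and `virsh setmaxmem`, while the playbook variables use MB. Two small conversion helpers (hypothetical names) make the arithmetic explicit:

```python
def gib_to_kib(gib):
    """Convert GiB to KiB, the default unit of virsh setmem/setmaxmem."""
    return int(gib * 1024 * 1024)


def mib_to_kib(mib):
    """Convert MiB (as in vm_memory_mb) to KiB."""
    return int(mib * 1024)
```

For example, 4 GiB corresponds to the `4194304` used in the commands above.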
|
||||
|
||||
## Best Practices
|
||||
|
||||
1. **Naming Convention**: Use descriptive VM names indicating purpose
|
||||
2. **Resource Planning**: Right-size VMs to avoid waste
|
||||
3. **Documentation**: Document VM purpose and configuration
|
||||
4. **Monitoring**: Set up monitoring for critical VMs
|
||||
5. **Backups**: Regular backups of important VMs
|
||||
6. **Updates**: Keep VMs updated with security patches
|
||||
7. **Inventory**: Maintain accurate Ansible inventory
|
||||
8. **Tags**: Use libvirt domain metadata (such as titles and descriptions) for organization
|
||||
9. **Networking**: Use appropriate network isolation
|
||||
10. **Testing**: Test deployment process in development first
|
||||
|
||||
## References
|
||||
|
||||
- [CLAUDE.md](../CLAUDE.md) - Infrastructure guidelines
|
||||
- [Cheatsheet](../cheatsheets/deploy-debian12-vm.md) - Quick reference
|
||||
- [Debian Cloud Images](https://cloud.debian.org/images/cloud/)
|
||||
- [cloud-init Documentation](https://cloudinit.readthedocs.io/)
|
||||
- [libvirt Documentation](https://libvirt.org/docs.html)
|
||||
- [virt-install man page](https://linux.die.net/man/1/virt-install)
|
||||
|
||||
## Support and Contact
|
||||
|
||||
For issues or questions:
|
||||
|
||||
1. Check troubleshooting section above
|
||||
2. Review cloud-init logs: `/var/log/cloud-init.log`
|
||||
3. Review libvirt logs: `journalctl -u libvirtd`
|
||||
4. Consult Ansible playbook: `plays/deploy-debian12-vm.yml`
|
||||
|
||||
---
|
||||
|
||||
**Document Version**: 1.0
|
||||
**Last Updated**: 2025-11-10
|
||||
**Maintained By**: Ansible Infrastructure Team
|
||||
516
docs/inventory.md
Normal file
@@ -0,0 +1,516 @@
|
||||
# Ansible Inventory Configuration
|
||||
|
||||
This document describes the dynamic and static inventory configurations for the Ansible infrastructure.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [Overview](#overview)
|
||||
2. [Inventory Structure](#inventory-structure)
|
||||
3. [Dynamic Inventory Solutions](#dynamic-inventory-solutions)
|
||||
4. [Static/Hybrid Inventory](#statichybrid-inventory)
|
||||
5. [Usage Examples](#usage-examples)
|
||||
6. [Troubleshooting](#troubleshooting)
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
Per the CLAUDE.md guidelines, this infrastructure uses **dynamic inventories** as the primary inventory source, with static inventories permitted for development environments only.
|
||||
|
||||
### Available Inventory Solutions
|
||||
|
||||
| Solution | Type | Use Case | Status |
|----------|------|----------|--------|
| SSH Config Parser | Dynamic | Quick discovery from SSH config | ✅ Active |
| Libvirt/KVM Plugin | Dynamic | Real-time VM discovery | ✅ Active |
| Static YAML | Static/Hybrid | Development environment | ✅ Active |
|
||||
|
||||
---
|
||||
|
||||
## Inventory Structure
|
||||
|
||||
```
inventories/
├── production/              # Production environment (dynamic only)
│   ├── group_vars/
│   │   └── all.yml
│   └── [dynamic inventory configs]
├── staging/                 # Staging environment (dynamic only)
│   ├── group_vars/
│   │   └── all.yml
│   └── [dynamic inventory configs]
└── development/             # Development environment
    ├── hosts.yml            # Static/hybrid inventory
    ├── libvirt_kvm.yml      # Libvirt dynamic config
    ├── group_vars/
    │   ├── all.yml
    │   ├── kvm_guests.yml
    │   └── hypervisors.yml
    └── host_vars/
```
|
||||
|
||||
---
|
||||
|
||||
## Dynamic Inventory Solutions
|
||||
|
||||
### 1. SSH Config Parser (`ssh_config_inventory.py`)
|
||||
|
||||
**Location:** `/opt/ansible/plugins/inventory/ssh_config_inventory.py`
|
||||
|
||||
#### Description
|
||||
Parses `~/.ssh/config` to automatically generate Ansible inventory with proper grouping and connection parameters.
|
||||
|
||||
#### Features
|
||||
- Automatic host discovery from SSH config
|
||||
- Intelligent grouping by host characteristics
|
||||
- ProxyJump support for nested VM access
|
||||
- No external dependencies (pure Python)
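The source of `ssh_config_inventory.py` is not included here. As an illustration of how host discovery from an SSH config can work with the standard library alone, the sketch below parses `Host` blocks into a dictionary. It is deliberately simplified (no `Match` or `Include` handling, wildcard patterns skipped, no quoting), and `parse_ssh_config` is a hypothetical name.

```python
import re


def parse_ssh_config(text):
    """Parse SSH config text into {host_alias: {lowercase_option: value}}."""
    hosts = {}
    current = []  # aliases named by the most recent Host line
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Options may be written as "Key value" or "Key=value".
        parts = re.split(r"\s*=\s*|\s+", line, maxsplit=1)
        if len(parts) != 2:
            continue
        key, value = parts[0].lower(), parts[1].strip()
        if key == "host":
            # Skip wildcard and negated patterns such as "*" or "!bastion".
            current = [h for h in value.split() if not any(c in h for c in "*?!")]
            for alias in current:
                hosts.setdefault(alias, {})
        else:
            for alias in current:
                # The first value obtained for an option wins, as in OpenSSH.
                hosts[alias].setdefault(key, value)
    return hosts
```

Each alias and its options could then be mapped to Ansible host variables such as `ansible_host` and `ansible_user`.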
|
||||
|
||||
#### Usage
|
||||
|
||||
```bash
|
||||
# List all hosts and groups
|
||||
python3 plugins/inventory/ssh_config_inventory.py --list
|
||||
|
||||
# Get variables for specific host
|
||||
python3 plugins/inventory/ssh_config_inventory.py --host pihole
|
||||
|
||||
# Use with ansible commands
|
||||
ansible all -i plugins/inventory/ssh_config_inventory.py --list-hosts
|
||||
|
||||
# Use with playbooks
|
||||
ansible-playbook -i plugins/inventory/ssh_config_inventory.py site.yml
|
||||
```
|
||||
|
||||
#### Host Categorization Logic
|
||||
|
||||
| Category | Criteria |
|----------|----------|
| `external_hosts` | Public IPs, no ProxyJump, non-ansible user |
| `hypervisors` | ForwardAgent enabled, specific users (grok) |
| `dns_servers` | ansible user + ProxyJump + hostname contains 'pihole'/'dns' |
| `mail_servers` | ansible user + ProxyJump + hostname contains 'mail'/'mx' |
| `development` | ansible user + ProxyJump + hostname contains 'dev'/'test'/'derp' |
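The criteria in the table can be expressed as a short decision function. This is a sketch, not the script's actual `_categorize_host()` method: the order of the checks, the `uncategorized` fallback, and the omission of the public-IP test are assumptions, and `options` is taken to be a dict of lower-cased SSH options for one host.

```python
# Keyword lists taken from the categorization table.
KEYWORD_GROUPS = (
    ("dns_servers", ("pihole", "dns")),
    ("mail_servers", ("mail", "mx")),
    ("development", ("dev", "test", "derp")),
)


def categorize_host(alias, options):
    """Map one SSH config entry to an inventory group name."""
    user = options.get("user", "")
    has_proxy = "proxyjump" in options
    if options.get("forwardagent", "no").lower() == "yes" and user == "grok":
        return "hypervisors"
    if user == "ansible" and has_proxy:
        name = alias.lower()
        for group, keywords in KEYWORD_GROUPS:
            if any(word in name for word in keywords):
                return group
        return "uncategorized"
    if not has_proxy and user != "ansible":
        return "external_hosts"
    return "uncategorized"
```

Substring matching is permissive: any alias containing `mx`, for instance, lands in `mail_servers`, which is one reason the troubleshooting section suggests static inventory for special cases.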
|
||||
|
||||
#### Generated Inventory Structure
|
||||
|
||||
```json
{
  "all": {
    "children": ["external_hosts", "hypervisors", "kvm_guests"]
  },
  "external_hosts": {
    "hosts": ["odin"]
  },
  "hypervisors": {
    "hosts": ["grokbox"]
  },
  "kvm_guests": {
    "children": ["dns_servers", "mail_servers", "development", "uncategorized"],
    "vars": {
      "ansible_user": "ansible",
      "ansible_ssh_common_args": "-o StrictHostKeyChecking=accept-new"
    }
  },
  "_meta": {
    "hostvars": { ... }
  }
}
```
|
||||
|
||||
---
|
||||
|
||||
### 2. Libvirt/KVM Dynamic Inventory (`libvirt_kvm.py`)
|
||||
|
||||
**Location:** `/opt/ansible/plugins/inventory/libvirt_kvm.py`
|
||||
|
||||
#### Description
|
||||
Queries libvirt hypervisors directly to discover KVM guest VMs in real-time, including their state, resources, and network configuration.
|
||||
|
||||
#### Features
|
||||
- Real-time VM discovery via libvirt API
|
||||
- VM state detection (running, stopped, paused)
|
||||
- Automatic IP address detection
|
||||
- Resource information (vCPUs, memory, networks)
|
||||
- Multiple hypervisor support
|
||||
- ProxyJump configuration
|
||||
|
||||
#### Requirements
|
||||
|
||||
```bash
|
||||
# Debian/Ubuntu
|
||||
apt-get install python3-libvirt
|
||||
|
||||
# RHEL/Fedora/Rocky
|
||||
dnf install python3-libvirt
|
||||
```
|
||||
|
||||
#### Configuration
|
||||
|
||||
Set environment variables or use a configuration file:
|
||||
|
||||
```bash
|
||||
# Environment variables
|
||||
export LIBVIRT_DEFAULT_URI="qemu+ssh://grok@grok.home.serneels.xyz/system"
|
||||
export LIBVIRT_HYPERVISOR_NAME="grokbox"
|
||||
```
|
||||
|
||||
Or use the YAML configuration file: `inventories/development/libvirt_kvm.yml`
|
||||
|
||||
#### Usage
|
||||
|
||||
```bash
|
||||
# List all VMs
|
||||
python3 plugins/inventory/libvirt_kvm.py --list
|
||||
|
||||
# Get specific VM details
|
||||
python3 plugins/inventory/libvirt_kvm.py --host mymx
|
||||
|
||||
# Use with ansible
|
||||
ansible running_vms -i plugins/inventory/libvirt_kvm.py -m ping
|
||||
|
||||
# Use with playbooks
|
||||
ansible-playbook -i plugins/inventory/libvirt_kvm.py playbooks/update.yml
|
||||
```
|
||||
|
||||
#### Generated Groups
|
||||
|
||||
| Group | Description |
|-------|-------------|
| `hypervisors` | KVM hypervisor hosts |
| `kvm_guests` | All guest VMs |
| `running_vms` | VMs in running state |
| `stopped_vms` | VMs not running (shutoff, paused, etc.) |
|
||||
|
||||
#### Host Variables
|
||||
|
||||
Each VM includes:
|
||||
- `vm_name`: VM hostname
|
||||
- `vm_uuid`: Libvirt UUID
|
||||
- `vm_state`: Current state (running, shutoff, etc.)
|
||||
- `vm_vcpus`: Number of virtual CPUs
|
||||
- `vm_memory_mb`: Memory allocation in MB
|
||||
- `vm_networks`: Network interface details
|
||||
- `ansible_host`: IP address (if available)
|
||||
- `ansible_ssh_common_args`: ProxyJump configuration
|
||||
- `hypervisor`: Parent hypervisor name
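These variables map closely onto what the libvirt Python bindings report for a domain. The sketch below builds the dictionary from plain values shaped like the result of `virDomain.info()`, so it runs without a hypervisor. The function names are hypothetical, `vm_networks` is left out for brevity, and the real `libvirt_kvm.py` may differ.

```python
# Numeric values of libvirt's virDomainState enum.
DOMAIN_STATES = {
    0: "nostate", 1: "running", 2: "blocked", 3: "paused",
    4: "shutdown", 5: "shutoff", 6: "crashed", 7: "pmsuspended",
}


def build_hostvars(name, uuid, info, ip_address, hypervisor):
    """Build host variables for one VM.

    info follows virDomain.info():
    [state, max_memory_kib, memory_kib, vcpus, cpu_time_ns].
    """
    state, _max_kib, memory_kib, vcpus, _cpu_time = info
    hostvars = {
        "vm_name": name,
        "vm_uuid": uuid,
        "vm_state": DOMAIN_STATES.get(state, "unknown"),
        "vm_vcpus": vcpus,
        "vm_memory_mb": memory_kib // 1024,
        "ansible_ssh_common_args": (
            f"-o ProxyJump={hypervisor} -o StrictHostKeyChecking=accept-new"
        ),
        "hypervisor": hypervisor,
    }
    # ansible_host is only set when an address is known.
    if ip_address:
        hostvars["ansible_host"] = ip_address
    return hostvars


def state_group(hostvars):
    """Return the state-based group for a VM."""
    return "running_vms" if hostvars["vm_state"] == "running" else "stopped_vms"
```

With a live connection, `name`, `uuid`, and `info` would come from `dom.name()`, `dom.UUIDString()`, and `dom.info()` on each domain returned by `conn.listAllDomains()`.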
|
||||
|
||||
#### Example Output
|
||||
|
||||
```json
{
  "running_vms": {
    "hosts": ["mymx", "pihole", "derp"]
  },
  "_meta": {
    "hostvars": {
      "pihole": {
        "vm_name": "pihole",
        "vm_uuid": "6d714c93-16fb-41c8-8ef8-9001f9066b3a",
        "vm_state": "running",
        "vm_vcpus": 2,
        "vm_memory_mb": 2048,
        "ansible_host": "192.168.122.12",
        "ansible_ssh_common_args": "-o ProxyJump=grokbox -o StrictHostKeyChecking=accept-new",
        "hypervisor": "grokbox"
      }
    }
  }
}
```
|
||||
|
||||
---
|
||||
|
||||
## Static/Hybrid Inventory
|
||||
|
||||
**Location:** `/opt/ansible/inventories/development/hosts.yml`
|
||||
|
||||
### Description
|
||||
Manually maintained static inventory for the development environment with detailed host metadata and configuration.
|
||||
|
||||
### Structure
|
||||
|
||||
```yaml
all:
  children:
    external_hosts:        # Public-facing hosts
    hypervisors:           # KVM hypervisor hosts
    kvm_guests:            # Virtual machine guests
      children:
        dns_servers:
        mail_servers:
        development:
        uncategorized:
```
|
||||
|
||||
### Group Variables
|
||||
|
||||
Variables are defined in the `group_vars/` directory:
|
||||
|
||||
- **`all.yml`**: Global variables for all hosts
|
||||
- **`kvm_guests.yml`**: Common VM configuration (LVM, networking, ProxyJump)
|
||||
- **`hypervisors.yml`**: Hypervisor-specific settings (libvirt, QEMU)
|
||||
|
||||
### Host Variables
|
||||
|
||||
Host-specific variables can be placed in the `host_vars/` directory:
|
||||
|
||||
```
|
||||
host_vars/
|
||||
├── pihole.yml
|
||||
├── mymx.yml
|
||||
└── derp.yml
|
||||
```
|
||||
|
||||
### Usage
|
||||
|
||||
```bash
|
||||
# List hosts
|
||||
ansible all -i inventories/development/hosts.yml --list-hosts
|
||||
|
||||
# Run playbook
|
||||
ansible-playbook -i inventories/development/hosts.yml site.yml
|
||||
|
||||
# Target specific group
|
||||
ansible dns_servers -i inventories/development/hosts.yml -m ping
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Usage Examples
|
||||
|
||||
### Example 1: List All Hosts (Dynamic)
|
||||
|
||||
```bash
|
||||
# Using SSH config parser
|
||||
ansible all -i plugins/inventory/ssh_config_inventory.py --list-hosts
|
||||
|
||||
# Using libvirt inventory
|
||||
ansible all -i plugins/inventory/libvirt_kvm.py --list-hosts
|
||||
```
|
||||
|
||||
### Example 2: Ping All Running VMs
|
||||
|
||||
```bash
|
||||
ansible running_vms -i plugins/inventory/libvirt_kvm.py -m ping
|
||||
```
|
||||
|
||||
### Example 3: Run Playbook Against KVM Guests
|
||||
|
||||
```bash
|
||||
ansible-playbook -i inventories/development/hosts.yml \
|
||||
--limit kvm_guests \
|
||||
playbooks/system-update.yml
|
||||
```
|
||||
|
||||
### Example 4: Check Host Variables
|
||||
|
||||
```bash
|
||||
# Using dynamic inventory
|
||||
ansible-inventory -i plugins/inventory/libvirt_kvm.py --host pihole
|
||||
|
||||
# Using static inventory
|
||||
ansible-inventory -i inventories/development/hosts.yml --host pihole --yaml
|
||||
```
|
||||
|
||||
### Example 5: Multiple Inventory Sources
|
||||
|
||||
You can combine multiple inventory sources:
|
||||
|
||||
```bash
|
||||
ansible-playbook -i inventories/development/hosts.yml \
|
||||
-i plugins/inventory/libvirt_kvm.py \
|
||||
site.yml
|
||||
```
|
||||
|
||||
### Example 6: Filter by Group
|
||||
|
||||
```bash
|
||||
# Target only mail servers
|
||||
ansible mail_servers -i plugins/inventory/ssh_config_inventory.py -m setup
|
||||
|
||||
# Target only hypervisors
|
||||
ansible hypervisors -i inventories/development/hosts.yml -m shell -a "virsh list --all"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Ansible Configuration
|
||||
|
||||
Configure default inventory in `ansible.cfg`:
|
||||
|
||||
```ini
|
||||
[defaults]
|
||||
inventory = ./inventories/development/hosts.yml
|
||||
# Or use dynamic:
|
||||
# inventory = ./plugins/inventory/libvirt_kvm.py
|
||||
|
||||
# Enable multiple inventory sources
|
||||
# inventory = ./inventories/development/hosts.yml,./plugins/inventory/libvirt_kvm.py
|
||||
|
||||
# Inventory plugins path
|
||||
inventory_plugins = ./plugins/inventory
|
||||
|
||||
# Enable fact caching for performance
|
||||
gathering = smart
|
||||
fact_caching = jsonfile
|
||||
fact_caching_connection = /tmp/ansible_facts
|
||||
fact_caching_timeout = 86400
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### SSH Config Parser Issues
|
||||
|
||||
**Problem:** Hosts not appearing in inventory
|
||||
|
||||
**Solution:**
|
||||
- Check `~/.ssh/config` exists and is readable
|
||||
- Verify Host declarations are properly formatted
|
||||
- Run with `--list` to see parsed output
|
||||
- Check the inventory script itself for Python syntax errors: `python3 -m py_compile plugins/inventory/ssh_config_inventory.py`
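
To check which `Host` declarations a parser could plausibly pick up, the listing step can be sketched in a few lines of Python. This is a minimal sketch, not the logic of `ssh_config_inventory.py`: the function name `list_ssh_hosts` is hypothetical, and it assumes wildcard or negated patterns (`*`, `?`, `!`) are not inventory hosts.

```python
import shlex


def list_ssh_hosts(config_text: str) -> list[str]:
    """Return concrete Host aliases from ssh_config text, skipping patterns."""
    hosts = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # ssh_config accepts both "Host name" and "Host=name"
        parts = line.replace("=", " ", 1).split(None, 1)
        if len(parts) != 2 or parts[0].lower() != "host":
            continue
        for pattern in shlex.split(parts[1]):
            # Assumption: wildcard/negated patterns are defaults, not real hosts
            if not any(ch in pattern for ch in "*?!"):
                hosts.append(pattern)
    return hosts
```

Feeding it the contents of `~/.ssh/config` and comparing the result with `--list` output shows whether a host is missing because of its declaration or because of later categorization.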
|
||||
|
||||
**Problem:** Incorrect host categorization
|
||||
|
||||
**Solution:**
|
||||
- Review categorization logic in `_categorize_host()` method
|
||||
- Add custom categorization rules
|
||||
- Use static inventory for specific grouping needs
|
||||
|
||||
### Libvirt Inventory Issues
|
||||
|
||||
**Problem:** `python3-libvirt` not installed
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
# Debian/Ubuntu
|
||||
sudo apt-get install python3-libvirt
|
||||
|
||||
# RHEL/Rocky/Fedora
|
||||
sudo dnf install python3-libvirt
|
||||
```
|
||||
|
||||
**Problem:** Connection to hypervisor fails
|
||||
|
||||
**Solution:**
|
||||
- Verify SSH access to hypervisor: `ssh grok@grok.home.serneels.xyz`
|
||||
- Check libvirt URI: `virsh -c qemu+ssh://grok@grok.home.serneels.xyz/system list`
|
||||
- Ensure SSH keys are properly configured
|
||||
- Check SSH agent forwarding if needed
|
||||
|
||||
**Problem:** VMs discovered but no IP addresses
|
||||
|
||||
**Solution:**
|
||||
- VMs may not have DHCP leases yet
|
||||
- Check VM is fully booted: `virsh dominfo <vm_name>`
|
||||
- Manually query: `virsh domifaddr <vm_name>`
|
||||
- Use static inventory with known IP addresses
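
When querying addresses manually, the table printed by `virsh domifaddr` can be turned into structured data. The sketch below assumes the usual four-column layout (Name, MAC address, Protocol, Address); `parse_domifaddr` is a hypothetical helper, not a function of `libvirt_kvm.py`.

```python
def parse_domifaddr(output: str) -> list[dict[str, str]]:
    """Parse the table printed by `virsh domifaddr` into one dict per row."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have exactly four columns; the header ("MAC address" is
        # two words) and the dashed rule do not.
        if len(fields) != 4:
            continue
        name, mac, protocol, cidr = fields
        address, _, prefix = cidr.partition("/")
        rows.append({
            "interface": name,
            "mac": mac,
            "protocol": protocol,
            "address": address,
            "prefix": prefix,
        })
    return rows
```

An empty result for a running VM usually means there is no DHCP lease yet, which matches the first bullet above.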
|
||||
|
||||
### Static Inventory Issues
|
||||
|
||||
**Problem:** YAML syntax errors
|
||||
|
||||
**Solution:**
|
||||
- Validate YAML syntax: `yamllint inventories/development/hosts.yml`
|
||||
- Check indentation (use 2 spaces)
|
||||
- Verify with: `ansible-inventory -i inventories/development/hosts.yml --list`
|
||||
|
||||
**Problem:** Variables not being applied
|
||||
|
||||
**Solution:**
|
||||
- Check variable precedence: host variables override group variables, and extra vars passed with `-e` override both
|
||||
- Verify `group_vars/` and `host_vars/` file names match group/host names
|
||||
- Use `ansible-inventory --host <hostname>` to debug variable merging
|
||||
|
||||
### General Debugging
|
||||
|
||||
```bash
|
||||
# Verify inventory parsing
|
||||
ansible-inventory -i <inventory_source> --list
|
||||
|
||||
# Check specific host variables
|
||||
ansible-inventory -i <inventory_source> --host <hostname>
|
||||
|
||||
# Graph inventory structure
|
||||
ansible-inventory -i <inventory_source> --graph
|
||||
|
||||
# Test connectivity
|
||||
ansible all -i <inventory_source> -m ping -vvv
|
||||
|
||||
# Dry run playbook
|
||||
ansible-playbook -i <inventory_source> site.yml --check --diff
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Security Considerations
|
||||
|
||||
### SSH Config Parser
|
||||
- ✅ No credentials stored in inventory
|
||||
- ✅ Uses existing SSH configuration
|
||||
- ⚠️ Ensure `~/.ssh/config` has proper permissions (600)
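
The permission requirement can be checked programmatically before an inventory run. This is a small sketch under the assumption that "proper permissions" means no access for group or others; both function names are hypothetical.

```python
import os
import stat


def is_private_mode(mode: int) -> bool:
    """True when neither group nor others have any permission bits set."""
    return (stat.S_IMODE(mode) & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def ssh_config_is_private(path: str = "~/.ssh/config") -> bool:
    """Check the permission bits of an SSH client config file on disk."""
    return is_private_mode(os.stat(os.path.expanduser(path)).st_mode)
```

Mode `600` passes this check, while `644` or `660` does not.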
|
||||
|
||||
### Libvirt Inventory
|
||||
- ✅ Uses SSH key authentication
|
||||
- ✅ No passwords in configuration
|
||||
- ⚠️ Requires SSH access to hypervisor
|
||||
- ⚠️ Libvirt connection string may be logged
|
||||
|
||||
### Static Inventory
|
||||
- ✅ Version controlled and auditable
|
||||
- ⚠️ Use Ansible Vault for sensitive variables
|
||||
- ⚠️ Never commit unencrypted credentials
|
||||
|
||||
### Best Practices
|
||||
- Use Ansible Vault for secrets: `ansible-vault encrypt group_vars/all/vault.yml`
|
||||
- Rotate SSH keys regularly (90-180 days per CLAUDE.md)
|
||||
- Use ProxyJump/bastion hosts for nested VM access
|
||||
- Enable SSH ControlMaster for connection reuse
|
||||
- Implement inventory caching for large infrastructures
|
||||
|
||||
---
|
||||
|
||||
## Performance Optimization
|
||||
|
||||
### Caching
|
||||
Enable fact caching in `ansible.cfg`:
|
||||
```ini
|
||||
[defaults]
|
||||
gathering = smart
|
||||
fact_caching = jsonfile
|
||||
fact_caching_connection = /tmp/ansible_facts
|
||||
fact_caching_timeout = 86400
|
||||
```
|
||||
|
||||
### Parallelism
|
||||
Raise the fork count (Ansible's default is 5) to run against more hosts in parallel:
|
||||
```ini
|
||||
[defaults]
|
||||
forks = 20
|
||||
```
|
||||
|
||||
### SSH Connection Reuse
|
||||
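Configure ControlMaster in `~/.ssh/config`. SSH does not create the socket directory, so create it first with `mkdir -p ~/.ssh/sockets && chmod 700 ~/.ssh/sockets`: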
|
||||
```
|
||||
Host *
|
||||
ControlMaster auto
|
||||
ControlPath ~/.ssh/sockets/%r@%h-%p
|
||||
ControlPersist 600s
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
- [Ansible Dynamic Inventory](https://docs.ansible.com/ansible/latest/user_guide/intro_dynamic_inventory.html)
|
||||
- [Libvirt Python API](https://libvirt.org/python.html)
|
||||
- [SSH Config Documentation](https://man.openbsd.org/ssh_config)
|
||||
- [CLAUDE.md Guidelines](/opt/ansible/CLAUDE.md)
|
||||
|
||||
---
|
||||
|
||||
**Document Version:** 1.0.0
|
||||
**Last Updated:** 2025-11-10
|
||||
**Maintainer:** Ansible Infrastructure Team
|
||||
944
docs/linux-vm-deployment.md
Normal file
@@ -0,0 +1,944 @@
|
||||
# Multi-Distribution Linux VM Deployment Documentation
|
||||
|
||||
## Overview
|
||||
|
||||
This document describes the automated deployment process for multiple Linux distributions on KVM/libvirt hypervisors. The deployment supports major server distributions including Debian, Ubuntu, RHEL, CentOS Stream, Rocky Linux, AlmaLinux, SLES, and openSUSE Leap.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [Supported Distributions](#supported-distributions)
|
||||
2. [Architecture](#architecture)
|
||||
3. [Prerequisites](#prerequisites)
|
||||
4. [Cloud Image Sources](#cloud-image-sources)
|
||||
5. [Deployment Process](#deployment-process)
|
||||
6. [Distribution-Specific Configuration](#distribution-specific-configuration)
|
||||
7. [Security Features](#security-features)
|
||||
8. [Post-Deployment](#post-deployment)
|
||||
9. [Troubleshooting](#troubleshooting)
|
||||
10. [Best Practices](#best-practices)
|
||||
|
||||
## Supported Distributions
|
||||
|
||||
### Debian Family
|
||||
|
||||
| Distribution | Version | Package Manager | Firewall | Cloud Image |
|
||||
|--------------|---------|----------------|----------|-------------|
|
||||
| Debian | 11 (Bullseye) | apt | ufw | ✅ Auto-download |
|
||||
| Debian | 12 (Bookworm) | apt | ufw | ✅ Auto-download |
|
||||
| Ubuntu | 20.04 LTS (Focal) | apt | ufw | ✅ Auto-download |
|
||||
| Ubuntu | 22.04 LTS (Jammy) | apt | ufw | ✅ Auto-download |
|
||||
| Ubuntu | 24.04 LTS (Noble) | apt | ufw | ✅ Auto-download |
|
||||
|
||||
### RHEL Family
|
||||
|
||||
| Distribution | Version | Package Manager | Firewall | SELinux | Cloud Image |
|
||||
|--------------|---------|----------------|----------|---------|-------------|
|
||||
| RHEL | 8, 9 | dnf | firewalld | Enforcing | ⚠️ Manual download |
|
||||
| CentOS Stream | 8, 9 | dnf | firewalld | Enforcing | ✅ Auto-download |
|
||||
| Rocky Linux | 8, 9 | dnf | firewalld | Enforcing | ✅ Auto-download |
|
||||
| AlmaLinux | 8, 9 | dnf | firewalld | Enforcing | ✅ Auto-download |
|
||||
|
||||
### SUSE Family
|
||||
|
||||
| Distribution | Version | Package Manager | Firewall | Cloud Image |
|
||||
|--------------|---------|----------------|----------|-------------|
|
||||
| SLES | 15 | zypper | firewalld | ⚠️ Manual download |
|
||||
| openSUSE Leap | 15.5, 15.6 | zypper | firewalld | ✅ Auto-download |
|
||||
|
||||
**Legend:**
|
||||
- ✅ = Automatically downloaded from official repositories
|
||||
- ⚠️ = Requires subscription and manual download
|
||||
|
||||
## Architecture
|
||||
|
||||
### Multi-Distribution Support Design
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ Ansible Control Node │
|
||||
│ │
|
||||
│ ┌────────────────────────────────────────────────┐ │
|
||||
│ │ deploy-linux-vm.yml │ │
|
||||
│ │ │ │
|
||||
│ │ • Distribution Selection Logic │ │
|
||||
│ │ • Cloud Image Repository Map │ │
|
||||
│ │ • OS Family Detection │ │
|
||||
│ │ • Package Manager Adaptation │ │
|
||||
│ └────────────────────────────────────────────────┘ │
|
||||
└─────────────────────────────────────────────────────────┘
|
||||
│
|
||||
│ SSH
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ KVM Hypervisor (grokbox) │
|
||||
│ │
|
||||
│ ┌──────────────────────────────────────────────┐ │
|
||||
│ │ Cloud Image Cache │ │
|
||||
│ │ /var/lib/libvirt/images/ │ │
|
||||
│ │ ├─ debian-12-*.qcow2 │ │
|
||||
│ │ ├─ ubuntu-22.04-*.img │ │
|
||||
│ │ ├─ centos-stream-9-*.qcow2 │ │
|
||||
│ │ ├─ rocky-9-*.qcow2 │ │
|
||||
│ │ └─ almalinux-9-*.qcow2 │ │
|
||||
│ └──────────────────────────────────────────────┘ │
|
||||
│ │
|
||||
│ ┌──────────────────────────────────────────────┐ │
|
||||
│ │ libvirt/QEMU │ │
|
||||
│ │ │ │
|
||||
│ │ ┌──────────────┐ ┌──────────────┐ │ │
|
||||
│ │ │ Debian VM │ │ Ubuntu VM │ │ │
|
||||
│ │ │ ufw enabled │ │ ufw enabled │ │ │
|
||||
│ │ └──────────────┘ └──────────────┘ │ │
|
||||
│ │ │ │
|
||||
│ │ ┌──────────────┐ ┌──────────────┐ │ │
|
||||
│ │ │ Rocky VM │ │ Alma VM │ │ │
|
||||
│ │ │ SELinux=Enf │ │ SELinux=Enf │ │ │
|
||||
│ │ │ firewalld │ │ firewalld │ │ │
|
||||
│ │ └──────────────┘ └──────────────┘ │ │
|
||||
│ └──────────────────────────────────────────────┘ │
|
||||
└─────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### Deployment Workflow
|
||||
|
||||
```
|
||||
[Start]
|
||||
│
|
||||
▼
|
||||
[Validate Distribution Selection]
|
||||
│
|
||||
├─ Check distribution in supported list
|
||||
├─ Set distribution facts (family, package manager, etc.)
|
||||
└─ Display deployment configuration
|
||||
│
|
||||
▼
|
||||
[Pre-flight Checks]
|
||||
│
|
||||
├─ Verify VM doesn't already exist
|
||||
├─ Validate virtualization support
|
||||
└─ Install required packages on hypervisor
|
||||
│
|
||||
▼
|
||||
[Download Cloud Image]
|
||||
│
|
||||
├─ Check if image cached
|
||||
├─ Download from official repository
|
||||
└─ Verify checksum (SHA256/SHA512)
|
||||
│
|
||||
▼
|
||||
[Create VM Storage]
|
||||
│
|
||||
├─ Create qcow2 disk (CoW from base image)
|
||||
└─ Set proper permissions (libvirt-qemu/qemu)
|
||||
│
|
||||
▼
|
||||
[Generate Cloud-Init Configuration]
|
||||
│
|
||||
├─ Select template based on OS family:
|
||||
│ ├─ Debian/Ubuntu → apt, ufw, unattended-upgrades
|
||||
│ ├─ RHEL family → dnf, firewalld, SELinux, dnf-automatic
|
||||
│ └─ SUSE family → zypper, firewalld
|
||||
│
|
||||
├─ Create meta-data (hostname, instance-id)
|
||||
├─ Create user-data (users, packages, security)
|
||||
└─ Generate cloud-init ISO
|
||||
│
|
||||
▼
|
||||
[Deploy VM]
|
||||
│
|
||||
├─ Run virt-install with appropriate os-variant
|
||||
├─ Attach disk and cloud-init ISO
|
||||
└─ Start VM
|
||||
│
|
||||
▼
|
||||
[Wait for Boot]
|
||||
│
|
||||
├─ VM boots from qcow2 disk
|
||||
├─ Cloud-init runs configuration
|
||||
├─ Network configured via DHCP
|
||||
└─ Get IP address from libvirt
|
||||
│
|
||||
▼
|
||||
[Validation]
|
||||
│
|
||||
├─ Test SSH connectivity
|
||||
├─ Verify cloud-init completion
|
||||
├─ Display VM information
|
||||
└─ System health checks
|
||||
│
|
||||
▼
|
||||
[Cleanup]
|
||||
│
|
||||
└─ Remove temporary files
|
||||
│
|
||||
▼
|
||||
[Complete]
|
||||
```
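
The "Create VM Storage" step of the workflow above relies on a copy-on-write overlay. A minimal sketch of how such a command line could be assembled follows; the helper name and the example paths are assumptions, and the playbook may build the command differently.

```python
def overlay_disk_command(base_image: str, vm_disk: str, size_gb: int) -> list[str]:
    """Build the qemu-img argv for a qcow2 overlay backed by a cached image."""
    if size_gb <= 0:
        raise ValueError("size_gb must be positive")
    return [
        "qemu-img", "create",
        "-f", "qcow2",      # format of the new overlay disk
        "-F", "qcow2",      # format of the backing (base) image
        "-b", base_image,   # base image stays read-only; writes go to the overlay
        vm_disk,
        f"{size_gb}G",      # virtual size of the overlay
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on the hypervisor. Because the overlay only stores changed blocks, several VMs can share one cached cloud image.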
|
||||
|
||||
## Prerequisites
|
||||
|
||||
### Hypervisor Requirements
|
||||
|
||||
**Hardware:**
|
||||
- CPU with virtualization extensions (Intel VT-x or AMD-V)
|
||||
- Sufficient RAM for host + guest VMs
|
||||
- Adequate storage for cloud images and VM disks
|
||||
|
||||
**Software:**
|
||||
|
||||
**For Debian/Ubuntu hypervisors:**
|
||||
```bash
|
||||
apt install -y \
|
||||
libvirt-daemon-system \
|
||||
libvirt-clients \
|
||||
virtinst \
|
||||
qemu-kvm \
|
||||
qemu-utils \
|
||||
cloud-image-utils \
|
||||
genisoimage \
|
||||
python3-libvirt
|
||||
```
|
||||
|
||||
**For RHEL/CentOS/Rocky/Alma hypervisors:**
|
||||
```bash
|
||||
dnf install -y \
|
||||
libvirt \
|
||||
libvirt-client \
|
||||
virt-install \
|
||||
qemu-kvm \
|
||||
qemu-img \
|
||||
cloud-utils \
|
||||
genisoimage \
|
||||
python3-libvirt
|
||||
```
|
||||
|
||||
**Services:**
|
||||
```bash
|
||||
systemctl enable --now libvirtd
|
||||
```
|
||||
|
||||
### Network Requirements
|
||||
|
||||
- Internet connectivity for cloud image downloads
|
||||
- DNS resolution working
|
||||
- libvirt default network active:
|
||||
```bash
|
||||
virsh net-list
|
||||
virsh net-start default # if not started
|
||||
virsh net-autostart default
|
||||
```
|
||||
|
||||
### Ansible Control Node
|
||||
|
||||
- Ansible 2.9 or newer
|
||||
- SSH access to hypervisor
|
||||
- Python 3.x installed
|
||||
|
||||
## Cloud Image Sources
|
||||
|
||||
### Official Repositories
|
||||
|
||||
**Debian:**
|
||||
- URL: https://cloud.debian.org/images/cloud/
|
||||
- Format: qcow2
|
||||
- Checksum: SHA512SUMS provided
|
||||
- Update Frequency: Regular (stable releases)
|
||||
|
||||
**Ubuntu:**
|
||||
- URL: https://cloud-images.ubuntu.com/
|
||||
- Format: img (qcow2 format despite the `.img` extension)
|
||||
- Checksum: SHA256SUMS provided
|
||||
- Update Frequency: Daily builds available
|
||||
|
||||
**CentOS Stream:**
|
||||
- URL: https://cloud.centos.org/centos/
|
||||
- Format: qcow2
|
||||
- Checksum: SHA256 CHECKSUM file
|
||||
- Update Frequency: Regular updates
|
||||
|
||||
**Rocky Linux:**
|
||||
- URL: https://download.rockylinux.org/pub/rocky/
|
||||
- Format: qcow2
|
||||
- Checksum: SHA256 CHECKSUM file
|
||||
- Update Frequency: Regular with point releases
|
||||
|
||||
**AlmaLinux:**
|
||||
- URL: https://repo.almalinux.org/almalinux/
|
||||
- Format: qcow2
|
||||
- Checksum: SHA256 CHECKSUM file
|
||||
- Update Frequency: Regular with point releases
|
||||
|
||||
**openSUSE Leap:**
|
||||
- URL: https://download.opensuse.org/distribution/leap/
|
||||
- Format: qcow2
|
||||
- Checksum: SHA256 per-file
|
||||
- Update Frequency: Per release cycle
|
||||
|
||||
### Manual Download Required
|
||||
|
||||
**Red Hat Enterprise Linux (RHEL):**
|
||||
- Requires: Red Hat subscription
|
||||
- Portal: https://access.redhat.com/downloads/
|
||||
- Steps:
|
||||
1. Log in to Red Hat Customer Portal
|
||||
2. Navigate to Downloads
|
||||
3. Select "Red Hat Enterprise Linux"
|
||||
4. Download KVM Guest Image
|
||||
5. Place at: `/var/lib/libvirt/images/rhel-X-x86_64-kvm.qcow2`
|
||||
|
||||
**SUSE Linux Enterprise Server (SLES):**
|
||||
- Requires: SUSE subscription
|
||||
- Portal: https://scc.suse.com/
|
||||
- Steps:
|
||||
1. Log in to SUSE Customer Center
|
||||
2. Download cloud image for SLES 15
|
||||
3. Place at: `/var/lib/libvirt/images/sles-15-genericcloud-amd64.qcow2`
|
||||
|
||||
## Deployment Process
|
||||
|
||||
### Basic Deployment
|
||||
|
||||
```bash
|
||||
ansible-playbook plays/deploy-linux-vm.yml \
|
||||
-e "os_distribution=<distro-version>" \
|
||||
-e "vm_name=<name>"
|
||||
```
|
||||
|
||||
### Distribution Selection
|
||||
|
||||
The `os_distribution` variable determines which Linux distribution to deploy. Format: `distro-version`
|
||||
|
||||
**Examples:**
|
||||
```bash
|
||||
# Debian
|
||||
-e "os_distribution=debian-12"
|
||||
|
||||
# Ubuntu
|
||||
-e "os_distribution=ubuntu-22.04"
|
||||
|
||||
# CentOS Stream
|
||||
-e "os_distribution=centos-stream-9"
|
||||
|
||||
# Rocky Linux
|
||||
-e "os_distribution=rocky-9"
|
||||
|
||||
# AlmaLinux
|
||||
-e "os_distribution=almalinux-9"
|
||||
|
||||
# openSUSE
|
||||
-e "os_distribution=opensuse-leap-15.6"
|
||||
```
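
The `distro-version` format can be validated before a deployment starts. The sketch below splits the identifier at the last hyphen and looks up the OS family and package manager from the Supported Distributions tables; the function name and the family labels (`debian`, `rhel`, `suse`) are assumptions and may not match the playbook's internal facts.

```python
# Family and package-manager mapping taken from the Supported Distributions tables
OS_FAMILIES = {
    "debian": "debian", "ubuntu": "debian",
    "rhel": "rhel", "centos-stream": "rhel", "rocky": "rhel", "almalinux": "rhel",
    "sles": "suse", "opensuse-leap": "suse",
}
PACKAGE_MANAGERS = {"debian": "apt", "rhel": "dnf", "suse": "zypper"}


def parse_os_distribution(identifier: str) -> dict[str, str]:
    """Split 'distro-version' at the last hyphen and attach family facts."""
    distro, sep, version = identifier.rpartition("-")
    if not sep or not version or distro not in OS_FAMILIES:
        raise ValueError(f"unsupported os_distribution: {identifier!r}")
    family = OS_FAMILIES[distro]
    return {
        "distro": distro,
        "version": version,
        "family": family,
        "package_manager": PACKAGE_MANAGERS[family],
    }
```

Splitting at the last hyphen keeps multi-part names such as `centos-stream-9` and `opensuse-leap-15.6` intact, and a bare `debian` is rejected because it carries no version.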
|
||||
|
||||
### Resource Customization
|
||||
|
||||
```bash
|
||||
ansible-playbook plays/deploy-linux-vm.yml \
|
||||
-e "os_distribution=rocky-9" \
|
||||
-e "vm_name=database-server" \
|
||||
-e "vm_hostname=db01" \
|
||||
-e "vm_domain=production.local" \
|
||||
-e "vm_vcpus=8" \
|
||||
-e "vm_memory_mb=16384" \
|
||||
-e "vm_disk_size_gb=200"
|
||||
```
|
||||
|
||||
### Configuration Variables
|
||||
|
||||
| Variable | Type | Default | Description |
|
||||
|----------|------|---------|-------------|
|
||||
| `os_distribution` | String | debian-12 | Distribution identifier in `distro-version` form; set it explicitly for every deployment |
|
||||
| `vm_name` | String | linux-guest | VM name in libvirt |
|
||||
| `vm_hostname` | String | linux-vm | Guest hostname |
|
||||
| `vm_domain` | String | localdomain | DNS domain |
|
||||
| `vm_vcpus` | Integer | 2 | Number of virtual CPUs |
|
||||
| `vm_memory_mb` | Integer | 2048 | RAM in megabytes |
|
||||
| `vm_disk_size_gb` | Integer | 20 | Disk size in gigabytes |
|
||||
| `vm_network` | String | default | Libvirt network name |
|
||||
| `vm_bridge` | String | virbr0 | Bridge interface |
|
||||
| `ansible_user_ssh_key` | String | (preset) | SSH public key for ansible user |
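
To show how these variables could map onto a `virt-install` invocation, here is a sketch that uses the defaults from the table. The function name, the disk and ISO paths, and the `os_variant` value are assumptions (valid variants are listed by `osinfo-query os`); the playbook's real command line may differ.

```python
def virt_install_command(*, disk_path, seed_iso, os_variant,
                         vm_name="linux-guest", vcpus=2,
                         memory_mb=2048, network="default"):
    """Build a virt-install argv that imports an existing cloud-image disk."""
    return [
        "virt-install",
        "--name", vm_name,
        "--vcpus", str(vcpus),
        "--memory", str(memory_mb),                           # MiB
        "--disk", f"path={disk_path},format=qcow2,bus=virtio",
        "--disk", f"path={seed_iso},device=cdrom",            # cloud-init seed ISO
        "--network", f"network={network},model=virtio",
        "--os-variant", os_variant,
        "--import",                                           # boot the disk, no installer
        "--noautoconsole",
    ]
```

Keeping the command as a list rather than a shell string avoids quoting problems when VM names or paths contain unusual characters.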
|
||||
|
||||
## Distribution-Specific Configuration
|
||||
|
||||
### Debian/Ubuntu Systems
|
||||
|
||||
**Package Manager:** apt
|
||||
|
||||
**Cloud-Init Packages:**
|
||||
- sudo, vim, htop, tmux, curl, wget, rsync, git
|
||||
- python3, python3-pip, jq, bc
|
||||
- aide, auditd, chrony, ufw, lvm2
|
||||
- cloud-guest-utils, parted
|
||||
- unattended-upgrades, apt-listchanges
|
||||
|
||||
**Security Configuration:**
|
||||
- **Firewall:** ufw enabled, SSH allowed
|
||||
- **SSH:** Root login disabled, key-only auth
|
||||
- **Updates:** unattended-upgrades for security updates
|
||||
- **Audit:** auditd enabled
|
||||
- **Time Sync:** chrony configured
|
||||
|
||||
**User Management:**
|
||||
- ansible user → member of `sudo` group
|
||||
- Passwordless sudo access
|
||||
|
||||
**Automatic Updates Configuration:**
|
||||
```
|
||||
/etc/apt/apt.conf.d/50unattended-upgrades
|
||||
/etc/apt/apt.conf.d/20auto-upgrades
|
||||
```
|
||||
|
||||
**Post-Boot Commands:**
|
||||
```bash
|
||||
systemctl enable ssh && systemctl restart ssh
|
||||
systemctl enable chrony && systemctl start chrony
|
||||
ufw --force enable && ufw allow ssh
|
||||
systemctl enable auditd && systemctl start auditd
|
||||
growpart /dev/vda 1 && resize2fs /dev/vda1
|
||||
```
|
||||
|
||||
### RHEL Family Systems
|
||||
|
||||
**Package Manager:** dnf
|
||||
|
||||
**Cloud-Init Packages:**
|
||||
- sudo, vim, htop, tmux, curl, wget, rsync, git
|
||||
- python3, python3-pip, jq, bc
|
||||
- aide, audit, chrony, firewalld, lvm2
|
||||
- cloud-utils-growpart, gdisk
|
||||
- dnf-automatic
|
||||
- policycoreutils-python-utils
|
||||
|
||||
**Security Configuration:**
|
||||
- **Firewall:** firewalld enabled, SSH service allowed
|
||||
- **SELinux:** Enforcing mode
|
||||
- **SSH:** Root login disabled, key-only auth
|
||||
- **Updates:** dnf-automatic for security updates
|
||||
- **Audit:** auditd enabled
|
||||
- **Time Sync:** chronyd configured
|
||||
|
||||
**User Management:**
|
||||
- ansible user → member of `wheel` group
|
||||
- Passwordless sudo access
|
||||
|
||||
**SELinux Configuration:**
|
||||
```bash
|
||||
setenforce 1
|
||||
sed -i 's/^SELINUX=.*/SELINUX=enforcing/' /etc/selinux/config
|
||||
```
|
||||
|
||||
**Automatic Updates Configuration:**
|
||||
```
|
||||
/etc/dnf/automatic.conf
|
||||
upgrade_type = security
|
||||
apply_updates = yes
|
||||
```
|
||||
|
||||
**Post-Boot Commands:**
|
||||
```bash
|
||||
systemctl enable sshd && systemctl restart sshd
|
||||
systemctl enable chronyd && systemctl start chronyd
|
||||
systemctl enable firewalld && systemctl start firewalld
|
||||
firewall-cmd --permanent --add-service=ssh && firewall-cmd --reload
|
||||
systemctl enable auditd && systemctl start auditd
|
||||
systemctl enable dnf-automatic.timer && systemctl start dnf-automatic.timer
|
||||
setenforce 1
|
||||
growpart /dev/vda 1 && xfs_growfs /
|
||||
```
|
||||
|
||||
### SUSE Family Systems
|
||||
|
||||
**Package Manager:** zypper
|
||||
|
||||
**Cloud-Init Packages:**
|
||||
- sudo, vim, htop, tmux, curl, wget, rsync, git
|
||||
- python3, python3-pip, jq, bc
|
||||
- aide, audit, chrony, firewalld, lvm2
|
||||
- cloud-utils-growpart, gdisk
|
||||
|
||||
**Security Configuration:**
|
||||
- **Firewall:** firewalld enabled, SSH service allowed
|
||||
- **SSH:** Root login disabled, key-only auth
|
||||
- **Audit:** auditd enabled
|
||||
- **Time Sync:** chronyd configured
|
||||
|
||||
**User Management:**
|
||||
- ansible user → member of `wheel` group
|
||||
- Passwordless sudo access
|
||||
|
||||
**Post-Boot Commands:**
|
||||
```bash
|
||||
systemctl enable sshd && systemctl restart sshd
|
||||
systemctl enable chronyd && systemctl start chronyd
|
||||
systemctl enable firewalld && systemctl start firewalld
|
||||
firewall-cmd --permanent --add-service=ssh && firewall-cmd --reload
|
||||
systemctl enable auditd && systemctl start auditd
|
||||
growpart /dev/vda 1 && xfs_growfs / || resize2fs /dev/vda1 || btrfs filesystem resize max /
|
||||
```
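
The per-family differences described in this section (admin group, package list, post-boot commands) can be captured when generating cloud-init user-data. The following is a sketch only, not the playbook's template: `render_user_data` is hypothetical, and it relies on cloud-init accepting JSON as YAML flow syntax, which avoids a YAML library dependency.

```python
import json

# Admin group per OS family, as listed under "User Management" above
ADMIN_GROUP = {"debian": "sudo", "rhel": "wheel", "suse": "wheel"}


def render_user_data(os_family, ssh_public_key, packages, runcmd):
    """Render a #cloud-config document for the ansible service account."""
    config = {
        "users": [{
            "name": "ansible",
            "groups": [ADMIN_GROUP[os_family]],
            "sudo": "ALL=(ALL) NOPASSWD:ALL",   # passwordless sudo
            "shell": "/bin/bash",
            "ssh_authorized_keys": [ssh_public_key],
        }],
        "packages": list(packages),
        "runcmd": list(runcmd),
    }
    # JSON is emitted after the mandatory "#cloud-config" first line
    return "#cloud-config\n" + json.dumps(config, indent=2) + "\n"
```

The same function then serves all three families; only the `os_family`, package list, and command list change between calls.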
|
||||
|
||||
## Security Features
|
||||
|
||||
### Universal Security Measures
|
||||
|
||||
All deployed VMs, regardless of distribution, include:
|
||||
|
||||
1. **User Security:**
|
||||
- Dedicated `ansible` service account
|
||||
- SSH key-based authentication only
|
||||
- Passwordless sudo (with logging)
|
||||
- Root SSH login disabled
|
||||
- Emergency console access available (password: ChangeMe123!)
|
||||
|
||||
2. **Network Security:**
|
||||
- Host-based firewall enabled and configured
|
||||
- SSH service allowed
|
||||
- Default deny policy for incoming traffic
|
||||
- Outgoing traffic allowed
|
||||
|
||||
3. **System Security:**
|
||||
- Audit daemon (auditd) enabled
|
||||
- Automatic security updates configured
|
||||
- Time synchronization enabled (chrony)
|
||||
- File integrity monitoring installed (AIDE)
|
||||
- Secure SSH configuration applied
|
||||
|
||||
4. **SSH Hardening:**
|
||||
```
|
||||
PermitRootLogin no
|
||||
PasswordAuthentication no
|
||||
PubkeyAuthentication yes
|
||||
MaxAuthTries 3
|
||||
MaxSessions 10
|
||||
ClientAliveInterval 300
|
||||
ClientAliveCountMax 2
|
||||
```
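
The hardening settings listed above can be verified against a guest's configuration text. This sketch compares keyword/value pairs and reports mismatches; `sshd_mismatches` is a hypothetical helper, it ignores `Match` blocks and `Include` files, and the effective configuration printed by `sshd -T` is a more reliable input than the raw file.

```python
# Expected values copied from the SSH Hardening list (keywords lowercased)
EXPECTED_SSHD = {
    "permitrootlogin": "no",
    "passwordauthentication": "no",
    "pubkeyauthentication": "yes",
    "maxauthtries": "3",
    "maxsessions": "10",
    "clientaliveinterval": "300",
    "clientalivecountmax": "2",
}


def sshd_mismatches(config_text: str) -> dict:
    """Return {keyword: (expected, found)} for settings that differ or are missing."""
    seen = {}
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        parts = line.split(None, 1)
        if len(parts) == 2:
            # sshd uses the first value it obtains for a keyword
            seen.setdefault(parts[0].lower(), parts[1].strip().lower())
    return {
        key: (want, seen.get(key))
        for key, want in EXPECTED_SSHD.items()
        if seen.get(key) != want
    }
```

An empty dictionary means every listed setting is present with the expected value.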
|
||||
|
||||
### Distribution-Specific Security
|
||||
|
||||
**RHEL Family:**
|
||||
- SELinux in enforcing mode
|
||||
- firewalld with rich rules support
|
||||
- dnf-automatic for security updates
|
||||
- Subscription management for certified packages (RHEL)
|
||||
|
||||
**Debian/Ubuntu:**
|
||||
- AppArmor profiles (Ubuntu)
|
||||
- UFW for simplified firewall management
|
||||
- unattended-upgrades for security updates
|
||||
- Automatic security patch installation
|
||||
|
||||
**SUSE Family:**
|
||||
- AppArmor support
|
||||
- firewalld with zones
|
||||
- YaST integration for security management
|
||||
|
||||
### Compliance Alignment
|
||||
|
||||
The deployment follows CLAUDE.md security principles:
|
||||
|
||||
✅ Principle of least privilege
|
||||
✅ Encryption in transit (SSH)
|
||||
✅ Key-based authentication
|
||||
✅ Automated security updates
|
||||
✅ System auditing enabled
|
||||
✅ Time synchronization
|
||||
✅ Firewall enabled by default
|
||||
✅ Regular security patching
|
||||
|
||||
## Post-Deployment
|
||||
|
||||
### Adding to Ansible Inventory
|
||||
|
||||
**Debian/Ubuntu VM:**
|
||||
```yaml
|
||||
debian_servers:
|
||||
hosts:
|
||||
debian12-vm:
|
||||
ansible_host: 192.168.122.X
|
||||
ansible_user: ansible
|
||||
ansible_ssh_common_args: '-o ProxyJump=grokbox -o StrictHostKeyChecking=accept-new'
|
||||
os_distribution: debian-12
|
||||
os_family: debian
|
||||
package_manager: apt
|
||||
```
|
||||
|
||||
**RHEL Family VM:**
|
||||
```yaml
|
||||
rhel_servers:
|
||||
hosts:
|
||||
rocky9-vm:
|
||||
ansible_host: 192.168.122.X
|
||||
ansible_user: ansible
|
||||
ansible_ssh_common_args: '-o ProxyJump=grokbox -o StrictHostKeyChecking=accept-new'
|
||||
os_distribution: rocky-9
|
||||
os_family: rhel
|
||||
package_manager: dnf
|
||||
selinux_mode: enforcing
|
||||
```
|
||||
|
||||
### Initial Configuration
|
||||
|
||||
After deployment, run configuration management:
|
||||
|
||||
```bash
|
||||
# Update system packages
|
||||
ansible <vm_name> -m package -a "name=* state=latest" -b
|
||||
|
||||
# Install additional packages
|
||||
ansible <vm_name> -m package -a "name=nginx state=present" -b
|
||||
|
||||
# Run configuration playbooks
|
||||
ansible-playbook -i inventories/development/hosts.yml \
|
||||
playbooks/configure-webserver.yml \
|
||||
-l <vm_name>
|
||||
```
|
||||
|
||||
### Verification Steps
|
||||
|
||||
1. **SSH Access:**
|
||||
```bash
|
||||
ssh -J grokbox ansible@<VM_IP>
|
||||
```
|
||||
|
||||
2. **Cloud-Init Status:**
|
||||
```bash
|
||||
cloud-init status --wait
|
||||
cloud-init status --long
|
||||
```
|
||||
|
||||
3. **System Information:**
|
||||
```bash
|
||||
cat /etc/os-release
|
||||
uname -r
|
||||
```
|
||||
|
||||
4. **Security Checks:**
|
||||
```bash
|
||||
# Firewall
|
||||
sudo ufw status verbose # Debian/Ubuntu
|
||||
sudo firewall-cmd --list-all # RHEL/SUSE
|
||||
|
||||
# SELinux (RHEL family)
|
||||
getenforce
|
||||
|
||||
# Audit daemon
|
||||
sudo systemctl status auditd
|
||||
|
||||
# Automatic updates
|
||||
sudo systemctl status unattended-upgrades # Debian/Ubuntu
|
||||
sudo systemctl status dnf-automatic.timer # RHEL
|
||||
```
|
||||
|
||||
5. **Disk and Memory:**
|
||||
```bash
|
||||
df -h
|
||||
free -h
|
||||
lsblk
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Distribution Selection Issues
|
||||
|
||||
**Problem:** Invalid distribution error
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
# List supported distributions
|
||||
grep "^ " plays/deploy-linux-vm.yml | grep -E "debian-|ubuntu-|rhel-|centos-|rocky-|alma|sles-|opensuse"
|
||||
|
||||
# Use exact distribution identifier
|
||||
-e "os_distribution=debian-12" # Correct
|
||||
-e "os_distribution=debian" # Wrong
|
||||
```
|
||||
|
||||
### Cloud Image Download Failures
|
||||
|
||||
**Problem:** Image download fails or times out
|
||||
|
||||
**Causes:**
|
||||
- Network connectivity issues
|
||||
- Repository temporarily unavailable
|
||||
- Proxy configuration needed
|
||||
|
||||
**Solutions:**
|
||||
```bash
|
||||
# Test connectivity
|
||||
curl -I https://cloud.debian.org/
|
||||
curl -I https://cloud-images.ubuntu.com/
|
||||
|
||||
# Manual download
|
||||
cd /var/lib/libvirt/images
|
||||
wget <cloud_image_url>
|
||||
|
||||
# Configure proxy (if needed)
|
||||
export https_proxy=http://proxy:port
|
||||
```
|
||||
|
||||
### Checksum Verification Failures
|
||||
|
||||
**Problem:** Checksum verification fails
|
||||
|
||||
**Causes:**
|
||||
- Corrupt download
|
||||
- Mismatch between image and checksum file
|
||||
- Wrong checksum type
|
||||
|
||||
**Solutions:**
|
||||
```bash
|
||||
# Re-download image
|
||||
rm /var/lib/libvirt/images/<image-name>
|
||||
ansible-playbook plays/deploy-linux-vm.yml -e "os_distribution=..." -t download
|
||||
|
||||
# Verify manually
|
||||
cd /var/lib/libvirt/images
|
||||
sha256sum <image-name>
|
||||
# Compare with checksum file
|
||||
```
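
The manual comparison can also be scripted. The sketch below handles GNU-style checksum files (`<hex>  <name>` or `<hex> *<name>`, as in `SHA256SUMS` and `SHA512SUMS`); it does not parse BSD-style lines of the form `SHA256 (file) = <hex>`, which are simply skipped. Both function names are hypothetical.

```python
import hashlib
import string


def parse_checksum_file(text: str) -> dict[str, str]:
    """Map filename -> lowercase hex digest for GNU-style checksum lines."""
    digests = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2 or not all(c in string.hexdigits for c in parts[0]):
            continue  # skips comments and BSD-style lines
        # A leading "*" marks binary mode and is not part of the filename
        digests[parts[1].strip().lstrip("*")] = parts[0].lower()
    return digests


def file_digest(path: str, algorithm: str = "sha256") -> str:
    """Hash a file in 1 MiB chunks so large images are not read into memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A download is intact when `file_digest(image_path, "sha256")` equals the entry for that image's filename in the parsed checksum file; pass `"sha512"` for Debian's `SHA512SUMS`.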
|
||||
|
||||
### VM Boot Issues
|
||||
|
||||
**Problem:** VM created but won't boot or get IP
|
||||
|
||||
**Causes:**
|
||||
- Cloud-init configuration error
|
||||
- Network misconfiguration
|
||||
- Insufficient resources
|
||||
|
||||
**Solutions:**
|
||||
```bash
|
||||
# Check VM status
|
||||
virsh list --all
|
||||
virsh dominfo <vm_name>
|
||||
|
||||
# View console
|
||||
virsh console <vm_name>
|
||||
|
||||
# Check cloud-init logs (via console)
|
||||
tail -f /var/log/cloud-init-output.log
|
||||
journalctl -u cloud-init
|
||||
|
||||
# Restart VM
|
||||
virsh destroy <vm_name>
|
||||
virsh start <vm_name>
|
||||
```
|
||||
|
||||
### SSH Connection Issues
|
||||
|
||||
**Problem:** Cannot SSH to deployed VM
|
||||
|
||||
**Causes:**
|
||||
- SSH key not configured correctly
|
||||
- Firewall blocking
|
||||
- cloud-init not completed
|
||||
- Wrong IP address
|
||||
|
||||
**Solutions:**
|
||||
```bash
|
||||
# Verify IP address
|
||||
virsh domifaddr <vm_name>
|
||||
|
||||
# Test connectivity
|
||||
ping <VM_IP>
|
||||
|
||||
# Check SSH service via console
|
||||
virsh console <vm_name>
|
||||
# Then: systemctl status ssh    # Debian/Ubuntu; the unit is named sshd on RHEL/SUSE
|
||||
|
||||
# Verify firewall
|
||||
# Via console:
|
||||
sudo ufw status # Debian/Ubuntu
|
||||
sudo firewall-cmd --list-all # RHEL/SUSE
|
||||
|
||||
# Check cloud-init completion
|
||||
# Via console:
|
||||
cloud-init status --wait
|
||||
```
|
||||
|
||||
### SELinux Issues (RHEL Family)
|
||||
|
||||
**Problem:** Services failing due to SELinux denials
|
||||
|
||||
**Solutions:**
|
||||
```bash
|
||||
# Check SELinux status
|
||||
getenforce
|
||||
sestatus
|
||||
|
||||
# View denials
|
||||
sudo ausearch -m avc -ts recent
|
||||
|
||||
# Temporarily set to permissive (troubleshooting only)
|
||||
sudo setenforce 0
|
||||
|
||||
# Generate policy from denials
|
||||
sudo ausearch -m avc -ts recent | audit2allow -M myapp
|
||||
sudo semodule -i myapp.pp
|
||||
|
||||
# Re-enable enforcing
|
||||
sudo setenforce 1
|
||||
```
|
||||
|
||||
### Package Manager Issues
|
||||
|
||||
**Debian/Ubuntu:**
|
||||
```bash
|
||||
# Update package cache
|
||||
sudo apt update
|
||||
|
||||
# Fix broken packages
|
||||
sudo apt --fix-broken install
|
||||
|
||||
# Clear cache
|
||||
sudo apt clean
|
||||
```
|
||||
|
||||
**RHEL Family:**
|
||||
```bash
|
||||
# Update metadata
|
||||
sudo dnf makecache
|
||||
|
||||
# Check for problems
|
||||
sudo dnf check
|
||||
|
||||
# Clean cache
|
||||
sudo dnf clean all
|
||||
```
|
||||
|
||||
**SUSE:**
|
||||
```bash
|
||||
# Refresh repositories
|
||||
sudo zypper refresh
|
||||
|
||||
# Verify
|
||||
sudo zypper verify
|
||||
|
||||
# Clean cache
|
||||
sudo zypper clean
|
||||
```
|
||||
|
||||
## Best Practices
|
||||
|
||||
### Distribution Selection
|
||||
|
||||
1. **Use LTS versions for production:**
|
||||
- Ubuntu 22.04 LTS (support until 2027)
|
||||
- Ubuntu 24.04 LTS (support until 2029)
|
||||
- RHEL/Rocky/Alma 9 (support until 2032)
|
||||
|
||||
2. **Match distribution to workload:**
|
||||
- Web servers: Ubuntu, Debian
|
||||
- Enterprise applications: RHEL, Rocky Linux, AlmaLinux
|
||||
- Container hosts: CentOS Stream, Rocky Linux
|
||||
- Development: Ubuntu, Debian, openSUSE
|
||||
|
||||
3. **Consider support requirements:**
|
||||
- Commercial support: RHEL, SLES
|
||||
- Community support: CentOS Stream, Rocky Linux, AlmaLinux, Debian, Ubuntu, openSUSE
|
||||
|
||||
### Resource Allocation
|
||||
|
||||
**Minimum Requirements:**
|
||||
- 1 vCPU, 1GB RAM, 10GB disk (testing only)
|
||||
|
||||
**Recommended for Production:**
|
||||
- 2+ vCPUs, 2GB+ RAM, 20GB+ disk
|
||||
|
||||
**Workload-Specific:**
|
||||
```
|
||||
Web Server: 2-4 vCPUs, 4GB RAM, 40GB disk
|
||||
Database Server: 4-8 vCPUs, 16GB RAM, 100GB+ disk
|
||||
Application Server: 4-8 vCPUs, 8GB RAM, 80GB disk
|
||||
Container Host: 4-8 vCPUs, 16GB RAM, 80GB disk
|
||||
Development: 2-4 vCPUs, 8GB RAM, 50GB disk
|
||||
```
|
||||
|
||||
### Security Hardening
|
||||
|
||||
1. **Change default passwords immediately:**
|
||||
```bash
|
||||
sudo passwd root # Change from ChangeMe123!
|
||||
```
|
||||
|
||||
2. **Configure proper SSH keys:**
|
||||
- Use dedicated key per environment
|
||||
- Rotate keys regularly (90-180 days)
|
||||
- Use Ed25519 keys when possible
|
||||
|
||||
3. **Enable additional security features:**
|
||||
- CIS benchmarks scanning
|
||||
- Intrusion detection (fail2ban, OSSEC)
|
||||
- Log forwarding to SIEM
|
||||
- Vulnerability scanning
|
||||
|
||||
4. **Regular updates:**
|
||||
- Monitor automatic update logs
|
||||
- Schedule manual updates for major versions
|
||||
- Test updates in staging first
|
||||
|
||||
### Operational Excellence

1. **Naming Conventions:**
   - Use descriptive, meaningful VM names
   - Include purpose and environment: `web-prod-01`, `db-dev-01`
   - Document the naming scheme

2. **Inventory Management:**
   - Keep the Ansible inventory up to date
   - Document VM purpose and owner
   - Track the VM lifecycle

3. **Monitoring:**
   - Set up monitoring for all VMs
   - Configure alerting for critical issues
   - Monitor resource usage trends

4. **Backup Strategy:**
   - Take regular VM backups or disk snapshots
   - Test restore procedures
   - Document the backup retention policy

5. **Documentation:**
   - Document VM purpose and configuration
   - Maintain runbooks for common tasks
   - Keep network diagrams current

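A naming convention is easier to keep when it is checked automatically before a VM is created. The sketch below validates names of the form `<purpose>-<environment>-<NN>`, matching the `web-prod-01` and `db-dev-01` examples above. The function name `valid_vm_name` and the accepted environment list (`prod`, `staging`, `dev`, `test`) are assumptions, not part of the documented scheme.

```bash
# Hypothetical validator for VM names such as web-prod-01 or db-dev-01.
# The set of accepted environments is an assumption; adjust it to the
# naming scheme actually documented for the site.
valid_vm_name() {
  printf '%s\n' "$1" |
    grep -Eq '^[a-z][a-z0-9]*-(prod|staging|dev|test)-[0-9]{2}$'
}

# Example usage: check a few candidate names.
for name in web-prod-01 db-dev-01 WebServer1; do
  if valid_vm_name "$name"; then
    echo "ok:      $name"
  else
    echo "invalid: $name"
  fi
done
```

A check like this could run at the start of a deployment script so that a badly named VM never reaches the inventory.
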
### Performance Optimization

1. **Disk I/O:**
   - Use virtio drivers (already configured)
   - Consider a separate disk for databases
   - Use an appropriate filesystem (XFS for RHEL, ext4 for Debian)

2. **Network:**
   - Use the virtio network driver (already configured)
   - Consider SR-IOV for high-performance needs
   - Monitor network latency

3. **CPU:**
   - Right-size vCPU allocation
   - Avoid overcommitment on critical VMs
   - Use CPU pinning for performance-critical workloads

4. **Memory:**
   - Allocate sufficient RAM to avoid swapping
   - Monitor memory usage
   - Consider huge pages for databases

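When huge pages are considered for a database guest, the hypervisor has to reserve enough pages to back the guest's memory. The arithmetic can be sketched as follows, assuming the 2 MiB huge page size that is the default on x86_64. The function name `hugepages_for_mib` is hypothetical.

```bash
# Hypothetical helper: number of 2 MiB huge pages needed to back a guest
# with the given amount of RAM in MiB. Assumes 2 MiB pages (x86_64 default).
hugepages_for_mib() {
  local guest_mib=$1
  local page_mib=2
  # Round up so the guest memory is fully covered
  echo $(( (guest_mib + page_mib - 1) / page_mib ))
}

# Example usage: a 16 GiB database guest from the sizing table.
pages=$(hugepages_for_mib 16384)
echo "Reserve ${pages} huge pages for a 16 GiB guest"

# On the hypervisor the reservation could then be applied with, for example:
#   sudo sysctl -w vm.nr_hugepages="${pages}"
```

The reservation is host-wide, so the total should cover every guest that uses huge pages, and the guest definition itself also has to request huge page backing in libvirt.
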
## References

- [CLAUDE.md](../CLAUDE.md) - Infrastructure guidelines
- [Cheatsheet](../cheatsheets/deploy-linux-vm.md) - Quick reference
- [Debian Cloud Images](https://cloud.debian.org/images/cloud/)
- [Ubuntu Cloud Images](https://cloud-images.ubuntu.com/)
- [CentOS Stream](https://www.centos.org/centos-stream/)
- [Rocky Linux](https://rockylinux.org/)
- [AlmaLinux](https://almalinux.org/)
- [openSUSE](https://www.opensuse.org/)
- [cloud-init Documentation](https://cloudinit.readthedocs.io/)
- [libvirt Documentation](https://libvirt.org/docs.html)

---

**Document Version**: 1.0
**Last Updated**: 2025-11-10
**Maintained By**: Ansible Infrastructure Team