Add comprehensive documentation

- Add linux-vm-deployment.md with complete deployment guide
  - Architecture overview and security model
  - Supported distributions matrix
  - LVM partitioning specifications
  - Distribution-specific configurations
  - Troubleshooting procedures
  - Performance tuning guidelines
Infrastructure Team
2025-11-10 22:52:03 +01:00
parent 82796a18e4
commit 04a381e0d5
3 changed files with 2188 additions and 0 deletions

@@ -0,0 +1,728 @@
# Debian 12 VM Deployment Documentation
## Overview
This document describes the automated deployment process for Debian 12 virtual machines on the grokbox KVM/libvirt hypervisor. The deployment uses cloud-init for unattended configuration and follows the security-first principles outlined in CLAUDE.md.
## Table of Contents
1. [Architecture](#architecture)
2. [Prerequisites](#prerequisites)
3. [Deployment Process](#deployment-process)
4. [Configuration](#configuration)
5. [Security Features](#security-features)
6. [Post-Deployment](#post-deployment)
7. [Troubleshooting](#troubleshooting)
8. [Maintenance](#maintenance)
## Architecture
### Infrastructure Components
```
┌─────────────────────────────────────────────┐
│          grokbox (KVM Hypervisor)           │
│                                             │
│  ┌──────────────────────────────────────┐   │
│  │             libvirt/QEMU             │   │
│  │                                      │   │
│  │  ┌────────────────────────────────┐  │   │
│  │  │       Debian 12 Guest VM       │  │   │
│  │  │                                │  │   │
│  │  │  - 2 vCPUs / 2GB RAM           │  │   │
│  │  │  - 20GB qcow2 disk             │  │   │
│  │  │  - cloud-init configured       │  │   │
│  │  │  - ansible user ready          │  │   │
│  │  │  - Security hardened           │  │   │
│  │  └────────────────────────────────┘  │   │
│  │                                      │   │
│  │  Network: virbr0 (192.168.122.0/24)  │   │
│  └──────────────────────────────────────┘   │
└─────────────────────────────────────────────┘
```
### Deployment Workflow
```
[Ansible Control Node]
│ 1. SSH to grokbox
[grokbox hypervisor]
│ 2. Download Debian cloud image
├─ 3. Verify checksums
├─ 4. Create VM disk (qcow2)
├─ 5. Generate cloud-init ISO
├─ 6. Create VM with virt-install
│ 7. VM boots with cloud-init
[Debian 12 VM]
├─ 8. Create ansible user
├─ 9. Configure SSH
├─ 10. Install packages
├─ 11. Security hardening
└─ 12. System ready
```
## Prerequisites
### Hypervisor Requirements
On **grokbox**, ensure the following are present:
1. **Virtualization Support**
```bash
# Verify CPU virtualization
grep -Ec '(vmx|svm)' /proc/cpuinfo # Should be > 0
# Verify KVM module loaded
lsmod | grep kvm
```
2. **Required Packages**
- libvirt-daemon-system
- libvirt-clients
- virtinst
- qemu-kvm
- qemu-utils
- cloud-image-utils
- genisoimage
- python3-libvirt
3. **Sufficient Resources**
- Storage: ~25GB available in `/var/lib/libvirt/images/`
- Memory: Enough free RAM for VM allocation
- Network: libvirt default network configured
4. **libvirtd Service Running**
```bash
systemctl status libvirtd
```
### Ansible Control Node Requirements
1. Ansible 2.9 or newer
2. SSH access to grokbox hypervisor
3. SSH key configured for grok user
4. Python 3.x installed
### Network Requirements
- Connectivity to Debian cloud image repository
- DNS resolution working
- Default libvirt network (virbr0) configured and active
## Deployment Process
### Step 1: Pre-flight Checks
The playbook performs the following validations:
- **VM Name Uniqueness**: Ensures no VM with the same name exists
- **Virtualization Support**: Validates QEMU/KVM capabilities
- **Package Installation**: Installs required tools if missing
- **Service Status**: Verifies libvirtd is running
### Step 2: Image Management
#### Download Debian Cloud Image
- **Source**: https://cloud.debian.org/images/cloud/bookworm/latest/
- **Image**: debian-12-generic-amd64.qcow2
- **Cache Location**: `/var/lib/libvirt/images/debian-12-generic-amd64.qcow2`
- **Checksum Verification**: SHA512SUMS validated
The base image is downloaded once and cached for subsequent deployments.
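The checksum step can be sketched as follows. This is a minimal illustration, not the playbook's actual task: the function name `verify_image` is hypothetical, and it assumes the `SHA512SUMS` file published alongside the image has already been downloaded into the same directory as the image.

```bash
# Hypothetical helper: verify one image against a SHA512SUMS file
# located in the same directory.
# Usage: verify_image <directory> <image-filename>
verify_image() {
  local dir="$1" image="$2"
  (
    cd "$dir" || exit 1
    # SHA512SUMS lines look like "<hash>  <filename>". Keep only the line
    # for this image, then let sha512sum check it. With no matching line,
    # sha512sum receives empty input and exits non-zero.
    awk -v f="$image" '$2 == f' SHA512SUMS | sha512sum --check --status -
  )
}
```

For example, `verify_image /var/lib/libvirt/images debian-12-generic-amd64.qcow2` returns 0 only when the cached image matches its published checksum.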
#### Create VM Disk
A new copy-on-write (CoW) disk is created using qemu-img:
```bash
qemu-img create -f qcow2 \
-F qcow2 \
-b /var/lib/libvirt/images/debian-12-generic-amd64.qcow2 \
/var/lib/libvirt/images/debian12-guest.qcow2 \
20G
```
This creates a thin-provisioned disk backed by the cloud image.
### Step 3: Cloud-Init Configuration
Two configuration files are generated:
#### meta-data
```yaml
instance-id: debian12-guest
local-hostname: debian12
```
#### user-data
Comprehensive cloud-init configuration including:
- **User Management**: Creates ansible user with SSH keys
- **Security Configuration**: SSH hardening, firewall setup
- **Package Installation**: Essential and security packages
- **System Configuration**: Time sync, locale, timezone
- **Automatic Updates**: Unattended security upgrades
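As an illustration of what such a file might contain, the sketch below writes a minimal `user-data` file from the shell. It is not the template used by the playbook: the helper name `write_user_data`, the package selection, and the `UTC` timezone are assumptions made for the example. The cloud-config keys themselves (`users`, `ssh_pwauth`, `package_update`, `packages`, `timezone`) are standard cloud-init options.

```bash
# Hypothetical helper: write a minimal cloud-init user-data file.
# Usage: write_user_data <output-file> <ssh-public-key>
write_user_data() {
  local out="$1" ssh_key="$2"
  # The first line must be "#cloud-config" for cloud-init to parse the file.
  cat > "$out" <<EOF
#cloud-config
users:
  - name: ansible
    groups: sudo
    shell: /bin/bash
    sudo: "ALL=(ALL) NOPASSWD:ALL"
    lock_passwd: true
    ssh_authorized_keys:
      - ${ssh_key}
ssh_pwauth: false
package_update: true
packages:
  - ufw
  - unattended-upgrades
  - chrony
# Assumed value for the example; set to the site's timezone.
timezone: UTC
EOF
}
```

Calling `write_user_data user-data "ssh-ed25519 AAAA..."` produces a file that can be passed to the ISO generation step.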
#### ISO Generation
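The configuration files are packaged into a cloud-init seed ISO. The ISO is not bootable; cloud-init's NoCloud datasource finds it through the `cidata` volume label: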
```bash
genisoimage -output debian12-guest-cloud-init.iso \
-volid cidata -joliet -rock \
user-data meta-data
```
### Step 4: VM Creation
The VM is created using virt-install:
```bash
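# Note: --os-variant is set to debian11 below. If `osinfo-query os` on the
# hypervisor lists debian12, that value is the closer match for this guest.
virt-install \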
--name debian12-guest \
--memory 2048 \
--vcpus 2 \
--disk path=/var/lib/libvirt/images/debian12-guest.qcow2,format=qcow2,bus=virtio \
--disk path=/var/lib/libvirt/images/debian12-guest-cloud-init.iso,device=cdrom \
--network network=default,model=virtio \
--os-variant debian11 \
--graphics none \
--console pty,target_type=serial \
--import \
--noautoconsole
```
### Step 5: Boot and Initialization
1. **VM Boots**: Starts from the qcow2 disk
2. **Cloud-Init Runs**: Reads configuration from ISO
3. **System Configuration**: Applies all settings
4. **Network Configuration**: Obtains IP via DHCP
5. **Package Updates**: Downloads and installs updates
6. **Service Initialization**: Starts all configured services
**Typical boot time**: 60-90 seconds
### Step 6: Validation
The playbook validates:
- VM is running and accessible
- IP address assigned
- SSH port (22) accepting connections
- cloud-init completed successfully
- System resources available
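The SSH port check above can be approximated by hand with a small polling loop. This is a sketch under stated assumptions rather than the playbook's implementation: `wait_for_port` is a hypothetical helper, and the retry count and delay defaults are example values. It relies on bash's built-in `/dev/tcp` pseudo-device and the coreutils `timeout` command.

```bash
# Hypothetical helper: poll a TCP port until it accepts connections.
# Usage: wait_for_port <host> <port> [retries] [delay-seconds]
wait_for_port() {
  local host="$1" port="$2" retries="${3:-10}" delay="${4:-5}"
  local attempt
  for ((attempt = 1; attempt <= retries; attempt++)); do
    # Opening /dev/tcp/<host>/<port> makes bash attempt a TCP connection;
    # timeout bounds each attempt to 3 seconds.
    if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
      return 0
    fi
    # Pause between attempts, but not after the final one.
    if ((attempt < retries)); then
      sleep "$delay"
    fi
  done
  return 1
}
```

Run on grokbox, `wait_for_port 192.168.122.X 22` returns 0 as soon as the guest's SSH daemon is reachable and 1 if it never comes up within the retry budget.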
### Step 7: Post-Deployment Configuration
Optional second play that:
- Waits for cloud-init completion
- Gathers system facts
- Displays system information
- Validates disk and memory usage
## Configuration
### Default Configuration
```yaml
# VM Specifications
vm_name: "debian12-guest"
vm_hostname: "debian12"
vm_domain: "localdomain"
vm_vcpus: 2
vm_memory_mb: 2048
vm_disk_size_gb: 20
# Network
vm_network: "default"
vm_bridge: "virbr0"
# Storage
vm_disk_path: "/var/lib/libvirt/images/{{ vm_name }}.qcow2"
cloud_init_iso_path: "/var/lib/libvirt/images/{{ vm_name }}-cloud-init.iso"
```
### Customization Examples
#### High-Performance VM
```bash
ansible-playbook plays/deploy-debian12-vm.yml \
-e "vm_name=app-server" \
-e "vm_vcpus=8" \
-e "vm_memory_mb=16384" \
-e "vm_disk_size_gb=100"
```
#### Development VM
```bash
ansible-playbook plays/deploy-debian12-vm.yml \
-e "vm_name=dev-workstation" \
-e "vm_vcpus=4" \
-e "vm_memory_mb=8192" \
-e "vm_disk_size_gb=50" \
-e "vm_hostname=devbox" \
-e "vm_domain=dev.local"
```
#### Custom SSH Key
```bash
ansible-playbook plays/deploy-debian12-vm.yml \
-e "vm_name=secure-vm" \
-e "ansible_user_ssh_key='ssh-ed25519 AAAA...'"
```
### Variable Precedence
Variables are resolved in the following order of precedence, highest first:
1. **Command-line** (`-e` flag) - Highest
2. **Playbook vars section**
3. **Inventory host_vars**
4. **Inventory group_vars**
5. **Defaults in playbook** - Lowest
## Security Features
### User Management
- **ansible user**: Non-root service account
- Passwordless sudo access
- SSH key authentication only
- Member of sudo group
- Home directory: `/home/ansible`
- **root user**: Console access only
- SSH login disabled
- Password set for emergency console access
- Remote access blocked
### SSH Hardening
Configuration in `/etc/ssh/sshd_config.d/99-security.conf`:
```
PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes
MaxAuthTries 3
MaxSessions 10
ClientAliveInterval 300
ClientAliveCountMax 2
```
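One way to confirm that the drop-in file is actually in effect is to compare the effective configuration printed by `sshd -T` with the expected values. The helper below is a sketch for that purpose; the name `check_sshd_hardening` is hypothetical, and it checks only a subset of the settings listed above.

```bash
# Hypothetical helper: read `sshd -T` output on stdin and confirm that
# the hardening settings are present in the effective configuration.
check_sshd_hardening() {
  local config expected rc=0
  config="$(cat)"
  for expected in \
    "permitrootlogin no" \
    "passwordauthentication no" \
    "pubkeyauthentication yes" \
    "maxauthtries 3"; do
    # sshd -T prints one lowercase "keyword value" pair per line;
    # -x requires the whole line to match, -i ignores case.
    if ! grep -qix -- "$expected" <<<"$config"; then
      echo "MISSING: $expected" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

On the VM, this would be used as `sudo sshd -T | check_sshd_hardening`; any setting that is not in effect is reported on stderr.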
### Firewall Configuration
- **UFW (Uncomplicated Firewall)** enabled by default
- Default policy: deny incoming, allow outgoing
- SSH (port 22) allowed
- Additional rules can be added post-deployment
### Automatic Security Updates
Unattended-upgrades configured for:
- Automatic installation of security updates
- Daily update checks
- Automatic cleanup of old kernels
- Email notifications (if configured)
- **No automatic reboot** (requires manual intervention)
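The policy above corresponds to a handful of apt configuration directives. The sketch below writes them into a target directory. It is illustrative only: the helper name `write_auto_upgrade_conf` and the file name `52unattended-upgrades-local` are assumptions, and the files the playbook actually generates may differ. The directive names are those used by apt and unattended-upgrades on Debian.

```bash
# Hypothetical helper: write apt configuration matching the policy above.
# Usage: write_auto_upgrade_conf <apt.conf.d directory>
write_auto_upgrade_conf() {
  local dir="$1"
  # Daily package list refresh and daily unattended-upgrade run.
  cat > "${dir}/20auto-upgrades" <<'EOF'
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
EOF
  # Local overrides: clean up old kernels, never reboot automatically.
  cat > "${dir}/52unattended-upgrades-local" <<'EOF'
Unattended-Upgrade::Remove-Unused-Kernel-Packages "true";
Unattended-Upgrade::Automatic-Reboot "false";
EOF
}
```

On a VM the target directory would be `/etc/apt/apt.conf.d`, and `sudo unattended-upgrade --dry-run --debug` shows what the resulting configuration would do without installing anything.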
### Audit and Monitoring
- **auditd**: System call auditing enabled
- **aide**: File integrity monitoring installed
- **chrony**: Time synchronization configured
- **Logging**: All cloud-init output logged
### Compliance Features
Aligned with CLAUDE.md security requirements:
- ✅ Principle of least privilege
- ✅ Encryption in transit (SSH)
- ✅ Key-based authentication
- ✅ Automated security updates
- ✅ System auditing enabled
- ✅ Time synchronization
- ✅ Firewall enabled by default
## Post-Deployment
### Adding to Inventory
Update your Ansible inventory:
```yaml
# inventories/development/hosts.yml
kvm_guests:
children:
application_servers:
hosts:
debian12-guest:
ansible_host: 192.168.122.X
ansible_user: ansible
ansible_ssh_common_args: '-o ProxyJump=grokbox -o StrictHostKeyChecking=accept-new'
ansible_python_interpreter: /usr/bin/python3
host_description: "Application Server - Debian 12"
host_role: application
host_type: virtual_machine
hypervisor: grokbox
vm_vcpus: 2
vm_memory_mb: 2048
autostart: true
```
### Initial Access
```bash
# Get VM IP address
ssh grokbox "virsh domifaddr debian12-guest"
# SSH to VM via ProxyJump
ssh -J grokbox ansible@192.168.122.X
# Or add to ~/.ssh/config
Host debian12-guest
HostName 192.168.122.X
User ansible
ProxyJump grokbox
StrictHostKeyChecking accept-new
```
### Configuration Management
Run additional roles or playbooks:
```bash
# Example: Configure web server
ansible-playbook -i inventories/development/hosts.yml \
playbooks/configure-webserver.yml \
-l debian12-guest
# Example: Security hardening
ansible-playbook -i inventories/development/hosts.yml \
playbooks/security-hardening.yml \
-l debian12-guest
```
### VM Management Commands
```bash
# Start VM
virsh start debian12-guest
# Shutdown VM gracefully
virsh shutdown debian12-guest
# Force shutdown
virsh destroy debian12-guest
# Reboot VM
virsh reboot debian12-guest
# Enable autostart
virsh autostart debian12-guest
# Disable autostart
virsh autostart debian12-guest --disable
# VM status
virsh dominfo debian12-guest
# VM resource usage
virsh domstats debian12-guest
# Console access
virsh console debian12-guest
```
## Troubleshooting
### Common Issues
#### 1. VM Already Exists
**Error**: VM with name already exists
**Solution**:
```bash
# Check existing VMs
virsh list --all
# Remove existing VM
virsh destroy debian12-guest # if running
virsh undefine debian12-guest --remove-all-storage
```
#### 2. Image Download Fails
**Error**: Failed to download cloud image
**Causes**:
- Network connectivity issues
- Proxy configuration
- DNS resolution problems
**Solution**:
```bash
# Test connectivity
curl -I https://cloud.debian.org
# Manual download
cd /var/lib/libvirt/images
wget https://cloud.debian.org/images/cloud/bookworm/latest/debian-12-generic-amd64.qcow2
# Re-run playbook
ansible-playbook plays/deploy-debian12-vm.yml -t deploy
```
#### 3. VM Won't Get IP Address
**Error**: IP address not assigned after 10 retries
**Causes**:
- DHCP server not running
- Network misconfiguration
- VM network interface issues
**Solution**:
```bash
# Check libvirt network
virsh net-list --all
virsh net-info default
virsh net-start default # if not started
# Check VM network interface
virsh domiflist debian12-guest
# Check DHCP leases
virsh net-dhcp-leases default
# Access console to troubleshoot
virsh console debian12-guest
# Check: ip addr, systemctl status networking
```
#### 4. SSH Connection Failed
**Error**: SSH connection timeout or refused
**Causes**:
- SSH service not started
- Firewall blocking
- Wrong IP address
- cloud-init not completed
**Solution**:
```bash
# Verify VM is running
virsh list
# Check cloud-init status via console
virsh console debian12-guest
# Run: cloud-init status --wait
# Check SSH service
# Via console: systemctl status ssh
# Check firewall
# Via console: ufw status
# Verify SSH key
ssh-add -l
```
#### 5. Insufficient Resources
**Error**: Failed to allocate memory or storage
**Solution**:
```bash
# Check available resources
free -h
df -h /var/lib/libvirt/images/
# Adjust VM resources
ansible-playbook plays/deploy-debian12-vm.yml \
-e "vm_memory_mb=1024" \
-e "vm_disk_size_gb=10"
```
### Debug Mode
Enable verbose logging:
```bash
# Ansible verbose mode
ansible-playbook plays/deploy-debian12-vm.yml -vvv
# Check cloud-init logs on VM
virsh console debian12-guest
# Then: tail -f /var/log/cloud-init-output.log
# Check libvirt logs
journalctl -u libvirtd -f
```
### Health Checks
```bash
# Verify VM health
virsh dominfo debian12-guest
virsh domstats debian12-guest
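# Network connectivity (run on grokbox; -c 3 stops after three probes)
ping -c 3 "$(virsh domifaddr debian12-guest | grep -oP '(\d{1,3}\.){3}\d{1,3}' | head -1)"
# SSH connectivity (run from the control node; virsh is queried on grokbox)
ssh -J grokbox "ansible@$(ssh grokbox "virsh domifaddr debian12-guest" | grep -oP '(\d{1,3}\.){3}\d{1,3}' | head -1)" "echo 'VM is accessible'"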
```
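The address lookup used in these checks can be factored into a small filter that reads `virsh domifaddr` output and prints only the first IPv4 address. This is a sketch; the helper name `first_ipv4` is hypothetical, and it assumes the usual four-column `domifaddr` layout (name, MAC address, protocol, address with prefix length).

```bash
# Hypothetical helper: print the first IPv4 address found in
# `virsh domifaddr` output supplied on stdin.
first_ipv4() {
  # Data rows have the protocol in column 3 and "address/prefix" in column 4.
  # Header and separator rows never have "ipv4" in column 3, so they are skipped.
  awk '$3 == "ipv4" { split($4, addr, "/"); print addr[1]; exit }'
}
```

From the control node this could be used as `ssh grokbox "virsh domifaddr debian12-guest" | first_ipv4`.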
## Maintenance
### Updating the Base Image
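Periodically update the cached Debian cloud image. Existing VM disks are copy-on-write overlays that use this file as their backing image, so removing or replacing it breaks those VMs unless their disks have first been flattened or rebased: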
```bash
# Remove old image
ssh grokbox "rm /var/lib/libvirt/images/debian-12-generic-amd64.qcow2"
# Download latest
ansible-playbook plays/deploy-debian12-vm.yml -t download,verify
```
### VM Snapshots
Create snapshots before major changes:
```bash
# Create snapshot
virsh snapshot-create-as debian12-guest \
snapshot1 \
"Before application deployment"
# List snapshots
virsh snapshot-list debian12-guest
# Revert to snapshot
virsh snapshot-revert debian12-guest snapshot1
# Delete snapshot
virsh snapshot-delete debian12-guest snapshot1
```
### Backup and Restore
#### Backup VM
```bash
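# Stop VM and wait until it is fully shut off (virsh shutdown returns immediately)
virsh shutdown debian12-guest
until virsh domstate debian12-guest | grep -q 'shut off'; do sleep 2; done
# Backup disk (overlay only; it still depends on the base cloud image)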
cp /var/lib/libvirt/images/debian12-guest.qcow2 \
/backup/debian12-guest-$(date +%Y%m%d).qcow2
# Backup XML config
virsh dumpxml debian12-guest > /backup/debian12-guest.xml
# Start VM
virsh start debian12-guest
```
#### Restore VM
```bash
# Copy disk back
cp /backup/debian12-guest-20241110.qcow2 \
/var/lib/libvirt/images/debian12-guest.qcow2
# Define VM from XML
virsh define /backup/debian12-guest.xml
# Start VM
virsh start debian12-guest
```
### Resize VM Disk
```bash
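# Shutdown VM and wait until it is fully shut off (virsh shutdown returns immediately)
virsh shutdown debian12-guest
until virsh domstate debian12-guest | grep -q 'shut off'; do sleep 2; done
# Resize disk
qemu-img resize /var/lib/libvirt/images/debian12-guest.qcow2 +10G
# Start VM
virsh start debian12-guest
# On VM (as root): resize partition and filesystem
growpart /dev/vda 1
resize2fs /dev/vda1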
```
### Resource Adjustment
Modify VM resources:
```bash
# Set maximum memory (requires shutdown)
virsh setmaxmem debian12-guest 4194304 --config
# Set current memory (can be done live)
virsh setmem debian12-guest 4194304
# Set vCPUs (requires shutdown)
virsh setvcpus debian12-guest 4 --config --maximum
virsh setvcpus debian12-guest 4 --config
```
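`virsh setmem` and `virsh setmaxmem` interpret a bare number as KiB, which is why 4 GiB appears above as 4194304. A small conversion helper avoids doing that arithmetic by hand; the name `gib_to_kib` is hypothetical.

```bash
# Hypothetical helper: convert whole GiB to the KiB value virsh expects.
# Usage: gib_to_kib <gib>
gib_to_kib() {
  # 1 GiB = 1024 MiB = 1024 * 1024 KiB
  echo $(( $1 * 1024 * 1024 ))
}
```

For example, `virsh setmaxmem debian12-guest "$(gib_to_kib 4)" --config` is equivalent to the first command in the block above.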
## Best Practices
1. **Naming Convention**: Use descriptive VM names indicating purpose
2. **Resource Planning**: Right-size VMs to avoid waste
3. **Documentation**: Document VM purpose and configuration
4. **Monitoring**: Set up monitoring for critical VMs
5. **Backups**: Regular backups of important VMs
6. **Updates**: Keep VMs updated with security patches
7. **Inventory**: Maintain accurate Ansible inventory
8. **Tags**: Use libvirt domain metadata (title and description) for organization
9. **Networking**: Use appropriate network isolation
10. **Testing**: Test deployment process in development first
## References
- [CLAUDE.md](../CLAUDE.md) - Infrastructure guidelines
- [Cheatsheet](../cheatsheets/deploy-debian12-vm.md) - Quick reference
- [Debian Cloud Images](https://cloud.debian.org/images/cloud/)
- [cloud-init Documentation](https://cloudinit.readthedocs.io/)
- [libvirt Documentation](https://libvirt.org/docs.html)
- [virt-install man page](https://linux.die.net/man/1/virt-install)
## Support and Contact
For issues or questions:
1. Check troubleshooting section above
2. Review cloud-init logs: `/var/log/cloud-init.log`
3. Review libvirt logs: `journalctl -u libvirtd`
4. Consult Ansible playbook: `plays/deploy-debian12-vm.yml`
---
**Document Version**: 1.0
**Last Updated**: 2025-11-10
**Maintained By**: Ansible Infrastructure Team