Add comprehensive documentation

- Add linux-vm-deployment.md with complete deployment guide
- Architecture overview and security model
- Supported distributions matrix
- LVM partitioning specifications
- Distribution-specific configurations
- Troubleshooting procedures
- Performance tuning guidelines
docs/linux-vm-deployment.md
@@ -0,0 +1,944 @@
# Multi-Distribution Linux VM Deployment Documentation

## Overview

This document describes the automated deployment of multiple Linux distributions on KVM/libvirt hypervisors. It covers the major server distributions: Debian, Ubuntu, RHEL, CentOS Stream, Rocky Linux, AlmaLinux, SLES, and openSUSE Leap.

## Table of Contents

1. [Supported Distributions](#supported-distributions)
2. [Architecture](#architecture)
3. [Prerequisites](#prerequisites)
4. [Cloud Image Sources](#cloud-image-sources)
5. [Deployment Process](#deployment-process)
6. [Distribution-Specific Configuration](#distribution-specific-configuration)
7. [Security Features](#security-features)
8. [Post-Deployment](#post-deployment)
9. [Troubleshooting](#troubleshooting)
10. [Best Practices](#best-practices)
11. [References](#references)

## Supported Distributions

### Debian Family

| Distribution | Version | Package Manager | Firewall | Cloud Image |
|--------------|---------|-----------------|----------|-------------|
| Debian | 11 (Bullseye) | apt | ufw | ✅ Auto-download |
| Debian | 12 (Bookworm) | apt | ufw | ✅ Auto-download |
| Ubuntu | 20.04 LTS (Focal) | apt | ufw | ✅ Auto-download |
| Ubuntu | 22.04 LTS (Jammy) | apt | ufw | ✅ Auto-download |
| Ubuntu | 24.04 LTS (Noble) | apt | ufw | ✅ Auto-download |

### RHEL Family

| Distribution | Version | Package Manager | Firewall | SELinux | Cloud Image |
|--------------|---------|-----------------|----------|---------|-------------|
| RHEL | 8, 9 | dnf | firewalld | Enforcing | ⚠️ Manual download |
| CentOS Stream | 8, 9 | dnf | firewalld | Enforcing | ✅ Auto-download |
| Rocky Linux | 8, 9 | dnf | firewalld | Enforcing | ✅ Auto-download |
| AlmaLinux | 8, 9 | dnf | firewalld | Enforcing | ✅ Auto-download |

### SUSE Family

| Distribution | Version | Package Manager | Firewall | Cloud Image |
|--------------|---------|-----------------|----------|-------------|
| SLES | 15 | zypper | firewalld | ⚠️ Manual download |
| openSUSE Leap | 15.5, 15.6 | zypper | firewalld | ✅ Auto-download |

**Legend:**
- ✅ = Automatically downloaded from official repositories
- ⚠️ = Requires subscription and manual download
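
The three tables above amount to a lookup from distribution identifier to OS family, package manager, and firewall. The following is a minimal sketch of such a lookup, assuming identifiers of the form `<distro>-<version>`. The function name `distro_profile` and the `rhel-*` and `sles-*` prefixes are illustrative assumptions, not taken from the playbook.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the playbook): maps a distribution
# identifier to "family package_manager firewall" as listed in the tables.
distro_profile() {
  case "$1" in
    debian-*|ubuntu-*)
      echo "debian apt ufw" ;;
    rhel-*|centos-stream-*|rocky-*|almalinux-*)
      echo "rhel dnf firewalld" ;;
    sles-*|opensuse-leap-*)
      echo "suse zypper firewalld" ;;
    *)
      echo "unsupported distribution: $1" >&2
      return 1 ;;
  esac
}

# Example: read the three fields into separate variables.
read -r family pkg_mgr firewall < <(distro_profile "rocky-9")
echo "family=$family package_manager=$pkg_mgr firewall=$firewall"
```

Keeping the mapping in one `case` statement means an unsupported identifier fails early with a non-zero exit status instead of falling through to a default family.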

## Architecture

### Multi-Distribution Support Design

```
┌─────────────────────────────────────────────────────────┐
│                  Ansible Control Node                   │
│                                                         │
│  ┌────────────────────────────────────────────────┐     │
│  │              deploy-linux-vm.yml               │     │
│  │                                                │     │
│  │  • Distribution Selection Logic                │     │
│  │  • Cloud Image Repository Map                  │     │
│  │  • OS Family Detection                         │     │
│  │  • Package Manager Adaptation                  │     │
│  └────────────────────────────────────────────────┘     │
└─────────────────────────────────────────────────────────┘
                             │
                             │ SSH
                             ▼
┌─────────────────────────────────────────────────────────┐
│                KVM Hypervisor (grokbox)                 │
│                                                         │
│  ┌──────────────────────────────────────────────┐       │
│  │ Cloud Image Cache                            │       │
│  │ /var/lib/libvirt/images/                     │       │
│  │  ├─ debian-12-*.qcow2                        │       │
│  │  ├─ ubuntu-22.04-*.img                       │       │
│  │  ├─ centos-stream-9-*.qcow2                  │       │
│  │  ├─ rocky-9-*.qcow2                          │       │
│  │  └─ almalinux-9-*.qcow2                      │       │
│  └──────────────────────────────────────────────┘       │
│                                                         │
│  ┌──────────────────────────────────────────────┐       │
│  │ libvirt/QEMU                                 │       │
│  │                                              │       │
│  │  ┌──────────────┐    ┌──────────────┐        │       │
│  │  │ Debian VM    │    │ Ubuntu VM    │        │       │
│  │  │ ufw enabled  │    │ ufw enabled  │        │       │
│  │  └──────────────┘    └──────────────┘        │       │
│  │                                              │       │
│  │  ┌──────────────┐    ┌──────────────┐        │       │
│  │  │ Rocky VM     │    │ Alma VM      │        │       │
│  │  │ SELinux=Enf  │    │ SELinux=Enf  │        │       │
│  │  │ firewalld    │    │ firewalld    │        │       │
│  │  └──────────────┘    └──────────────┘        │       │
│  └──────────────────────────────────────────────┘       │
└─────────────────────────────────────────────────────────┘
```

### Deployment Workflow

```
[Start]
   │
   ▼
[Validate Distribution Selection]
   │
   ├─ Check distribution in supported list
   ├─ Set distribution facts (family, package manager, etc.)
   └─ Display deployment configuration
   │
   ▼
[Pre-flight Checks]
   │
   ├─ Verify VM doesn't already exist
   ├─ Validate virtualization support
   └─ Install required packages on hypervisor
   │
   ▼
[Download Cloud Image]
   │
   ├─ Check if image cached
   ├─ Download from official repository
   └─ Verify checksum (SHA256/SHA512)
   │
   ▼
[Create VM Storage]
   │
   ├─ Create qcow2 disk (CoW from base image)
   └─ Set proper permissions (libvirt-qemu/qemu)
   │
   ▼
[Generate Cloud-Init Configuration]
   │
   ├─ Select template based on OS family:
   │    ├─ Debian/Ubuntu → apt, ufw, unattended-upgrades
   │    ├─ RHEL family → dnf, firewalld, SELinux, dnf-automatic
   │    └─ SUSE family → zypper, firewalld
   │
   ├─ Create meta-data (hostname, instance-id)
   ├─ Create user-data (users, packages, security)
   └─ Generate cloud-init ISO
   │
   ▼
[Deploy VM]
   │
   ├─ Run virt-install with appropriate os-variant
   ├─ Attach disk and cloud-init ISO
   └─ Start VM
   │
   ▼
[Wait for Boot]
   │
   ├─ VM boots from qcow2 disk
   ├─ Cloud-init runs configuration
   ├─ Network configured via DHCP
   └─ Get IP address from libvirt
   │
   ▼
[Validation]
   │
   ├─ Test SSH connectivity
   ├─ Verify cloud-init completion
   ├─ Display VM information
   └─ System health checks
   │
   ▼
[Cleanup]
   │
   └─ Remove temporary files
   │
   ▼
[Complete]
```
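
The storage, cloud-init ISO, and deploy steps of the workflow can be illustrated with the commands they typically map to. This is a dry-run sketch that only prints the commands. The playbook's actual tasks are not shown in this document, so the function name `plan_vm`, the file names, the fixed memory and vCPU values, and the `debian12` os-variant are all assumptions (valid os-variant names can be listed with `osinfo-query os`).

```bash
#!/usr/bin/env bash
# Dry-run sketch: prints one command per workflow step instead of running it.
# Paths, file names, and sizing values are illustrative assumptions.
plan_vm() {
  local name="$1" base_image="$2" size_gb="$3" os_variant="$4"
  local images=/var/lib/libvirt/images
  local disk="$images/$name.qcow2" seed="$images/$name-seed.iso"

  # [Create VM Storage] copy-on-write overlay backed by the cached cloud image.
  echo "qemu-img create -f qcow2 -F qcow2 -b $base_image $disk ${size_gb}G"

  # [Generate Cloud-Init Configuration] NoCloud seed ISO; the volume label
  # must be 'cidata' for cloud-init to pick it up.
  echo "genisoimage -output $seed -volid cidata -joliet -rock user-data meta-data"

  # [Deploy VM] import the existing disk rather than running an installer.
  echo "virt-install --name $name --memory 2048 --vcpus 2" \
       "--disk path=$disk,format=qcow2,bus=virtio" \
       "--disk path=$seed,device=cdrom" \
       "--network network=default,model=virtio" \
       "--os-variant $os_variant --import --graphics none --noautoconsole"
}

# Hypothetical base image name, for illustration only.
plan_vm demo-vm /var/lib/libvirt/images/debian-12-base.qcow2 20 debian12
```

Printing the commands first makes it easy to review the plan on the hypervisor before running anything that creates disks or domains.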

## Prerequisites

### Hypervisor Requirements

**Hardware:**
- CPU with virtualization extensions (Intel VT-x or AMD-V)
- Sufficient RAM for host + guest VMs
- Adequate storage for cloud images and VM disks

**Software:**

**For Debian/Ubuntu hypervisors:**
```bash
apt install -y \
  libvirt-daemon-system \
  libvirt-clients \
  virtinst \
  qemu-kvm \
  qemu-utils \
  cloud-image-utils \
  genisoimage \
  python3-libvirt
```

**For RHEL/CentOS/Rocky/Alma hypervisors:**
```bash
dnf install -y \
  libvirt \
  libvirt-client \
  virt-install \
  qemu-kvm \
  qemu-img \
  cloud-utils \
  genisoimage \
  python3-libvirt
```

**Services:**
```bash
systemctl enable --now libvirtd
```

### Network Requirements

- Internet connectivity for cloud image downloads
- DNS resolution working
- libvirt default network active:
  ```bash
  virsh net-list
  virsh net-start default  # if not started
  virsh net-autostart default
  ```

### Ansible Control Node

- Ansible 2.9 or newer
- SSH access to hypervisor
- Python 3.x installed

## Cloud Image Sources

### Official Repositories

**Debian:**
- URL: https://cloud.debian.org/images/cloud/
- Format: qcow2
- Checksum: SHA512SUMS provided
- Update Frequency: Regular (stable releases)

**Ubuntu:**
- URL: https://cloud-images.ubuntu.com/
- Format: img (qcow2 compatible)
- Checksum: SHA256SUMS provided
- Update Frequency: Daily builds available

**CentOS Stream:**
- URL: https://cloud.centos.org/centos/
- Format: qcow2
- Checksum: SHA256 CHECKSUM file
- Update Frequency: Regular updates

**Rocky Linux:**
- URL: https://download.rockylinux.org/pub/rocky/
- Format: qcow2
- Checksum: SHA256 CHECKSUM file
- Update Frequency: Regular with point releases

**AlmaLinux:**
- URL: https://repo.almalinux.org/almalinux/
- Format: qcow2
- Checksum: SHA256 CHECKSUM file
- Update Frequency: Regular with point releases

**openSUSE Leap:**
- URL: https://download.opensuse.org/distribution/leap/
- Format: qcow2
- Checksum: SHA256 per-file
- Update Frequency: Per release cycle

### Manual Download Required

**Red Hat Enterprise Linux (RHEL):**
- Requires: Red Hat subscription
- Portal: https://access.redhat.com/downloads/
- Steps:
  1. Log in to Red Hat Customer Portal
  2. Navigate to Downloads
  3. Select "Red Hat Enterprise Linux"
  4. Download KVM Guest Image
  5. Place at: `/var/lib/libvirt/images/rhel-X-x86_64-kvm.qcow2`

**SUSE Linux Enterprise Server (SLES):**
- Requires: SUSE subscription
- Portal: https://scc.suse.com/
- Steps:
  1. Log in to SUSE Customer Center
  2. Download cloud image for SLES 15
  3. Place at: `/var/lib/libvirt/images/sles-15-genericcloud-amd64.qcow2`

## Deployment Process

### Basic Deployment

```bash
ansible-playbook plays/deploy-linux-vm.yml \
  -e "os_distribution=<distro-version>" \
  -e "vm_name=<name>"
```

### Distribution Selection

The `os_distribution` variable determines which Linux distribution to deploy. Format: `distro-version`

**Examples:**
```bash
# Debian
-e "os_distribution=debian-12"

# Ubuntu
-e "os_distribution=ubuntu-22.04"

# CentOS Stream
-e "os_distribution=centos-stream-9"

# Rocky Linux
-e "os_distribution=rocky-9"

# AlmaLinux
-e "os_distribution=almalinux-9"

# openSUSE
-e "os_distribution=opensuse-leap-15.6"
```
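
Because some distribution names contain hyphens themselves (`centos-stream-9`, `opensuse-leap-15.6`), the identifier has to be split at the last hyphen rather than the first. A minimal sketch of that split follows. The function name `split_distribution` and the validation pattern are assumptions for illustration, since the playbook's own parsing is not shown here.

```bash
#!/usr/bin/env bash
# Hypothetical helper: splits '<distro>-<version>' at the LAST hyphen so that
# multi-word names such as 'centos-stream-9' keep their full distro part.
split_distribution() {
  local id="$1"
  # Lowercase name, a hyphen, then a version starting with a digit.
  local re='^[a-z][a-z-]*-[0-9][0-9.]*$'
  if [[ ! $id =~ $re ]]; then
    echo "expected <distro>-<version>, got: $id" >&2
    return 1
  fi
  # ${id%-*} drops the shortest '-...' suffix; ${id##*-} keeps what follows
  # the last hyphen.
  echo "${id%-*} ${id##*-}"
}

split_distribution "opensuse-leap-15.6"
split_distribution "debian" || echo "rejected: version is missing"
```

The second call shows why a bare `debian` is not a valid value: there is no version component to select an image.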

### Resource Customization

```bash
ansible-playbook plays/deploy-linux-vm.yml \
  -e "os_distribution=rocky-9" \
  -e "vm_name=database-server" \
  -e "vm_hostname=db01" \
  -e "vm_domain=production.local" \
  -e "vm_vcpus=8" \
  -e "vm_memory_mb=16384" \
  -e "vm_disk_size_gb=200"
```

### Configuration Variables

| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| `os_distribution` | String | debian-12 | Distribution identifier in `distro-version` form; set it explicitly for every deployment |
| `vm_name` | String | linux-guest | VM name in libvirt |
| `vm_hostname` | String | linux-vm | Guest hostname |
| `vm_domain` | String | localdomain | DNS domain |
| `vm_vcpus` | Integer | 2 | Number of virtual CPUs |
| `vm_memory_mb` | Integer | 2048 | RAM in megabytes |
| `vm_disk_size_gb` | Integer | 20 | Disk size in gigabytes |
| `vm_network` | String | default | Libvirt network name |
| `vm_bridge` | String | virbr0 | Bridge interface |
| `ansible_user_ssh_key` | String | (preset) | SSH public key for ansible user |

## Distribution-Specific Configuration

### Debian/Ubuntu Systems

**Package Manager:** apt

**Cloud-Init Packages:**
- sudo, vim, htop, tmux, curl, wget, rsync, git
- python3, python3-pip, jq, bc
- aide, auditd, chrony, ufw, lvm2
- cloud-guest-utils, parted
- unattended-upgrades, apt-listchanges

**Security Configuration:**
- **Firewall:** ufw enabled, SSH allowed
- **SSH:** Root login disabled, key-only auth
- **Updates:** unattended-upgrades for security updates
- **Audit:** auditd enabled
- **Time Sync:** chrony configured

**User Management:**
- ansible user → member of `sudo` group
- Passwordless sudo access

**Automatic Updates Configuration:**
```
/etc/apt/apt.conf.d/50unattended-upgrades
/etc/apt/apt.conf.d/20auto-upgrades
```

**Post-Boot Commands:**
```bash
systemctl enable ssh && systemctl restart ssh
systemctl enable chrony && systemctl start chrony
# Allow SSH before enabling the firewall so remote access is never cut off
ufw allow ssh && ufw --force enable
systemctl enable auditd && systemctl start auditd
growpart /dev/vda 1 && resize2fs /dev/vda1
```

### RHEL Family Systems

**Package Manager:** dnf

**Cloud-Init Packages:**
- sudo, vim, htop, tmux, curl, wget, rsync, git
- python3, python3-pip, jq, bc
- aide, audit, chrony, firewalld, lvm2
- cloud-utils-growpart, gdisk
- dnf-automatic
- policycoreutils-python-utils

**Security Configuration:**
- **Firewall:** firewalld enabled, SSH service allowed
- **SELinux:** Enforcing mode
- **SSH:** Root login disabled, key-only auth
- **Updates:** dnf-automatic for security updates
- **Audit:** auditd enabled
- **Time Sync:** chronyd configured

**User Management:**
- ansible user → member of `wheel` group
- Passwordless sudo access

**SELinux Configuration:**
```bash
setenforce 1
sed -i 's/^SELINUX=.*/SELINUX=enforcing/' /etc/selinux/config
```

**Automatic Updates Configuration:**
```
# /etc/dnf/automatic.conf
[commands]
upgrade_type = security
apply_updates = yes
```

**Post-Boot Commands:**
```bash
systemctl enable sshd && systemctl restart sshd
systemctl enable chronyd && systemctl start chronyd
systemctl enable firewalld && systemctl start firewalld
firewall-cmd --permanent --add-service=ssh && firewall-cmd --reload
systemctl enable auditd && systemctl start auditd
systemctl enable dnf-automatic.timer && systemctl start dnf-automatic.timer
setenforce 1
growpart /dev/vda 1 && xfs_growfs /
```

### SUSE Family Systems

**Package Manager:** zypper

**Cloud-Init Packages:**
- sudo, vim, htop, tmux, curl, wget, rsync, git
- python3, python3-pip, jq, bc
- aide, audit, chrony, firewalld, lvm2
- cloud-utils-growpart, gdisk

**Security Configuration:**
- **Firewall:** firewalld enabled, SSH service allowed
- **SSH:** Root login disabled, key-only auth
- **Audit:** auditd enabled
- **Time Sync:** chronyd configured

**User Management:**
- ansible user → member of `wheel` group
- Passwordless sudo access

**Post-Boot Commands:**
```bash
systemctl enable sshd && systemctl restart sshd
systemctl enable chronyd && systemctl start chronyd
systemctl enable firewalld && systemctl start firewalld
firewall-cmd --permanent --add-service=ssh && firewall-cmd --reload
systemctl enable auditd && systemctl start auditd
# Grow the partition, then grow whichever filesystem is in use (xfs, ext4, or btrfs)
growpart /dev/vda 1 && { xfs_growfs / || resize2fs /dev/vda1 || btrfs filesystem resize max /; }
```

## Security Features

### Universal Security Measures

All deployed VMs, regardless of distribution, include:

1. **User Security:**
   - Dedicated `ansible` service account
   - SSH key-based authentication only
   - Passwordless sudo (with logging)
   - Root SSH login disabled
   - Emergency console access available (password: ChangeMe123!)

2. **Network Security:**
   - Host-based firewall enabled and configured
   - SSH service allowed
   - Default deny policy for incoming traffic
   - Outgoing traffic allowed

3. **System Security:**
   - Audit daemon (auditd) enabled
   - Automatic security updates configured
   - Time synchronization enabled (chrony)
   - File integrity monitoring installed (AIDE)
   - Secure SSH configuration applied

4. **SSH Hardening:**
   ```
   PermitRootLogin no
   PasswordAuthentication no
   PubkeyAuthentication yes
   MaxAuthTries 3
   MaxSessions 10
   ClientAliveInterval 300
   ClientAliveCountMax 2
   ```
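
The SSH hardening values above can be checked against the effective configuration that `sshd -T` prints as lowercase `keyword value` lines. The following is a minimal sketch of such a check. The function name `check_sshd_hardening` and the `MISMATCH`/`MISSING` report format are assumptions for illustration, not part of the deployment.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads 'sshd -T' style output on stdin and reports any
# hardening setting that is missing or differs from the documented value.
# Exit status is 0 when all seven settings match, 1 otherwise.
check_sshd_hardening() {
  awk '
    BEGIN {
      bad = 0
      want["permitrootlogin"] = "no"
      want["passwordauthentication"] = "no"
      want["pubkeyauthentication"] = "yes"
      want["maxauthtries"] = "3"
      want["maxsessions"] = "10"
      want["clientaliveinterval"] = "300"
      want["clientalivecountmax"] = "2"
    }
    ($1 in want) {
      seen[$1] = 1
      if ($2 != want[$1]) { print "MISMATCH " $1 "=" $2; bad = 1 }
    }
    END {
      for (k in want) if (!(k in seen)) { print "MISSING " k; bad = 1 }
      exit bad
    }
  '
}

# On a deployed VM (sshd -T needs root):
#   sudo sshd -T | check_sshd_hardening
```

Checking the effective configuration rather than grepping `sshd_config` also catches settings overridden by drop-in files.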

### Distribution-Specific Security

**RHEL Family:**
- SELinux in enforcing mode
- firewalld with rich rules support
- dnf-automatic for security updates
- Subscription management for certified packages (RHEL)

**Debian/Ubuntu:**
- AppArmor profiles (Ubuntu)
- UFW for simplified firewall management
- unattended-upgrades for security updates
- Automatic security patch installation

**SUSE Family:**
- AppArmor support
- firewalld with zones
- YaST integration for security management

### Compliance Alignment

The deployment follows CLAUDE.md security principles:

- ✅ Principle of least privilege
- ✅ Encryption in transit (SSH)
- ✅ Key-based authentication
- ✅ Automated security updates
- ✅ System auditing enabled
- ✅ Time synchronization
- ✅ Firewall enabled by default
- ✅ Regular security patching

## Post-Deployment

### Adding to Ansible Inventory

**Debian/Ubuntu VM:**
```yaml
debian_servers:
  hosts:
    debian12-vm:
      ansible_host: 192.168.122.X
      ansible_user: ansible
      ansible_ssh_common_args: '-o ProxyJump=grokbox -o StrictHostKeyChecking=accept-new'
      os_distribution: debian-12
      os_family: debian
      package_manager: apt
```

**RHEL Family VM:**
```yaml
rhel_servers:
  hosts:
    rocky9-vm:
      ansible_host: 192.168.122.X
      ansible_user: ansible
      ansible_ssh_common_args: '-o ProxyJump=grokbox -o StrictHostKeyChecking=accept-new'
      os_distribution: rocky-9
      os_family: rhel
      package_manager: dnf
      selinux_mode: enforcing
```

### Initial Configuration

After deployment, run configuration management:

```bash
# Update system packages
ansible <vm_name> -m package -a "name=* state=latest" -b

# Install additional packages
ansible <vm_name> -m package -a "name=nginx state=present" -b

# Run configuration playbooks
ansible-playbook -i inventories/development/hosts.yml \
  playbooks/configure-webserver.yml \
  -l <vm_name>
```

### Verification Steps

1. **SSH Access:**
   ```bash
   ssh -J grokbox ansible@<VM_IP>
   ```

2. **Cloud-Init Status:**
   ```bash
   cloud-init status --wait
   cloud-init status --long
   ```

3. **System Information:**
   ```bash
   cat /etc/os-release
   uname -r
   ```

4. **Security Checks:**
   ```bash
   # Firewall
   sudo ufw status verbose        # Debian/Ubuntu
   sudo firewall-cmd --list-all   # RHEL/SUSE

   # SELinux (RHEL family)
   getenforce

   # Audit daemon
   sudo systemctl status auditd

   # Automatic updates
   sudo systemctl status unattended-upgrades   # Debian/Ubuntu
   sudo systemctl status dnf-automatic.timer   # RHEL
   ```

5. **Disk and Memory:**
   ```bash
   df -h
   free -h
   lsblk
   ```

## Troubleshooting

### Distribution Selection Issues

**Problem:** Invalid distribution error

**Solution:**
```bash
# List supported distributions
grep "^ " plays/deploy-linux-vm.yml | grep -E "debian-|ubuntu-|rhel-|centos-|rocky-|alma|sles-|opensuse"

# Use exact distribution identifier
-e "os_distribution=debian-12"  # Correct
-e "os_distribution=debian"     # Wrong
```

### Cloud Image Download Failures

**Problem:** Image download fails or times out

**Causes:**
- Network connectivity issues
- Repository temporarily unavailable
- Proxy configuration needed

**Solutions:**
```bash
# Test connectivity
curl -I https://cloud.debian.org/
curl -I https://cloud-images.ubuntu.com/

# Manual download
cd /var/lib/libvirt/images
wget <cloud_image_url>

# Configure proxy (if needed)
export https_proxy=http://proxy:port
```

### Checksum Verification Failures

**Problem:** Checksum verification fails

**Causes:**
- Corrupt download
- Mismatch between image and checksum file
- Wrong checksum type

**Solutions:**
```bash
# Re-download image
rm /var/lib/libvirt/images/<image-name>
ansible-playbook plays/deploy-linux-vm.yml -e "os_distribution=..." -t download

# Verify manually
cd /var/lib/libvirt/images
sha256sum <image-name>   # use sha512sum for Debian images (SHA512SUMS)
# Compare with checksum file
```
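
The manual comparison above can be scripted so that the digest algorithm is chosen explicitly, which avoids the "wrong checksum type" cause. The following is a minimal sketch using the coreutils `sha256sum` and `sha512sum` tools. The function name `verify_checksum` and its argument order are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: compares a file's digest with an expected value.
# The first argument selects the coreutils tool: sha256 -> sha256sum,
# sha512 -> sha512sum. Returns 0 on match, 1 on mismatch, 2 on bad usage.
verify_checksum() {
  local algo="$1" file="$2" expected="$3" actual
  case "$algo" in
    sha256|sha512) ;;
    *) echo "unsupported algorithm: $algo" >&2; return 2 ;;
  esac
  # The digest is the first whitespace-separated field of the tool's output.
  actual="$("${algo}sum" "$file" | awk '{ print $1 }')"
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "FAILED: $file (expected $expected, got $actual)" >&2
    return 1
  fi
}

# Usage (expected value copied from the distribution's checksum file):
#   verify_checksum sha512 <image-name> <expected-digest>
```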

### VM Boot Issues

**Problem:** VM created but won't boot or get IP

**Causes:**
- Cloud-init configuration error
- Network misconfiguration
- Insufficient resources

**Solutions:**
```bash
# Check VM status
virsh list --all
virsh dominfo <vm_name>

# View console
virsh console <vm_name>

# Check cloud-init logs (via console)
tail -f /var/log/cloud-init-output.log
journalctl -u cloud-init

# Restart VM
virsh destroy <vm_name>
virsh start <vm_name>
```

### SSH Connection Issues

**Problem:** Cannot SSH to deployed VM

**Causes:**
- SSH key not configured correctly
- Firewall blocking
- cloud-init not completed
- Wrong IP address

**Solutions:**
```bash
# Verify IP address
virsh domifaddr <vm_name>

# Test connectivity
ping <VM_IP>

# Check SSH service via console
virsh console <vm_name>
# Then: systemctl status ssh    (Debian/Ubuntu)
#   or: systemctl status sshd   (RHEL/SUSE)

# Verify firewall
# Via console:
sudo ufw status               # Debian/Ubuntu
sudo firewall-cmd --list-all  # RHEL/SUSE

# Check cloud-init completion
# Via console:
cloud-init status --wait
```
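
When the IP lookup needs to feed another command, the table printed by `virsh domifaddr` can be reduced to a bare address. The following is a minimal sketch, assuming the usual four-column layout (name, MAC address, protocol, address with prefix length). The function name `first_ipv4` is an assumption for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: extracts the first IPv4 address from 'virsh domifaddr'
# output on stdin and strips the /prefix suffix. Header and separator lines
# are skipped because their third field is not "ipv4".
# Exit status is 1 when no IPv4 lease is listed (e.g. DHCP not finished yet).
first_ipv4() {
  awk '$3 == "ipv4" { sub(/\/.*/, "", $4); print $4; found = 1; exit }
       END { exit !found }'
}

# On the hypervisor:
#   virsh domifaddr <vm_name> | first_ipv4
```

A non-zero exit status is a cheap way to distinguish "no lease yet" from "wrong IP address" before testing SSH.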

### SELinux Issues (RHEL Family)

**Problem:** Services failing due to SELinux denials

**Solutions:**
```bash
# Check SELinux status
getenforce
sestatus

# View denials
sudo ausearch -m avc -ts recent

# Temporarily set to permissive (troubleshooting only)
sudo setenforce 0

# Generate policy from denials
sudo ausearch -m avc -ts recent | audit2allow -M myapp
sudo semodule -i myapp.pp

# Re-enable enforcing
sudo setenforce 1
```

### Package Manager Issues

**Debian/Ubuntu:**
```bash
# Update package cache
sudo apt update

# Fix broken packages
sudo apt --fix-broken install

# Clear cache
sudo apt clean
```

**RHEL Family:**
```bash
# Update metadata
sudo dnf makecache

# Check for problems
sudo dnf check

# Clean cache
sudo dnf clean all
```

**SUSE:**
```bash
# Refresh repositories
sudo zypper refresh

# Verify
sudo zypper verify

# Clean cache
sudo zypper clean
```

## Best Practices

### Distribution Selection

1. **Use LTS versions for production:**
   - Ubuntu 22.04 LTS (support until 2027)
   - Ubuntu 24.04 LTS (support until 2029)
   - RHEL/Rocky/Alma 9 (support until 2032)

2. **Match distribution to workload:**
   - Web servers: Ubuntu, Debian
   - Enterprise applications: RHEL, Rocky Linux, AlmaLinux
   - Container hosts: CentOS Stream, Rocky Linux
   - Development: Ubuntu, Debian, openSUSE

3. **Consider support requirements:**
   - Commercial support: RHEL, SLES
   - Community support: CentOS Stream, Rocky Linux, AlmaLinux, Debian, Ubuntu, openSUSE

### Resource Allocation

**Minimum Requirements:**
- 1 vCPU, 1GB RAM, 10GB disk (testing only)

**Recommended for Production:**
- 2+ vCPUs, 2GB+ RAM, 20GB+ disk

**Workload-Specific:**
```
Web Server:         2-4 vCPUs, 4GB RAM,  40GB disk
Database Server:    4-8 vCPUs, 16GB RAM, 100GB+ disk
Application Server: 4-8 vCPUs, 8GB RAM,  80GB disk
Container Host:     4-8 vCPUs, 16GB RAM, 80GB disk
Development:        2-4 vCPUs, 8GB RAM,  50GB disk
```
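
The workload figures above can be turned into the `-e` overrides that the playbook accepts (`vm_vcpus`, `vm_memory_mb`, `vm_disk_size_gb`). The following is a minimal sketch. The function name `sizing_args`, the profile names, and the choice of the lower bound of each vCPU range as a starting point are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints playbook overrides for a workload profile.
# vCPU counts use the lower bound of each documented range (an assumption);
# memory is converted from GB to the MB unit that vm_memory_mb expects.
sizing_args() {
  local vcpus mem_gb disk_gb
  case "$1" in
    web)       vcpus=2 mem_gb=4  disk_gb=40  ;;
    database)  vcpus=4 mem_gb=16 disk_gb=100 ;;
    app)       vcpus=4 mem_gb=8  disk_gb=80  ;;
    container) vcpus=4 mem_gb=16 disk_gb=80  ;;
    dev)       vcpus=2 mem_gb=8  disk_gb=50  ;;
    *) echo "unknown profile: $1" >&2; return 1 ;;
  esac
  echo "-e vm_vcpus=$vcpus -e vm_memory_mb=$((mem_gb * 1024)) -e vm_disk_size_gb=$disk_gb"
}

sizing_args database

# Possible use with the playbook (word splitting of the output is intended):
#   ansible-playbook plays/deploy-linux-vm.yml \
#     -e "os_distribution=rocky-9" -e "vm_name=db-prod-01" $(sizing_args database)
```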

### Security Hardening

1. **Change default passwords immediately:**
   ```bash
   sudo passwd root  # Change from ChangeMe123!
   ```

2. **Configure proper SSH keys:**
   - Use dedicated key per environment
   - Rotate keys regularly (90-180 days)
   - Use Ed25519 keys when possible

3. **Enable additional security features:**
   - CIS benchmarks scanning
   - Intrusion detection (fail2ban, OSSEC)
   - Log forwarding to SIEM
   - Vulnerability scanning

4. **Regular updates:**
   - Monitor automatic update logs
   - Schedule manual updates for major versions
   - Test updates in staging first

### Operational Excellence

1. **Naming Conventions:**
   - Use descriptive, meaningful VM names
   - Include purpose and environment: `web-prod-01`, `db-dev-01`
   - Document naming scheme

2. **Inventory Management:**
   - Keep Ansible inventory up-to-date
   - Document VM purpose and owner
   - Track VM lifecycle

3. **Monitoring:**
   - Set up monitoring for all VMs
   - Configure alerting for critical issues
   - Monitor resource usage trends

4. **Backup Strategy:**
   - Regular VM backups or disk snapshots
   - Test restore procedures
   - Document backup retention policy

5. **Documentation:**
   - Document VM purpose and configuration
   - Maintain runbooks for common tasks
   - Keep network diagrams current

### Performance Optimization

1. **Disk I/O:**
   - Use virtio drivers (already configured)
   - Consider separate disk for databases
   - Use appropriate filesystem (xfs for RHEL, ext4 for Debian)

2. **Network:**
   - Use virtio network driver (already configured)
   - Consider SR-IOV for high-performance needs
   - Monitor network latency

3. **CPU:**
   - Right-size vCPU allocation
   - Avoid overcommitment on critical VMs
   - Use CPU pinning for performance-critical workloads

4. **Memory:**
   - Allocate sufficient RAM to avoid swapping
   - Monitor memory usage
   - Consider huge pages for databases

## References

- [CLAUDE.md](../CLAUDE.md) - Infrastructure guidelines
- [Cheatsheet](../cheatsheets/deploy-linux-vm.md) - Quick reference
- [Debian Cloud Images](https://cloud.debian.org/images/cloud/)
- [Ubuntu Cloud Images](https://cloud-images.ubuntu.com/)
- [CentOS Stream](https://www.centos.org/centos-stream/)
- [Rocky Linux](https://rockylinux.org/)
- [AlmaLinux](https://almalinux.org/)
- [openSUSE](https://www.opensuse.org/)
- [cloud-init Documentation](https://cloudinit.readthedocs.io/)
- [libvirt Documentation](https://libvirt.org/docs.html)

---

**Document Version**: 1.0
**Last Updated**: 2025-11-10
**Maintained By**: Ansible Infrastructure Team