infra-automation/roles/system_info/README.md
ansible 70b57d223f Add system_info role for comprehensive infrastructure inventory
New role for gathering detailed system information including CPU, GPU,
RAM, disk, network, and hypervisor details with JSON export capabilities.

Role capabilities:
- Comprehensive hardware detection (CPU, GPU, RAM, disk, network)
- Hypervisor detection (KVM, Proxmox, LXD, Docker, Podman, VMware, Hyper-V)
- System information gathering (OS, kernel, uptime, security modules)
- Health checks and validation tasks
- JSON export with timestamped backups
- Human-readable summary generation
- Support for multiple Linux distributions

Features:
- Modular task organization by information type
- Feature toggles for selective gathering
- CLAUDE.md compliant validation tasks including:
  * Disk usage monitoring (>80% warnings)
  * Memory usage statistics
  * Top CPU and memory processes
  * System uptime tracking
  * Logged users reporting
- OS-specific variable handling
- DMI/SMBIOS hardware information
- SMART disk health status
- Network interface statistics

File structure:
roles/system_info/
├── README.md              # Comprehensive documentation
├── defaults/main.yml      # Configurable defaults
├── vars/main.yml          # Role variables
├── meta/main.yml          # Galaxy metadata
├── tasks/
│   ├── main.yml          # Main task coordinator
│   ├── install.yml       # Package installation
│   ├── gather_system.yml # OS and system info
│   ├── gather_cpu.yml    # CPU details
│   ├── gather_gpu.yml    # GPU detection
│   ├── gather_memory.yml # RAM information
│   ├── gather_disk.yml   # Disk and LVM info
│   ├── gather_network.yml # Network configuration
│   ├── detect_hypervisor.yml # Virtualization detection
│   ├── export_stats.yml  # JSON export
│   └── validate.yml      # Health checks (CLAUDE.md compliant)
├── templates/
│   └── summary.txt.j2    # Human-readable summary
├── handlers/
│   └── main.yml          # Service handlers
└── tests/
    └── test.yml          # Basic test playbook

Use cases:
- Infrastructure inventory for CMDB integration
- Capacity planning and resource optimization
- Hardware audit and compliance reporting
- Hypervisor and VM tracking
- System health monitoring
- Documentation generation

Output:
- JSON: ./stats/machines/<fqdn>/system_info.json
- Backup: ./stats/machines/<fqdn>/system_info_<timestamp>.json
- Summary: ./stats/machines/<fqdn>/summary.txt

Requirements:
- Ansible >= 2.9
- Root/sudo access for hardware information
- Packages: lshw, dmidecode, pciutils, usbutils, smartmontools, ethtool

Compliance:
- CLAUDE.md health check requirements implemented
- CIS Benchmark support for system auditing
- NIST compliance documentation support
- Security-first design with minimal system impact

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 01:36:01 +01:00


System Information Gathering Role

Comprehensive Ansible role for gathering detailed system information including CPU, GPU, RAM, disk, network, and hypervisor details. Statistics are exported to JSON files organized by machine FQDN.

Description

This role performs a thorough scan of system hardware and software configurations, collecting detailed metrics and storing them in structured JSON format. It's designed to create a complete inventory of infrastructure resources for documentation, monitoring, and capacity planning purposes.

Requirements

Ansible Version

  • Ansible >= 2.9

OS Compatibility

  • Debian 11 (Bullseye), 12 (Bookworm)
  • Ubuntu 20.04 (Focal), 22.04 (Jammy), 24.04 (Noble)
  • RHEL 8, 9
  • Rocky Linux 8, 9
  • AlmaLinux 8, 9

Dependencies

  • Root/sudo privileges for hardware information gathering
  • Internet access for package installation (if required packages are missing)

Required Packages

The role will automatically install these packages if they're not present:

  • lshw - Hardware lister
  • dmidecode - DMI/SMBIOS information
  • pciutils - PCI utilities (lspci)
  • usbutils - USB utilities
  • smartmontools - SMART disk monitoring
  • ethtool - Network interface information

Role Variables

Main Configuration

| Variable | Default | Description | Required |
|----------|---------|-------------|----------|
| system_info_stats_base_dir | ./stats/machines | Base directory for statistics storage | Yes |
| system_info_create_stats_dir | true | Create stats directory if it doesn't exist | No |
| system_info_timestamp_format | %Y-%m-%d %H:%M:%S UTC | Timestamp format for statistics | No |
| system_info_json_indent | 2 | JSON output indentation | No |

Feature Toggles

| Variable | Default | Description |
|----------|---------|-------------|
| system_info_gather_cpu | true | Gather CPU information |
| system_info_gather_gpu | true | Gather GPU information |
| system_info_gather_memory | true | Gather memory information |
| system_info_gather_disk | true | Gather disk information |
| system_info_gather_network | true | Gather network information |
| system_info_gather_system | true | Gather OS and system information |
| system_info_detect_hypervisor | true | Detect hypervisor capabilities |
| system_info_include_raw_output | false | Include raw command outputs in JSON |

Information Collected

System Information

  • Hostname and FQDN
  • Operating system details (distribution, version, release)
  • Kernel version and architecture
  • System uptime and boot time
  • Hardware manufacturer, model, serial number, UUID
  • Security modules status (SELinux/AppArmor)

CPU Information

  • Model name and vendor
  • Architecture and CPU family
  • Physical CPUs, cores, and vCPUs count
  • Current, maximum, and minimum frequencies
  • CPU cache details (L1, L2, L3)
  • CPU flags and features
  • Virtualization support (Intel VT-x, AMD-V)
  • Current load average
  • CPU vulnerability mitigations
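
On x86 Linux, hardware virtualization support appears as CPU flags in /proc/cpuinfo: vmx for Intel VT-x and svm for AMD-V. The following is a minimal Python sketch of that check, not necessarily how the role implements it; the function name is an illustrative assumption.

```python
def detect_virtualization_support(cpuinfo_text):
    """Return 'Intel VT-x', 'AMD-V', or 'none' from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags = set(value.split())
            if "vmx" in flags:
                return "Intel VT-x"
            if "svm" in flags:
                return "AMD-V"
            # The first "flags" line is representative; all cores normally match.
            return "none"
    # No "flags" line at all (for example on non-x86 systems).
    return "none"
```

To try it on a Linux host, pass the contents of /proc/cpuinfo as the argument. A result of "none" inside a VM usually means nested virtualization is not exposed to the guest.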

GPU Information

  • GPU detection and device listing
  • NVIDIA GPU details (via nvidia-smi)
  • AMD GPU details (via rocm-smi)
  • Intel integrated graphics detection
  • IOMMU/VT-d status for GPU passthrough
  • Detailed PCI information for graphics devices

Memory Information

  • Total, free, used, and available memory
  • Buffers and cached memory
  • Memory usage percentage
  • Physical memory modules count
  • Memory hardware details (type, speed, manufacturer)
  • Swap configuration and usage
  • Memory pressure statistics
  • Huge pages configuration
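
As an illustration of the memory usage percentage, the sketch below derives it from /proc/meminfo-style text in Python. It assumes "used" means MemTotal minus MemAvailable; the role may compute the figure differently, and both function names are hypothetical.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_usage_percent(meminfo):
    """Used memory as a percentage, treating MemAvailable as not used."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return round(100.0 * (total - available) / total, 1)
```

Using MemAvailable rather than MemFree avoids counting reclaimable page cache as used memory, which would otherwise overstate the percentage on long-running hosts.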

Disk Information

  • Disk usage (all filesystems)
  • Block device listing with details
  • LVM configuration (PVs, VGs, LVs)
  • Mount points and filesystem types
  • Software RAID (mdadm) status
  • Hardware RAID controller detection
  • Physical disk listing (SSD vs HDD detection)
  • SMART health status
  • I/O statistics
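
One common way to tell SSDs from HDDs on Linux is the rotational flag under /sys/block/<device>/queue/rotational (1 for spinning disks, 0 otherwise). The Python sketch below shows that approach under the assumption that the role uses something similar; the function name is illustrative. Note that a value of 0 also covers NVMe, loop, and some virtual devices, and virtual disks in a VM may report 1.

```python
from pathlib import Path


def classify_disks(sys_block="/sys/block"):
    """Map each block device name to 'HDD', 'SSD', or 'unknown'."""
    result = {}
    for device in sorted(Path(sys_block).iterdir()):
        flag_file = device / "queue" / "rotational"
        try:
            flag = flag_file.read_text(encoding="utf-8").strip()
        except OSError:
            # Flag file missing or unreadable for this device.
            result[device.name] = "unknown"
            continue
        result[device.name] = {"1": "HDD", "0": "SSD"}.get(flag, "unknown")
    return result
```

The sys_block parameter exists only so the function can be pointed at a test directory; on a real host the default is what you want.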

Network Information

  • Network interfaces and their states
  • IP addresses (IPv4 and IPv6)
  • MAC addresses and MTU settings
  • Routing table
  • DNS configuration
  • Listening ports
  • Network interface statistics

Hypervisor Detection

  • Virtualization type and role (guest/host)
  • KVM/Libvirt: Version, running VMs, networks, storage pools
  • Proxmox VE: Version, cluster status, VMs, containers, storage
  • LXD/LXC: Version, containers, storage, networks, cluster
  • Docker: Version, running/total containers, images count
  • Podman: Version and availability
  • VMware ESXi: Detection and version
  • Hyper-V: Detection via kernel modules
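
Detection via kernel modules can be sketched as follows for the Hyper-V case: a Linux guest on Hyper-V normally loads drivers such as hv_vmbus, hv_storvsc, and hv_netvsc, which are listed in /proc/modules. This Python example is an assumption about the general technique, not the role's actual task; the module set and function name are illustrative.

```python
# Hyper-V guest drivers shipped with the Linux kernel (illustrative subset).
HYPERV_MODULES = {"hv_vmbus", "hv_storvsc", "hv_netvsc", "hv_utils", "hv_balloon"}


def loaded_hyperv_modules(proc_modules_text):
    """Return the Hyper-V guest modules named in /proc/modules-style text."""
    loaded = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The module name is the first whitespace-separated field.
            loaded.add(fields[0])
    return sorted(loaded & HYPERV_MODULES)
```

An empty result is not conclusive: drivers compiled directly into the kernel do not appear in /proc/modules.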

Output Structure

JSON File Location

Statistics are saved to:

<system_info_stats_base_dir>/<fqdn>/system_info.json
<system_info_stats_base_dir>/<fqdn>/system_info_<timestamp>.json (backup)
<system_info_stats_base_dir>/<fqdn>/summary.txt (human-readable)

JSON Structure

{
  "collection_info": {
    "timestamp": "ISO8601 timestamp",
    "collected_by": "ansible",
    "role_version": "1.0.0"
  },
  "host_info": { ... },
  "system": { ... },
  "kernel": { ... },
  "hardware": { ... },
  "security": { ... },
  "cpu": { ... },
  "gpu": { ... },
  "memory": { ... },
  "swap": { ... },
  "disk": { ... },
  "network": { ... },
  "hypervisor": { ... }
}
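
Consumers of the export may want to confirm that the documented top-level sections are present before relying on them. A minimal Python sketch follows; the function name is hypothetical, and it assumes that a section disabled through a feature toggle may legitimately be absent, which this README does not specify.

```python
import json

# Top-level keys taken from the JSON structure documented above.
EXPECTED_SECTIONS = [
    "collection_info", "host_info", "system", "kernel", "hardware",
    "security", "cpu", "gpu", "memory", "swap", "disk", "network", "hypervisor",
]


def missing_sections(json_text):
    """Return the documented top-level keys that are absent from the export."""
    data = json.loads(json_text)
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")
    return [key for key in EXPECTED_SECTIONS if key not in data]
```

An empty list means every documented section is present; the content of each section is not inspected.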

Dependencies

None. This role is standalone and has no dependencies on other roles.

Example Playbook

Basic Usage

---
- hosts: all
  become: true
  roles:
    - role: system_info

Custom Statistics Directory

---
- hosts: all
  become: true
  roles:
    - role: system_info
      vars:
        system_info_stats_base_dir: /var/lib/ansible/inventory

Selective Information Gathering

---
- hosts: servers
  become: true
  roles:
    - role: system_info
      vars:
        system_info_gather_cpu: true
        system_info_gather_gpu: false
        system_info_gather_memory: true
        system_info_detect_hypervisor: true

Using Tags for Partial Execution

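# Ansible runs every task that matches ANY listed tag. If all tasks in the
# role also carry the system_info tag, listing it would select the whole role,
# so pass only the narrow tag.

# Gather only CPU information
ansible-playbook site.yml -t cpu

# Gather only hypervisor information
ansible-playbook site.yml -t hypervisor

# Run validation/health checks only
ansible-playbook site.yml -t validate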

# Skip installation, only gather information
ansible-playbook site.yml -t system_info --skip-tags install

Available Tags

| Tag | Purpose |
|-----|---------|
| install | Install required packages |
| gather | All information gathering tasks |
| system | System and OS information |
| cpu | CPU information |
| gpu | GPU information |
| memory | Memory information |
| disk | Disk information |
| network | Network information |
| hypervisor | Hypervisor detection |
| export | Export statistics to JSON |
| statistics | Statistics aggregation |
| validate | Validation and health checks |
| health-check | System health monitoring |
| security | Security-related information |

Security Considerations

Privileges

  • Requires root/sudo access for hardware information gathering
  • Uses become: true for privileged commands
  • DMI/SMBIOS information requires root access

Sensitive Data

  • Serial numbers and UUIDs are collected (can identify specific hardware)
  • Network configuration may reveal internal IP addressing
  • No secrets or credentials are collected
  • All data is stored locally on the control node

Data Privacy

  • Statistics files contain detailed system information
  • Restrict access to the statistics directory appropriately
  • Consider encryption for the statistics directory if storing sensitive infrastructure details
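
Restricting access can be as simple as making the statistics tree readable only by its owner. The Python sketch below is one way to do that on the control node; the function name is illustrative and the 0700/0600 modes are an assumption about what "appropriately" means for your environment.

```python
import os
from pathlib import Path


def restrict_stats_dir(base_dir):
    """Set directories to 0700 and files to 0600; return the number of paths changed."""
    count = 0
    for root, _dirs, files in os.walk(base_dir):
        # Owner-only access to the directory itself.
        os.chmod(root, 0o700)
        count += 1
        for name in files:
            # Owner-only read/write on each statistics file.
            os.chmod(Path(root) / name, 0o600)
            count += 1
    return count
```

For example, restrict_stats_dir("./stats/machines") would tighten the default output location. Run it as the user that owns the files, since chmod fails on paths owned by someone else.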

Performance Impact

  • Execution Time: 30-60 seconds per host (depends on hardware complexity)
  • Network Impact: Minimal - only package installation requires network
  • System Load: Very low - read-only operations
  • Disk I/O: Minimal - small JSON files (<100KB typically)

Troubleshooting

Common Issues

Issue: "dmidecode: command not found"

  • Solution: Role will install it automatically. Ensure internet access or pre-stage packages.

Issue: "Permission denied" errors

  • Solution: Ensure become: true is set in the playbook or role invocation.

Issue: SMART data not available

  • Solution: Not all systems/disks support SMART. This is expected and won't fail the role.

Issue: GPU information showing "No GPU detected"

  • Solution: Normal for VMs and servers without GPUs. Not an error condition.

Issue: Hypervisor commands timing out

  • Solution: Some hypervisor checks may be slow. Increase task timeout if needed.

Debug Mode

Run with verbose output:

ansible-playbook site.yml -t system_info -vvv

Compliance Requirements

  • Follows CIS Benchmark recommendations for system auditing
  • Supports security compliance documentation (NIST, PCI-DSS)
  • Enables infrastructure inventory for CMDB integration
  • Facilitates capacity planning and resource optimization

Testing

Manual Testing

# Test on a single host
ansible-playbook -i inventory/production site.yml -l testhost -t system_info

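# Dry-run mode (Ansible skips command and shell tasks in check mode unless
# they set check_mode: false, so expect little or no data to be gathered)
ansible-playbook site.yml -t system_info --check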

Validation

After execution, verify:

  1. Statistics directory created: ./stats/machines/<fqdn>/
  2. JSON file present and valid: system_info.json
  3. Summary file created: summary.txt
  4. No errors in Ansible output
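
The first three checks can be automated. The Python sketch below verifies them for a single host; the function name is hypothetical and the file names are the ones documented under Output Structure.

```python
import json
from pathlib import Path


def check_host_output(base_dir, fqdn):
    """Return a list of problems for one host; an empty list means all checks passed."""
    problems = []
    host_dir = Path(base_dir) / fqdn
    if not host_dir.is_dir():
        return [f"missing directory: {host_dir}"]

    json_file = host_dir / "system_info.json"
    try:
        json.loads(json_file.read_text(encoding="utf-8"))
    except OSError:
        problems.append(f"missing or unreadable: {json_file}")
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError.
        problems.append(f"invalid JSON: {json_file}")

    summary_file = host_dir / "summary.txt"
    if not summary_file.is_file():
        problems.append(f"missing: {summary_file}")
    return problems
```

Calling check_host_output("./stats/machines", "<fqdn>") with a real host name returns an empty list when the directory, a parseable system_info.json, and summary.txt are all in place.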

Maintenance

Updates

  • Review and update the role quarterly
  • Test against new OS versions before production deployment
  • Keep documentation synchronized with code changes

Monitoring

  • Track execution time trends (performance degradation may indicate issues)
  • Monitor statistics file sizes (unexpected growth may indicate problems)
  • Validate JSON file integrity periodically
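
File-size monitoring can be sketched in a few lines of Python. The example below flags exports above a threshold, defaulting to the 100 KB figure quoted under Performance Impact; the function name and the glob pattern (which matches both system_info.json and the timestamped backups) are illustrative assumptions.

```python
from pathlib import Path


def oversized_exports(base_dir, limit_bytes=100 * 1024):
    """Return (path, size) pairs for JSON exports larger than limit_bytes, largest first."""
    found = []
    # One subdirectory per host FQDN, each holding system_info*.json files.
    for path in Path(base_dir).glob("*/system_info*.json"):
        size = path.stat().st_size
        if size > limit_bytes:
            found.append((str(path), size))
    return sorted(found, key=lambda item: item[1], reverse=True)
```

Running this periodically against the statistics base directory gives a quick list of hosts whose exports have grown beyond the expected size.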

Version History

  • 1.0.0 (2025-11-11): Initial release
    • Complete system information gathering
    • CPU, GPU, RAM, Disk, Network detection
    • Hypervisor detection (KVM, Proxmox, LXD, Docker, etc.)
    • JSON export with timestamped backups
    • Human-readable summary generation

License

MIT

Author Information

Created by the Ansible Infrastructure Team for comprehensive system inventory and monitoring.

For issues, questions, or contributions, please refer to the project repository.

Related Roles

  • system_baseline - System hardening and baseline configuration
  • monitoring - System monitoring setup
  • inventory_sync - Dynamic inventory management

Additional Resources