infra-automation/docs/architecture/security-model.md

# Security Model

## Security Architecture Overview

This document describes the security architecture, controls, and practices implemented across the Ansible-managed infrastructure.

## Security Principles

### Defense in Depth
Multiple layers of security controls protect the infrastructure:
1. **Network Security**: Firewalls, network segmentation
2. **Access Control**: SSH keys, least privilege, MFA (planned)
3. **System Hardening**: SELinux/AppArmor, secure configurations
4. **Patch Management**: Automatic security updates
5. **Audit & Logging**: Comprehensive activity tracking
6. **Encryption**: Data at rest and in transit

### Least Privilege
- Service accounts with minimal required permissions
- No root SSH access
- Sudo logging enabled
- Regular access reviews

### Security by Default
- SSH password authentication disabled
- Firewall enabled by default
- SELinux/AppArmor enforcing mode
- Automatic security updates enabled
- Audit daemon (auditd) active

## Access Control

### Authentication

**SSH Key-Based Authentication**:
- RSA 4096-bit or Ed25519 keys
- No password-based SSH login
- Key rotation every 90-180 days
- Root login disabled
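
The rotation window can be checked mechanically. The sketch below is one possible approach, not the repository's actual tooling: `key_age_days` and `key_is_stale` are hypothetical helper names, the file's modification time is assumed to approximate the key's creation date, and `stat -c %Y` assumes GNU coreutils.

```bash
# key_age_days FILE [NOW_EPOCH]
# Print the whole number of days since FILE was last modified.
# NOW_EPOCH defaults to the current time; pass it explicitly for repeatable checks.
key_age_days() (
    mtime=$(stat -c %Y "$1") || exit 2
    now=${2:-$(date +%s)}
    echo $(( (now - mtime) / 86400 ))
)

# key_is_stale FILE [MAX_DAYS] [NOW_EPOCH]
# Succeed (exit 0) when the key is older than MAX_DAYS (default 180, the upper
# bound of the rotation window above). Exit 2 if the file cannot be read.
key_is_stale() (
    age=$(key_age_days "$1" "$3") || exit 2
    [ "$age" -gt "${2:-180}" ]
)
```

For example, `key_is_stale ~/.ssh/id_ed25519.pub 180 && echo "rotate this key"` flags a key that has passed the 180-day limit.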

**Service Accounts**:
- `ansible` user on all managed systems
- Passwordless sudo with logging
- SSH public keys pre-deployed
- Not used for interactive logins (Ansible still requires a valid login shell on the target)

### Authorization

**Sudo Configuration** (`/etc/sudoers.d/ansible`):
```
ansible ALL=(ALL) NOPASSWD: ALL
Defaults:ansible !requiretty
Defaults:ansible log_output
```

**Future Enhancements**:
- RBAC via Ansible Tower/AWX
- Multi-factor authentication (MFA)
- Privileged access management (PAM)

## Network Security

### Firewall Configuration

**Debian/Ubuntu (UFW)**:
```bash
# Default policies
ufw default deny incoming
ufw default allow outgoing

# Allow SSH
ufw allow 22/tcp

# Application-specific rules added per VM
```

**RHEL/AlmaLinux (firewalld)**:
```bash
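# Allow SSH in the drop zone first; a rule in the public zone would not
# apply to interfaces that fall into the default (drop) zone
firewall-cmd --permanent --zone=drop --add-service=ssh
firewall-cmd --reload

# Default zone: drop
firewall-cmd --set-default-zone=drop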
```

### Network Segmentation

| Zone | Purpose | Access Control |
|------|---------|---------------|
| Management | Ansible control, tooling | Restricted to ops team |
| Hypervisor | KVM hosts | Ansible control node only |
| Production VMs | Live services | Application-specific rules |
| Staging VMs | Testing | More permissive for testing |
| Development VMs | Dev/test | Minimal restrictions |

### SSH Hardening

**Configuration** (`/etc/ssh/sshd_config.d/99-security.conf`):
```ini
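# sshd uses the first value it reads for each keyword, so lower-numbered
# drop-ins in sshd_config.d take precedence over this file
PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes
# GSSAPI is explicitly disabled per CLAUDE.md
GSSAPIAuthentication no
MaxAuthTries 3
ClientAliveInterval 300
ClientAliveCountMax 2
X11Forwarding no
# OpenSSH 7.4 and later accept only SSH protocol 2, so the deprecated
# "Protocol" keyword is omitted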
```
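
To confirm that a host actually carries these settings, the drop-in can be compared against the expected values. The following is a minimal sketch under stated assumptions: `check_sshd_config` is a hypothetical helper name, it checks a subset of the keywords listed above, and it inspects only the single file it is given rather than sshd's merged configuration.

```bash
# check_sshd_config FILE
# Report keywords in FILE that differ from the hardened values above.
# Exit status 0 means every checked keyword has the expected value.
check_sshd_config() (
    file=$1
    status=0
    # Expected keyword/value pairs, taken from the configuration above.
    for pair in \
        "PermitRootLogin no" \
        "PasswordAuthentication no" \
        "PubkeyAuthentication yes" \
        "GSSAPIAuthentication no" \
        "MaxAuthTries 3" \
        "X11Forwarding no"
    do
        key=${pair% *}
        want=${pair#* }
        # Keywords are case-insensitive; take the first match, as sshd does.
        got=$(awk -v k="$key" 'tolower($1) == tolower(k) { print $2; exit }' "$file")
        if [ "$got" != "$want" ]; then
            echo "MISMATCH $key: expected '$want', found '${got:-unset}'"
            status=1
        fi
    done
    exit $status
)
```

A typical call would be `check_sshd_config /etc/ssh/sshd_config.d/99-security.conf`. To see the effective values after all includes are merged, `sshd -T` (run as root) prints the resolved configuration.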

## System Hardening

### Mandatory Access Control

**RHEL Family (SELinux)**:
- Mode: `enforcing`
- Policy: `targeted`
- Verification: `getenforce`
- `setenforce 0` is not permitted in production

**Debian Family (AppArmor)**:
- Status: `enabled`
- Mode: `enforce`
- Profiles: All default profiles active

### File System Security

**LVM Mount Options** (CLAUDE.md compliant):
- `/tmp`: mounted with `noexec,nosuid,nodev`
- `/var/tmp`: mounted with `noexec,nosuid,nodev`
- Separate partitions for `/var`, `/var/log`, `/var/log/audit`
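
Whether a mount point really has these options can be read from the kernel's mount table. The sketch below assumes the `/proc/mounts` field layout (device, mount point, filesystem type, options); `check_mount_opts` is a hypothetical helper name, not an existing tool in this repository.

```bash
# check_mount_opts MOUNTPOINT REQUIRED_OPTS [MOUNTS_FILE]
# Verify that MOUNTPOINT is mounted with every option in the comma-separated
# REQUIRED_OPTS list. MOUNTS_FILE defaults to /proc/mounts.
check_mount_opts() (
    mountpoint=$1
    required=$2
    mounts=${3:-/proc/mounts}
    # Field 2 is the mount point, field 4 the option list. Keep the last
    # match, because a later mount shadows earlier ones on the same path.
    opts=$(awk -v m="$mountpoint" '$2 == m { o = $4 } END { print o }' "$mounts")
    if [ -z "$opts" ]; then
        echo "$mountpoint: not a separate mount"
        exit 1
    fi
    status=0
    for opt in $(echo "$required" | tr ',' ' '); do
        # Wrap both sides in commas so "nodev" cannot match inside another option.
        case ",$opts," in
            *",$opt,"*) ;;
            *) echo "$mountpoint: missing $opt"; status=1 ;;
        esac
    done
    exit $status
)
```

Running `check_mount_opts /tmp noexec,nosuid,nodev` and `check_mount_opts /var/tmp noexec,nosuid,nodev` covers the two temporary directories listed above.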

### Kernel Hardening

**sysctl parameters** (`/etc/sysctl.d/99-security.conf`):
```ini
# Network security
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0

# Security hardening
kernel.dmesg_restrict = 1
kernel.kptr_restrict = 2
```
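
A file in `/etc/sysctl.d/` only takes effect once it is loaded, so comparing it against the live kernel values is a useful drift check. The sketch below is one way to do this; `check_sysctl` is a hypothetical helper name, and the optional second argument exists only so the check can be pointed at a directory other than `/proc/sys`. It handles single-value parameters like the ones above.

```bash
# check_sysctl CONF [PROC_ROOT]
# Compare each "key = value" line in CONF with the live value under
# PROC_ROOT (default /proc/sys). Exit status 0 means no drift was found.
check_sysctl() (
    conf=$1
    root=${2:-/proc/sys}
    status=0
    while IFS='=' read -r key want || [ -n "$key" ]; do
        # Trim whitespace around the key and the value.
        key=$(printf '%s' "$key" | tr -d '[:space:]')
        want=$(printf '%s' "$want" | tr -d '[:space:]')
        # Skip blank lines and comments.
        case $key in ''|'#'*) continue ;; esac
        # net.ipv4.conf.all.rp_filter -> net/ipv4/conf/all/rp_filter
        path=$root/$(printf '%s' "$key" | tr . /)
        if [ ! -r "$path" ]; then
            echo "MISSING $key"
            status=1
            continue
        fi
        got=$(tr -d '[:space:]' < "$path")
        if [ "$got" != "$want" ]; then
            echo "DRIFT $key: expected $want, found $got"
            status=1
        fi
    done < "$conf"
    exit $status
)
```

For example, `check_sysctl /etc/sysctl.d/99-security.conf` prints one line per parameter whose running value differs from the file; `sysctl --system` reloads all configuration files if drift is found.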

## Patch Management

### Automatic Security Updates

**Debian/Ubuntu (unattended-upgrades)**:
- Security updates: Automatically installed
- Reboot: Manual (not automatic)
- Notifications: Email on errors

**RHEL/AlmaLinux (dnf-automatic)**:
- Security updates: Automatically applied
- Reboot: Manual (not automatic)
- Logging: All actions logged

### Update Strategy

| Environment | Update Schedule | Testing | Rollback Plan |
|-------------|----------------|---------|---------------|
| Development | Immediate | Minimal | Redeploy if issues |
| Staging | Weekly | Full regression | Snapshot restore |
| Production | Monthly (security: weekly) | Comprehensive | Snapshot + DR plan |

## Secrets Management

### Current: Ansible Vault

**Encrypted Content**:
- SSH private keys
- Service account passwords
- API tokens
- Database credentials

**Location**: `./secrets` directory (private git repository)

**Key Rotation**: Every 90 days
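
Rotating the vault password means re-encrypting every vault file with `ansible-vault rekey`. The sketch below shows one way to find and rekey those files; `list_vault_files` and `rekey_vault_files` are hypothetical helper names, the password-file arguments are assumptions about how the passwords are supplied, and only whole-file encryption is detected (inline `!vault` values are not).

```bash
# list_vault_files DIR
# Print files under DIR whose first line is an Ansible Vault header.
list_vault_files() (
    find "$1" -type f | sort | while IFS= read -r f; do
        # Encrypted files begin with "$ANSIBLE_VAULT;<version>;<cipher>".
        case $(head -n 1 "$f") in
            '$ANSIBLE_VAULT;'*) printf '%s\n' "$f" ;;
        esac
    done
)

# rekey_vault_files DIR OLD_PASS_FILE NEW_PASS_FILE
# Re-encrypt every vault file under DIR with the new password.
rekey_vault_files() (
    list_vault_files "$1" | while IFS= read -r f; do
        # </dev/null keeps ansible-vault from consuming the file list on stdin.
        ansible-vault rekey \
            --vault-password-file "$2" \
            --new-vault-password-file "$3" \
            "$f" < /dev/null || exit 1
    done
)
```

Running `list_vault_files ./secrets` first gives a quick inventory of what a rotation would touch before any file is rewritten.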

### Future: External Secrets Manager

**Planned Integration**:
- HashiCorp Vault
- AWS Secrets Manager
- Azure Key Vault

**Benefits**:
- Centralized secrets management
- Dynamic secret generation
- Audit trail for secret access
- Automated rotation

## Audit & Logging

### Audit Daemon (auditd)

**Enabled on All Systems**:
- Monitors privileged operations
- Logs file access events
- Tracks authentication attempts
- Immutable log files

**Key Rules**:
- Monitor `/etc/sudoers` changes
- Track user account modifications
- Log privileged command execution
- Monitor sensitive file access
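
As an illustration of what such rules can look like, the sketch below prints a small rule set in `auditctl` syntax. It is not the repository's actual rule file: `render_audit_rules` is a hypothetical helper, the key names after `-k` are arbitrary labels, `/etc/ssh/sshd_config` stands in as an example of a sensitive file, and the `auid>=1000` threshold assumes regular user IDs start at 1000.

```bash
# render_audit_rules
# Print example audit rules covering the areas listed above.
render_audit_rules() {
    cat <<'EOF'
# Sudoers changes
-w /etc/sudoers -p wa -k sudoers_changes
-w /etc/sudoers.d/ -p wa -k sudoers_changes
# User account modifications
-w /etc/passwd -p wa -k identity
-w /etc/group -p wa -k identity
-w /etc/shadow -p wa -k identity
# Commands executed as root by a logged-in (non-system) user
-a always,exit -F arch=b64 -S execve -F euid=0 -F auid>=1000 -F auid!=4294967295 -k privileged_exec
# Sensitive file (example)
-w /etc/ssh/sshd_config -p wa -k sshd_config
EOF
}
```

The output could be written to a file under `/etc/audit/rules.d/` (the file name is up to the deployment) and loaded with `augenrules --load`; `ausearch -k sudoers_changes` then lists matching events.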

### Log Management

**Local Logging**:
- `/var/log/audit/audit.log` (auditd)
- `/var/log/auth.log` (authentication - Debian)
- `/var/log/secure` (authentication - RHEL)
- systemd journal (queried with `journalctl`)

**Retention**: 30 days local

**Future**: Centralized logging (ELK, Graylog, or Loki)

### Ansible Execution Logging

All Ansible playbook executions are logged:
- Command executed
- User who executed
- Target hosts
- Timestamp
- Results and changes
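
How this logging is implemented is not shown here. One possible approach is a thin wrapper around `ansible-playbook`, sketched below. `run_logged_playbook`, `AUDIT_LOG`, and `ANSIBLE_PLAYBOOK_BIN` are hypothetical names introduced for illustration; `ANSIBLE_LOG_PATH` is Ansible's own setting for its log file.

```bash
# run_logged_playbook PLAYBOOK [ARGS...]
# Record who ran which command and when, then run the playbook with
# Ansible's own logging pointed at the same file.
run_logged_playbook() (
    log=${AUDIT_LOG:-$HOME/ansible-audit.log}
    bin=${ANSIBLE_PLAYBOOK_BIN:-ansible-playbook}
    stamp=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    # The arguments carry the target hosts (inventory, --limit, and so on).
    printf '%s user=%s command="%s %s"\n' "$stamp" "$(id -un)" "$bin" "$*" >> "$log"
    # ANSIBLE_LOG_PATH makes Ansible append task results to the log file.
    # The && / || chain captures the exit code even under "set -e".
    ANSIBLE_LOG_PATH=$log "$bin" "$@" && rc=0 || rc=$?
    stamp=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    printf '%s user=%s exit=%s\n' "$stamp" "$(id -un)" "$rc" >> "$log"
    exit $rc
)
```

A call such as `run_logged_playbook site.yml --limit web` (playbook and group names are placeholders) leaves a start line, Ansible's task output, and an exit line in the log.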

## Compliance & Standards

### CIS Benchmarks

| Control Area | Implementation | CIS Reference |
|-------------|----------------|---------------|
| SSH Hardening | ✓ Implemented | 5.2.x |
| Firewall | ✓ Enabled | 3.5.x |
| Audit Logging | ✓ Active | 4.1.x |
| File Permissions | ✓ Configured | 1.x |
| User Accounts | ✓ Managed | 5.x |
| SELinux/AppArmor | ✓ Enforcing | 1.6.x |

### NIST Cybersecurity Framework

| Function | Controls | Status |
|----------|----------|--------|
| Identify | Asset inventory (system_info role) | ✓ |
| Protect | Access control, encryption | ✓ |
| Detect | Audit logging, monitoring (planned) | Partial |
| Respond | Incident response playbooks | Planned |
| Recover | DR procedures, backups | Partial |

## Incident Response

### Security Incident Workflow

```
1. Detection
   └─▶ Audit logs, monitoring alerts

2. Containment
   └─▶ Isolate affected systems (firewall rules)
   └─▶ Disable compromised accounts

3. Investigation
   └─▶ Review audit logs
   └─▶ Analyze system state
   └─▶ Identify root cause

4. Eradication
   └─▶ Remove malware/backdoors
   └─▶ Patch vulnerabilities
   └─▶ Restore from clean backups

5. Recovery
   └─▶ Restore services
   └─▶ Verify security posture
   └─▶ Monitor for re-infection

6. Lessons Learned
   └─▶ Document incident
   └─▶ Update playbooks
   └─▶ Improve defenses
```

### Emergency Contacts

- **Security Team**: security@example.com
- **On-Call**: +1-XXX-XXX-XXXX
- **Escalation**: CTO/CISO

## Security Testing

### Regular Activities

**Weekly**:
- Review audit logs
- Check for security updates
- Validate firewall rules

**Monthly**:
- Run system_info for inventory
- Review user access
- Test backup restore

**Quarterly**:
- Vulnerability scanning
- Configuration audits
- DR testing
- Access reviews

### Tools

- **Lynis**: System auditing
- **OpenSCAP**: Compliance scanning
- **ansible-lint**: Playbook security checks
- **AIDE**: File integrity monitoring

## Security Hardening Checklist

### Per-System Checklist

- [ ] SSH hardening applied
- [ ] Firewall configured and enabled
- [ ] SELinux/AppArmor enforcing
- [ ] Automatic security updates enabled
- [ ] Audit daemon running
- [ ] Time synchronization configured
- [ ] LVM with secure mount options
- [ ] Unnecessary services disabled
- [ ] Security packages installed (aide, fail2ban)
- [ ] Root login disabled
- [ ] Service account configured
- [ ] Logs being collected

## Related Documentation

- [Architecture Overview](./overview.md)
- [Network Topology](./network-topology.md)
- [Security Compliance](../security-compliance.md)
- [CLAUDE.md Guidelines](../../CLAUDE.md)

---

- **Document Version**: 1.0.0
- **Last Updated**: 2025-11-11
- **Review Schedule**: Quarterly
- **Document Owner**: Security & Infrastructure Team