Docker User Namespace Remapping - Testing and Implementation Guide
Document Version: 1.0
Last Updated: 2025-11-11
Risk Level: HIGH
Testing Required: YES (Mandatory in dev/test first)
Table of Contents
- Overview
- Security Benefits
- Prerequisites
- Testing Phase (Week 48-49)
- Production Implementation (Week 50)
- Mailcow-Specific Considerations
- Troubleshooting
Overview
User namespace remapping is a Docker security feature that maps container UIDs and GIDs to a range of unprivileged IDs on the host, so that container root is not host root.
Current Status
| Host | User Namespaces | Risk Level | Implementation Priority |
|---|---|---|---|
| pihole | Not configured | MEDIUM | Week 49 (after testing) |
| mymx | Not configured | HIGH | Week 50 (mailcow complexity) |
Impact Assessment
Benefits:
- ✅ Container root ≠ host root (major security improvement)
- ✅ Reduces container escape impact
- ✅ CIS Docker Benchmark compliance (2.13)
Risks:
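- ⚠️ ALL containers must be recreated (existing containers and images are hidden once remapping is enabled)
- ⚠️ Existing volumes are hidden too: Docker switches to a separate data directory (/var/lib/docker/<uid>.<gid>/), so volume data must be migrated and its ownership remapped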
- ⚠️ Breaking change for existing deployments
- ⚠️ Mailcow may have specific requirements
Recommendation: Test thoroughly in dev, then pihole, then mymx (last)
Security Benefits
Without User Namespace Remapping (Current State)
Container: Host:
UID 0 (root) → UID 0 (root) ❌ DANGEROUS
UID 1000 → UID 1000
Problem: Container root is host root, so a process that escapes the container has full root privileges on the host.
With User Namespace Remapping (Target State)
Container: Host:
UID 0 (root) → UID 165536 ✅ SAFE
UID 1000 → UID 166536
Benefit: Container root is an unprivileged user on the host.
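The mapping shown above is plain arithmetic: host UID = start of the subordinate range + container UID. A minimal sketch, assuming the example range used in this guide (start 165536, size 65536); the function name host_uid is made up for illustration and the real range on a host comes from /etc/subuid:

```shell
# host_uid BASE COUNT CONTAINER_UID
# Prints the host UID that CONTAINER_UID maps to, or fails if the
# container UID falls outside the delegated range.
host_uid() {
    base="$1"; count="$2"; cuid="$3"
    if [ "$cuid" -lt 0 ] || [ "$cuid" -ge "$count" ]; then
        echo "container UID $cuid is outside the mapped range (0..$((count - 1)))" >&2
        return 1
    fi
    echo $((base + cuid))
}

host_uid 165536 65536 0      # container root, prints 165536
host_uid 165536 65536 1000   # container UID 1000, prints 166536
```

The same offset applies to GIDs, using the range from /etc/subgid.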
Prerequisites
Before Starting Testing
- VM Snapshots Created
  ansible-playbook playbooks/backup_vm_snapshot.yml \
    -e "target_vms=['pihole', 'mymx']"
- Rollback Procedures Reviewed
  - Read: docs/runbooks/docker-configuration-rollback.md
  - Understand VM snapshot restore process
  - Have emergency contact information ready
- Maintenance Window Scheduled
  - Duration: 2-3 hours for testing
  - Low-traffic period recommended
  - Second person available for verification
- Documentation Ready
  - This guide printed or accessible offline
  - Docker and mailcow documentation available
  - Notepad for documenting issues
Testing Phase (Week 48-49)
Phase 1: Test Environment Setup (Week 48)
Objective: Validate user namespace remapping with simple container
Option A: Use derp VM (Recommended)
# 1. Start derp VM (if stopped)
ssh grokbox "sudo virsh start derp"
# 2. Create ansible user and configure SSH
# (Use deploy_linux_vm role or manual setup)
# 3. Install Docker
ansible derp -m apt -a "name=docker.io state=present" -b
# 4. Create snapshot before testing
ansible-playbook playbooks/backup_vm_snapshot.yml \
-e "target_vms=['derp']"
Option B: Create temporary test container on existing host
# On pihole (low risk - only 1 container)
# Create test container first
docker run -d --name userns-test \
-v test-volume:/data \
alpine:latest sleep infinity
Phase 2: Enable User Namespace Remapping (Week 48)
Step 1: Configure Docker Daemon
# On test host (derp or pihole)
sudo tee /etc/docker/daemon.json <<EOF
{
"userns-remap": "default"
}
EOF
# Validate syntax
cat /etc/docker/daemon.json | jq '.'
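Note that the tee command above replaces the whole file. On a host whose daemon.json already carries other settings, merging the key may be safer. A minimal sketch, assuming jq is installed (it is already used for validation above); the function name merge_userns_remap is hypothetical:

```shell
# merge_userns_remap FILE
# Prints FILE's JSON with "userns-remap": "default" added, keeping all other keys.
# Falls back to a fresh object when FILE is missing or empty.
merge_userns_remap() {
    if [ -s "$1" ]; then
        jq '. + {"userns-remap": "default"}' "$1"
    else
        jq -n '{"userns-remap": "default"}'
    fi
}

# Example (writes to a temporary file first so a jq error cannot truncate the config):
# merge_userns_remap /etc/docker/daemon.json > /tmp/daemon.json.new \
#   && sudo install -m 0644 /tmp/daemon.json.new /etc/docker/daemon.json
```

Writing to a temporary file and installing it afterwards avoids reading and overwriting the same file in one pipeline.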
Step 2: Restart Docker
# Stop all containers first
docker stop $(docker ps -q)
# Restart Docker daemon
sudo systemctl restart docker
# Verify it started
sudo systemctl status docker
# Check for user namespace in docker info
docker info | grep -i "userns"
# Should show a "userns" entry (listed under Security Options)
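For scripted checks, the grep above can be wrapped so that it yields an exit status instead of text to eyeball. A minimal sketch; the helper name has_userns is hypothetical, and the sample strings are an assumed shape of the SecurityOptions list rather than output captured from these hosts:

```shell
# has_userns: reads Docker's security options on stdin and succeeds only if
# a userns entry is present.
has_userns() {
    grep -q 'name=userns'
}

# Live check (requires a running Docker daemon):
# docker info --format '{{json .SecurityOptions}}' | has_userns && echo "remapping active"

# Offline demonstration with sample values (assumed output shape):
echo '["name=apparmor","name=seccomp,profile=builtin","name=userns"]' | has_userns \
    && echo "remapping active"
echo '["name=apparmor","name=seccomp,profile=builtin"]' | has_userns \
    || echo "remapping NOT active"
```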
Step 3: Verify UID Mapping
# Check subuid/subgid configuration
cat /etc/subuid
cat /etc/subgid
# Should show something like:
# dockremap:165536:65536
# Verify Docker is using remapping
docker info --format '{{.SecurityOptions}}'
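Entries in /etc/subuid and /etc/subgid have the form name:start:count. A small sketch for pulling out the range of one user, so the start value can be reused in later steps instead of hard-coding 165536; the function name subid_range is made up for this example:

```shell
# subid_range USER FILE
# Prints "START COUNT" for USER's first entry in a subuid/subgid-style file
# (format: name:start:count). Fails if USER has no entry.
subid_range() {
    awk -F: -v user="$1" '
        $1 == user { print $2, $3; found = 1; exit }
        END { exit found ? 0 : 1 }
    ' "$2"
}

# Demonstration against a sample file (values match the example above):
sample=$(mktemp)
printf 'dockremap:165536:65536\n' > "$sample"
subid_range dockremap "$sample"    # prints: 165536 65536
rm -f "$sample"

# On a real host:
# subid_range dockremap /etc/subuid
```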
Step 4: Recreate Test Container
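# The old container and test-volume are no longer visible: with remapping enabled,
# Docker uses a separate data directory (/var/lib/docker/165536.165536/).
# Remove any leftover container of the same name (no-op on a fresh data directory)
docker rm -f userns-test 2>/dev/null || true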
# Recreate container
docker run -d --name userns-test \
-v test-volume:/data \
alpine:latest sleep infinity
# Verify it's running
docker ps | grep userns-test
Step 5: Test Volume Permissions
# Create test file in container
docker exec userns-test sh -c 'echo "test" > /data/test.txt'
# Check file ownership on host
# Volume location changed! It's now in:
sudo ls -la /var/lib/docker/165536.165536/volumes/test-volume/_data/
# UID should be 165536 (remapped root)
# Test read/write in container
docker exec userns-test cat /data/test.txt
docker exec userns-test sh -c 'echo "test2" >> /data/test.txt'
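The ownership check in Step 5 can be automated rather than read off ls output. A minimal sketch, assuming GNU stat (the -c flag differs on BSD/macOS); check_owner is a hypothetical helper name:

```shell
# check_owner PATH EXPECTED_UID
# Succeeds if PATH is owned by EXPECTED_UID on the host.
# Uses GNU stat (-c); BSD/macOS stat has different flags.
check_owner() {
    actual=$(stat -c '%u' "$1") || return 2
    if [ "$actual" = "$2" ]; then
        echo "OK: $1 is owned by UID $actual"
    else
        echo "MISMATCH: $1 is owned by UID $actual, expected $2" >&2
        return 1
    fi
}

# Example, run as root because /var/lib/docker is not world-readable
# (path and UID taken from the steps above):
# check_owner /var/lib/docker/165536.165536/volumes/test-volume/_data/test.txt 165536
```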
Phase 3: Test with Real Application (Week 48-49)
Test Scenario 1: Simple Web Server (pihole preparation)
# Deploy nginx with volume
docker run -d --name test-nginx \
-p 8080:80 \
-v nginx-data:/usr/share/nginx/html \
nginx:alpine
# Test access
curl http://localhost:8080
# Create content
docker exec test-nginx sh -c 'echo "<h1>User Namespace Test</h1>" > /usr/share/nginx/html/test.html'
# Verify access
curl http://localhost:8080/test.html
# Check logs
docker logs test-nginx
Test Scenario 2: Database Container (mailcow preparation)
# Deploy MariaDB with volume
docker run -d --name test-db \
-e MYSQL_ROOT_PASSWORD=testpass123 \
-v mysql-data:/var/lib/mysql \
mariadb:10.11
# Wait for startup
sleep 30
# Test database
docker exec test-db mysql -ptestpass123 -e "SHOW DATABASES;"
# Create test database
docker exec test-db mysql -ptestpass123 -e "CREATE DATABASE testdb;"
# Stop and restart to test persistence
docker stop test-db
docker start test-db
sleep 20
# Verify data persisted
docker exec test-db mysql -ptestpass123 -e "SHOW DATABASES;" | grep testdb
Test Scenario 3: Application with File Uploads
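# Create upload directory and hand it to the remapped root UID
# (a directory owned by a regular host user is not writable by UID 165536)
mkdir -p /tmp/test-uploads
sudo chown 165536:165536 /tmp/test-uploads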
# Run container with bind mount
docker run -d --name test-upload \
-v /tmp/test-uploads:/uploads \
alpine:latest sleep infinity
# Test file creation
docker exec test-upload sh -c 'echo "test" > /uploads/test.txt'
# Check host permissions
ls -la /tmp/test-uploads/
# File should be owned by UID 165536
# Test file access from container
docker exec test-upload cat /uploads/test.txt
Phase 4: Identify Issues (Week 48-49)
Common Issues to Check
- Permission Denied Errors
  # Check container logs
  docker logs <container_name> 2>&1 | grep -i "permission"
- Volume Mount Failures
  # List volumes
  docker volume ls
  # Inspect volume
  docker volume inspect <volume_name>
  # Check actual location on disk
  sudo ls -la /var/lib/docker/*/volumes/
- Bind Mount Issues
  # For bind mounts, may need to adjust host permissions
  # Example: Allow remapped UID to write
  sudo chown 165536:165536 /path/to/host/dir
- Privileged Container Conflicts
  # Test if privileged containers still work
  # With userns-remap enabled, --privileged is rejected unless the container
  # also opts out of remapping with --userns=host
  docker run --rm --privileged --userns=host alpine:latest id
  # Note: Such containers run without userns remapping (real host root)
Document All Findings
Create test log:
## User Namespace Remapping Test Log
Date: <date>
Host: <hostname>
Docker Version: <version>
### Test 1: Simple Container
- Result: PASS/FAIL
- Issues: <none or list>
- Notes: <observations>
### Test 2: Web Server
- Result: PASS/FAIL
- Issues: <none or list>
- Notes: <observations>
### Test 3: Database
- Result: PASS/FAIL
- Issues: <none or list>
- Notes: <observations>
### Conclusion
Ready for production: YES/NO
Blockers: <list if any>
Production Implementation (Week 50)
Implementation Order
- pihole (Week 49 end / Week 50 start) - Lowest risk
- mymx (Week 50 end) - Highest risk, requires mailcow-specific testing
pihole Implementation
Prerequisites:
- ✅ Testing completed successfully on derp/test environment
- ✅ VM snapshot created
- ✅ Maintenance window scheduled
- ✅ Rollback procedure reviewed
Steps:
# 1. Create snapshot
ansible-playbook playbooks/backup_vm_snapshot.yml \
-e "target_vms=['pihole']" \
-e "snapshot_description='Pre user namespace implementation'"
# 2. Backup current configuration
ansible pihole -m shell -a "sudo cp /etc/docker/daemon.json /etc/docker/daemon.json.backup.$(date +%s)" -b
# 3. Stop pihole container
ansible pihole -m shell -a "docker stop pihole" -b
# 4. Configure user namespace remapping
ansible pihole -m copy -b -a "
dest=/etc/docker/daemon.json
content='{\"userns-remap\": \"default\"}'
owner=root
group=root
mode='0644'
"
# 5. Restart Docker
ansible pihole -m systemd -a "name=docker state=restarted" -b
# 6. Verify Docker started
ansible pihole -m shell -a "docker info | grep -i userns" -b
# 7. Recreate pihole container (adjust based on actual deployment)
# If using docker run command, re-run it
# If using docker-compose, run: docker-compose up -d
# 8. Verify pihole is working
ansible pihole -m shell -a "docker ps" -b
ansible pihole -m shell -a "docker logs pihole --tail 50" -b
# 9. Test DNS functionality
dig @192.168.122.12 google.com
# 10. Monitor for 1 hour
watch -n 60 'ansible pihole -m shell -a "docker ps" -b'
Rollback if Issues:
# Follow docs/runbooks/docker-configuration-rollback.md
# Procedure 3: User Namespace Remapping Rollback
Mailcow-Specific Considerations
Why Mailcow is Complex
- Multiple interconnected containers (24 containers)
- Persistent data in multiple volumes (mail, databases, configs)
- File permissions critical for mail delivery
- Active production service - downtime impact high
Mailcow Testing Approach (Week 49-50)
Phase 1: Research (Week 49)
# 1. Check mailcow documentation
# Search: "user namespace" or "userns-remap"
# URL: https://docs.mailcow.email/
# 2. Check mailcow GitHub issues
# Search for: userns, user namespace, permission issues
# 3. Check mailcow community forum
# URL: https://community.mailcow.email/
# Search for similar implementations
Phase 2: Mailcow Test Environment (Week 49)
Option A: Deploy test mailcow on derp
# Requires:
# - 4GB+ RAM (derp may be too small)
# - 20GB+ disk space
# - Domain for testing
# Install mailcow on derp
git clone https://github.com/mailcow/mailcow-dockerized
cd mailcow-dockerized
./generate_config.sh
docker-compose up -d
Option B: Clone mymx mailcow config to test environment
# Create test VM clone
# Copy mailcow configuration
# Test with user namespaces
Phase 3: Mailcow Volume Analysis (Week 49)
# On mymx, identify all volumes
docker volume ls | grep mailcow
# Check critical volumes
docker volume inspect mailcowdockerized_vmail-vol-1
docker volume inspect mailcowdockerized_mysql-vol-1
# Document current permissions
for vol in $(docker volume ls -q | grep mailcow); do
echo "=== $vol ==="
sudo ls -la /var/lib/docker/volumes/$vol/_data/ | head -20
done > /tmp/mailcow-permissions-before.txt
Phase 4: Mailcow Implementation (Week 50 - IF testing successful)
ONLY proceed if:
- ✅ Testing in dev environment successful
- ✅ pihole implementation successful
- ✅ Mailcow community confirms no known issues
- ✅ Extended maintenance window available (2-4 hours)
- ✅ Full backups completed
- ✅ Rollback tested and confirmed working
Implementation Steps:
# 1. Create snapshot
ansible-playbook playbooks/backup_vm_snapshot.yml \
-e "target_vms=['mymx']" \
-e "snapshot_description='Pre mailcow user namespace'"
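# 2. Backup ALL mailcow data
# (set MAILCOW_BACKUP_LOCATION so the script does not prompt for a path; <backup_dir> is a placeholder)
ansible mymx -m shell -a "cd /opt/mailcow-dockerized && MAILCOW_BACKUP_LOCATION=<backup_dir> ./helper-scripts/backup_and_restore.sh backup all" -b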
# 3. Stop mailcow
ansible mymx -m shell -a "cd /opt/mailcow-dockerized && docker-compose down" -b
# 4. Backup current state
ansible mymx -m shell -a "
sudo tar -czf /root/mailcow-pre-userns-$(date +%s).tar.gz \
/etc/docker \
/opt/mailcow-dockerized \
/var/lib/docker/volumes/mailcow*
" -b
# 5. Configure user namespace
ansible mymx -m shell -a "
sudo cp /etc/docker/daemon.json /etc/docker/daemon.json.backup.$(date +%s)
echo '{\"userns-remap\": \"default\"}' | sudo tee /etc/docker/daemon.json
" -b
# 6. Restart Docker
ansible mymx -m systemd -a "name=docker state=restarted" -b
# 7. Verify Docker started with user namespaces
ansible mymx -m shell -a "docker info | grep -i userns" -b
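# 8. Start mailcow (will recreate all containers; images are pulled again and named
#    volumes start empty unless their data was migrated to /var/lib/docker/<uid>.<gid>/volumes/ first)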
ansible mymx -m shell -a "cd /opt/mailcow-dockerized && docker-compose up -d" -b
# 9. Monitor startup
watch -n 10 'ansible mymx -m shell -a "cd /opt/mailcow-dockerized && docker-compose ps" -b'
# 10. Check logs for permission errors
ansible mymx -m shell -a "cd /opt/mailcow-dockerized && docker-compose logs --tail 100" -b | grep -i "permission\|denied\|failed"
# 11. Test mail functionality
# - Send test email
# - Receive test email
# - Check webmail access
# - Verify SOGo groupware
# - Test IMAP/SMTP connections
# 12. Monitor for 4-8 hours before declaring success
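The log check in step 10 can be turned into a pass/fail gate for the monitoring period. A minimal sketch that uses the same pattern as the grep above; the function name scan_permission_errors is hypothetical, and the pattern will also match harmless lines that merely contain "failed", so treat hits as prompts for review:

```shell
# scan_permission_errors: reads log text on stdin, prints matching lines,
# and returns non-zero when at least one suspicious line was found.
scan_permission_errors() {
    if grep -iE 'permission|denied|failed'; then
        return 1
    fi
    return 0
}

# Example (run on mymx from the mailcow directory):
# docker-compose logs --tail 100 | scan_permission_errors && echo "no permission errors found"
```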
Known Potential Issues with Mailcow:
- Vmail Volume Permissions
  # If mail delivery fails with permission errors
  # May need to adjust permissions (LAST RESORT)
  # Caution: this makes every file owned by container root; files owned by another
  # container user need base + container UID on the host (e.g. container UID 5000 → host UID 170536)
  sudo chown -R 165536:165536 /var/lib/docker/165536.165536/volumes/mailcowdockerized_vmail-vol-1/_data/
- MySQL Volume Issues
  # If database won't start
  # Check MySQL logs
  docker logs mailcowdockerized-mysql-mailcow-1
  # May need database permission fixes
  # This is why testing is CRITICAL
- Dovecot Permission Issues
  # Dovecot is sensitive to mail file permissions
  # May require config adjustments in mailcow.conf
Mailcow Rollback Decision Point
Roll back immediately if:
- Docker daemon won't start
- MySQL container won't start
- Cannot send/receive mail after 15 minutes
- Permission errors in critical containers
- Data appears missing/inaccessible
Use VM snapshot restore if:
- Multiple containers failing
- Data corruption suspected
- Cannot resolve within 30 minutes
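The measurable criteria above can be written down as a small decision helper, which makes the thresholds explicit during an incident. This is only a sketch: rollback_action is a made-up name, and it covers only the conditions that reduce to a yes/no flag or a number (permission errors, missing data and suspected corruption still need human judgment).

```shell
# rollback_action DAEMON_UP MYSQL_UP MAIL_MINUTES_DOWN FAILING_CONTAINERS MINUTES_UNRESOLVED
# DAEMON_UP / MYSQL_UP are "yes" or "no"; the other arguments are integers.
# Prints one of: snapshot-restore, rollback, continue
rollback_action() {
    daemon_up="$1"; mysql_up="$2"; mail_down="$3"; failing="$4"; unresolved="$5"
    if [ "$failing" -gt 1 ] || [ "$unresolved" -ge 30 ]; then
        echo "snapshot-restore"
    elif [ "$daemon_up" != "yes" ] || [ "$mysql_up" != "yes" ] || [ "$mail_down" -ge 15 ]; then
        echo "rollback"
    else
        echo "continue"
    fi
}

rollback_action yes yes 0 0 0     # prints: continue
rollback_action yes no 0 1 5      # prints: rollback
rollback_action yes no 20 6 10    # prints: snapshot-restore
```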
Troubleshooting
Issue 1: Docker Daemon Won't Start
Symptoms:
systemctl status docker
# Failed to start Docker Application Container Engine
Solutions:
# Check logs
journalctl -u docker -n 100 --no-pager
# Common causes:
# 1. Invalid daemon.json syntax
cat /etc/docker/daemon.json | jq '.'
# 2. Subuid/subgid not configured
cat /etc/subuid
cat /etc/subgid
# Should have dockremap:165536:65536
# 3. Restore backup
sudo cp /etc/docker/daemon.json.backup.<timestamp> /etc/docker/daemon.json
sudo systemctl start docker
Issue 2: Container Won't Start - Permission Denied
Symptoms:
docker logs <container>
# Permission denied errors
Solutions:
# 1. Check volume location
docker volume inspect <volume_name>
# 2. Check permissions on host
sudo ls -la /var/lib/docker/165536.165536/volumes/<volume>/_data/
# 3. If permissions wrong, may need to adjust
# (Avoid this if possible - indicates larger problem)
sudo chown -R 165536:165536 /var/lib/docker/165536.165536/volumes/<volume>/_data/
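A blanket chown -R assigns every file to remapped container root, which is wrong for data that belongs to other container users. When files still carry their original (un-remapped) owners, for example after copying data from the old volume location, shifting each owner by the range start preserves who owns what. A dry-run sketch, assuming GNU find (-printf); shift_ownership_plan is a hypothetical name, and the function only prints commands:

```shell
# shift_ownership_plan DIR BASE
# Prints one chown command per entry under DIR, adding BASE to the current
# owner UID and GID. Nothing is changed; review the output before running it.
# Requires GNU find (-printf). Paths containing newlines or single quotes
# are not handled.
shift_ownership_plan() {
    find "$1" -printf '%U %G %p\n' | while read -r uid gid path; do
        echo "chown -h $((uid + $2)):$((gid + $2)) '$path'"
    done
}

# Example (volume name is a placeholder, as in the steps above):
# shift_ownership_plan /var/lib/docker/165536.165536/volumes/<volume>/_data 165536
```

Run the plan only once per directory: applying it to files that are already shifted would add the offset a second time.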
Issue 3: Bind Mounts Not Working
Symptoms:
docker logs <container>
# Cannot access /bind/mount/path
Solutions:
# Bind mounts need host directory permissions adjusted
sudo chown 165536:165536 /path/to/bind/mount
# Or use volumes instead of bind mounts
# Volumes are handled automatically by Docker
Issue 4: Privileged Container Needed
Note: Privileged containers (like mailcow netfilter) cannot run inside the remapped namespace. Docker rejects --privileged under userns-remap unless the container is started with --userns=host (userns_mode: "host" in Compose).
# Verify privileged container still works
docker inspect <container> | grep -i privileged
# Should show: "Privileged": true
# Privileged containers started with --userns=host run as actual root (no remapping)
# This is expected for netfilter, acceptable risk (documented)
Success Criteria
Testing Phase Success (Before Production)
- Simple container runs successfully
- Web server container accessible
- Database container stores/retrieves data
- Volume permissions correct (165536 UID)
- Bind mounts work (if needed)
- No permission errors in logs
- Can recreate containers after Docker restart
- Rollback procedure tested and successful
Production Implementation Success
pihole
- VM snapshot created
- Docker daemon running with user namespaces
- pihole container running
- DNS queries working
- No permission errors in logs
- Monitoring shows normal operation for 24+ hours
mymx/mailcow
- VM snapshot created
- Docker daemon running with user namespaces
- All 24 containers running
- Can send email
- Can receive email
- Webmail accessible
- SOGo groupware working
- No permission errors in logs
- Monitoring shows normal operation for 48+ hours
- Full service verification completed
Decision Tree
START: Ready to enable user namespaces?
│
├─ Testing completed in dev?
│ ├─ NO → STOP: Complete testing first
│ └─ YES → Continue
│
├─ VM snapshots created?
│ ├─ NO → STOP: Create snapshots first
│ └─ YES → Continue
│
├─ Rollback procedure reviewed?
│ ├─ NO → STOP: Review rollback docs
│ └─ YES → Continue
│
├─ Which host?
│ ├─ pihole → Proceed (lower risk)
│ └─ mymx → Additional checks needed
│ │
│ ├─ Mailcow community research done?
│ │ ├─ NO → STOP: Research first
│ │ └─ YES → Continue
│ │
│ ├─ pihole implementation successful?
│ │ ├─ NO → STOP: Fix pihole first
│ │ └─ YES → Continue
│ │
│ ├─ Extended maintenance window?
│ │ ├─ NO → STOP: Schedule proper window
│ │ └─ YES → Proceed with caution
│ │
│ └─ Proceed with mymx (high risk)
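The tree above maps directly onto a sequence of guard checks. A minimal sketch of it as a pre-flight function that could be called at the top of a change script; preflight is a hypothetical name and the yes/no flags are supplied by the operator, not detected automatically:

```shell
# preflight HOST TESTING_DONE SNAPSHOTS ROLLBACK_REVIEWED [RESEARCH_DONE PIHOLE_OK WINDOW_OK]
# Flags are "yes" or "no". Prints PROCEED or the first STOP reason; returns 1 on STOP.
preflight() {
    host="$1"
    [ "$2" = yes ] || { echo "STOP: Complete testing first"; return 1; }
    [ "$3" = yes ] || { echo "STOP: Create snapshots first"; return 1; }
    [ "$4" = yes ] || { echo "STOP: Review rollback docs"; return 1; }
    if [ "$host" = mymx ]; then
        [ "${5:-no}" = yes ] || { echo "STOP: Research first"; return 1; }
        [ "${6:-no}" = yes ] || { echo "STOP: Fix pihole first"; return 1; }
        [ "${7:-no}" = yes ] || { echo "STOP: Schedule proper window"; return 1; }
        echo "PROCEED with caution (high risk)"
    else
        echo "PROCEED (lower risk)"
    fi
}

preflight pihole yes yes yes                   # prints: PROCEED (lower risk)
preflight mymx yes yes yes yes no yes || true  # prints: STOP: Fix pihole first
```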
References
- Docker User Namespace Documentation: https://docs.docker.com/engine/security/userns-remap/
- CIS Docker Benchmark 2.13: Enable user namespace support
- Mailcow Documentation: https://docs.mailcow.email/
- NIST SP 800-190: Section 4.4 - Host OS and multi-tenancy
Document Version: 1.0
Next Review: After testing completion (Week 49)
Owner: Infrastructure Security Team