add project structure files

Username
2025-12-20 03:31:37 +01:00
parent 202e927918
commit 4cb29fa3d2
4 changed files with 360 additions and 0 deletions

104
PROJECT.md Normal file

@@ -0,0 +1,104 @@
# FlaskPaste
## Purpose
FlaskPaste is a lightweight, security-hardened pastebin REST API for self-hosted deployments. It provides a minimal, dependency-light alternative to public pastebin services, designed for environments where data privacy, authentication control, and operational simplicity are priorities.
## Problem Statement
Public pastebin services present risks:
- Data sovereignty concerns (content stored on third-party infrastructure)
- Limited authentication options
- No control over retention policies
- Abuse/spam from other users affecting service reliability
- Feature bloat and complex UIs when only an API is needed
## Solution
A self-hosted pastebin API that:
- Stores pastes locally in SQLite
- Supports client certificate authentication via reverse proxy
- Automatically expires pastes after a period without access (time-based expiry, refreshed on each access)
- Prevents abuse through content-hash deduplication
- Serves text and binary content with proper MIME detection
- Runs behind any reverse proxy (nginx, HAProxy, Caddy)
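A minimal sketch of how content-hash deduplication could work, assuming a SHA-256 digest and a fixed time window. The table name `content_hashes`, the constant `DEDUP_WINDOW_SECONDS`, and the function `find_or_register` are hypothetical, and returning the earlier paste ID is only one possible policy (rejecting the duplicate is another). None of this is taken from the actual FlaskPaste code.

```python
import hashlib
import sqlite3
import time
from typing import Optional

DEDUP_WINDOW_SECONDS = 3600  # assumed window; the real value would come from configuration


def init_db(conn: sqlite3.Connection) -> None:
    """Create a hypothetical table that tracks recently seen content hashes."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS content_hashes ("
        "hash TEXT PRIMARY KEY, paste_id TEXT NOT NULL, created_at REAL NOT NULL)"
    )


def find_or_register(
    conn: sqlite3.Connection,
    content: bytes,
    paste_id: str,
    now: Optional[float] = None,
) -> str:
    """Return the earlier paste ID for identical recent content, else register paste_id."""
    now = time.time() if now is None else now
    digest = hashlib.sha256(content).hexdigest()
    row = conn.execute(
        "SELECT paste_id, created_at FROM content_hashes WHERE hash = ?", (digest,)
    ).fetchone()
    if row is not None and now - row[1] < DEDUP_WINDOW_SECONDS:
        return row[0]  # duplicate inside the window: reuse the earlier paste
    # First sighting, or the window has passed: record this submission.
    conn.execute(
        "INSERT OR REPLACE INTO content_hashes (hash, paste_id, created_at) "
        "VALUES (?, ?, ?)",
        (digest, paste_id, now),
    )
    return paste_id
```

Keying on the content hash rather than the client address is what lets this approach limit repeated submissions without tracking IPs.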
## Success Criteria
```
┌────────────────────────────────┬────────────────────────────────────────────┐
│ Criterion                      │ Metric
├────────────────────────────────┼────────────────────────────────────────────┤
│ Security                       │ Zero injection vulnerabilities
│                                │ All OWASP headers implemented
│                                │ Input validation on all endpoints
├────────────────────────────────┼────────────────────────────────────────────┤
│ Reliability                    │ SQLite ACID guarantees
│                                │ Graceful degradation on errors
│                                │ Health check endpoint for monitoring
├────────────────────────────────┼────────────────────────────────────────────┤
│ Simplicity                     │ Single dependency (Flask)
│                                │ SQLite for storage (no external DB)
│                                │ Environment-based configuration
├────────────────────────────────┼────────────────────────────────────────────┤
│ Operability                    │ Container-ready (Podman/Docker)
│                                │ Gunicorn-compatible WSGI
│                                │ Request tracing via X-Request-ID
└────────────────────────────────┴────────────────────────────────────────────┘
```
## Scope
### In Scope
- REST API for paste CRUD operations
- Text and binary content support
- Magic-byte MIME type detection
- Client certificate authentication (via proxy header)
- Configurable size limits (anon vs authenticated)
- Time-based expiry with access-touch semantics
- Content-hash deduplication for abuse prevention
- Security headers (HSTS, CSP, X-Frame-Options, etc.)
- Request tracing and structured logging
- Container deployment support
- SQLite storage
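To illustrate the magic-byte MIME detection item above, here is a small sketch that matches well-known file signatures and falls back to a UTF-8 check. The signature table and the function name `detect_mime` are illustrative assumptions, not the project's actual detection code, which may cover different formats.

```python
# Hypothetical signature table; a real implementation would likely cover more formats.
MAGIC_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
    (b"PK\x03\x04", "application/zip"),
    (b"\x1f\x8b", "application/gzip"),
]


def detect_mime(content: bytes) -> str:
    """Guess a MIME type from leading bytes, falling back to a text/binary check."""
    for signature, mime in MAGIC_SIGNATURES:
        if content.startswith(signature):
            return mime
    try:
        content.decode("utf-8")
    except UnicodeDecodeError:
        # Not a known signature and not valid UTF-8: treat as opaque binary.
        return "application/octet-stream"
    return "text/plain"
```

In this sketch, empty content is classified as `text/plain` and a truncated signature such as `b"\x89PN"` falls through to `application/octet-stream`; whether those are the right answers for edge cases is a design decision for the project.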
### Out of Scope
- Web UI / HTML frontend
- User registration / account management
- Syntax highlighting
- Paste forking / versioning
- Public paste listing / discovery
- Rate limiting per IP (delegated to reverse proxy)
- Multi-node clustering / distributed storage
- Alternative storage backends (S3, PostgreSQL)
## Constraints
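- **Single process** - SQLite allows only one writer at a time, which limits write concurrency; scale via multiple containers
- **Reverse proxy required** - Client cert auth relies on a TLS-terminating proxy that verifies the certificate and forwards the identity in a header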
- **No web UI** - API-only; CLI tools (curl, httpie) are the interface
- **Ephemeral by design** - Pastes expire; not for permanent storage
## Assumptions
- Deployment behind a TLS-terminating reverse proxy
- Client certificates managed externally (PKI, mTLS)
- Operators have a container runtime (Podman/Docker) or a Python virtual environment
- SQLite performance is sufficient for the expected load
## Technical Stack
```
┌─────────────────┬──────────────────────────────────────────────────────────┐
│ Component       │ Technology
├─────────────────┼──────────────────────────────────────────────────────────┤
│ Framework       │ Flask 3.x
│ Database        │ SQLite 3 (built-in)
│ WSGI Server     │ Gunicorn (production)
│ Container       │ Podman / Docker
│ Testing         │ pytest, pytest-cov
│ Python          │ 3.11+
└─────────────────┴──────────────────────────────────────────────────────────┘
```

145
ROADMAP.md Normal file

@@ -0,0 +1,145 @@
# FlaskPaste Roadmap
## Current State
FlaskPaste v1.0 is feature-complete for its core mission: a secure, minimal pastebin API.
**Implemented:**
- Full REST API (CRUD operations)
- Binary content support with magic-byte MIME detection
- Client certificate authentication
- Content-hash deduplication (abuse prevention)
- Automatic paste expiry
- Security headers and request tracing
- Container deployment support
- Comprehensive test suite
## Phase 1: Hardening (Current)
Focus: Production readiness and operational excellence.
```
┌───┬─────────────────────────────────────┬────────────────────────────────────┐
│ # │ Milestone                           │ Status
├───┼─────────────────────────────────────┼────────────────────────────────────┤
│ 1 │ Abuse prevention (dedup)            │ Implemented (pending commit)
│ 2 │ Security headers complete           │ Done
│ 3 │ Request tracing (X-Request-ID)      │ Done
│ 4 │ Proxy trust validation              │ Done
│ 5 │ Test coverage > 90%                 │ In progress
│ 6 │ Documentation complete              │ In progress
└───┴─────────────────────────────────────┴────────────────────────────────────┘
```
## Phase 2: Operations
Focus: Deployment, monitoring, and maintenance tooling.
```
┌───┬─────────────────────────────────────┬────────────────────────────────────┐
│ # │ Milestone                           │ Dependencies
├───┼─────────────────────────────────────┼────────────────────────────────────┤
│ 1 │ Prometheus metrics endpoint         │ None
│ 2 │ Structured JSON logging             │ None
│ 3 │ Admin API (stats, cleanup)          │ Auth improvements
│ 4 │ Ansible deployment role             │ None
│ 5 │ CI/CD pipeline                      │ Container registry access
└───┴─────────────────────────────────────┴────────────────────────────────────┘
```
### Prometheus Metrics
Expose a `/metrics` endpoint with:
- `flaskpaste_pastes_total` (counter)
- `flaskpaste_pastes_created` (counter)
- `flaskpaste_pastes_deleted` (counter)
- `flaskpaste_pastes_expired` (counter)
- `flaskpaste_storage_bytes` (gauge)
- `flaskpaste_request_duration_seconds` (histogram)
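As a rough illustration of what such an endpoint would return, the sketch below renders counters and gauges in the Prometheus text exposition format using only the standard library. The function `render_metrics` is hypothetical, histograms are left out for brevity, and a real implementation would more likely rely on an established client library than hand-rolled formatting.

```python
from typing import Dict, Union

Number = Union[int, float]


def render_metrics(counters: Dict[str, Number], gauges: Dict[str, Number]) -> str:
    """Render metrics as Prometheus text exposition lines (counters and gauges only)."""
    lines = []
    for name, value in sorted(counters.items()):
        lines.append(f"# TYPE {name} counter")
        lines.append(f"{name} {value}")
    for name, value in sorted(gauges.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    # The exposition format expects a trailing newline after the last sample.
    return "\n".join(lines) + "\n"
```

For example, `render_metrics({"flaskpaste_pastes_created": 3}, {"flaskpaste_storage_bytes": 1024})` yields one `# TYPE` line and one sample line per metric.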
### Structured Logging
Replace text logs with JSON format:
- Timestamp, level, message, request_id
- Consistent field names across all log entries
- Compatible with log aggregation (Loki, ELK)
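One way to get these fields with the standard `logging` module is a custom formatter, sketched below. The class name `JsonFormatter` and the convention of passing `request_id` through `extra=` are assumptions for illustration; the TODO list mentions `structlog` as an alternative still under evaluation.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Format each log record as a single JSON object with fixed field names."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
            # request_id is expected via `extra=`; it is None when not supplied.
            "request_id": getattr(record, "request_id", None),
        }
        return json.dumps(entry)
```

A handler would opt in with `handler.setFormatter(JsonFormatter())`, and callers would log with `logger.info("paste created", extra={"request_id": rid})`.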
## Phase 3: Features
Focus: User-requested enhancements within scope.
```
┌───┬─────────────────────────────────────┬────────────────────────────────────┐
│ # │ Feature                             │ Complexity
├───┼─────────────────────────────────────┼────────────────────────────────────┤
│ 1 │ Paste encryption (server-side)      │ Medium
│ 2 │ Custom expiry per paste             │ Low
│ 3 │ Paste size in response headers      │ Low
│ 4 │ Burn-after-read option              │ Low
│ 5 │ Paste password protection           │ Medium
└───┴─────────────────────────────────────┴────────────────────────────────────┘
```
### Burn-After-Read
Single-access pastes that are deleted after first retrieval:
- `POST /` with `X-Burn-After-Read: true` header
- Paste deleted after first `GET /<id>/raw`
- Metadata `GET /<id>` does not trigger burn
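The read-then-delete behavior described above could look roughly like this with `sqlite3`. The `pastes` schema, the `burn` column, and both function names are hypothetical. The sketch also ignores the race between two simultaneous readers, which a real implementation would need to close (for example with `DELETE ... RETURNING` on SQLite 3.35 or newer).

```python
import sqlite3
from typing import Optional


def init_db(conn: sqlite3.Connection) -> None:
    """Create a hypothetical pastes table with a burn-after-read flag."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pastes ("
        "id TEXT PRIMARY KEY, content BLOB NOT NULL, burn INTEGER NOT NULL DEFAULT 0)"
    )


def read_raw(conn: sqlite3.Connection, paste_id: str) -> Optional[bytes]:
    """Return paste content; delete the paste if it is flagged burn-after-read."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT content, burn FROM pastes WHERE id = ?", (paste_id,)
        ).fetchone()
        if row is None:
            return None
        content, burn = row
        if burn:
            conn.execute("DELETE FROM pastes WHERE id = ?", (paste_id,))
        return content


def read_metadata(conn: sqlite3.Connection, paste_id: str) -> Optional[dict]:
    """Return metadata only; this path never triggers the burn."""
    row = conn.execute(
        "SELECT length(content), burn FROM pastes WHERE id = ?", (paste_id,)
    ).fetchone()
    if row is None:
        return None
    return {"id": paste_id, "size": row[0], "burn_after_read": bool(row[1])}
```

Keeping the delete inside the raw-content path, and out of the metadata path, mirrors the rule that only `GET /<id>/raw` consumes the paste.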
### Custom Expiry
Allow per-paste expiry override:
- `POST /` with `X-Expiry: 3600` header (seconds)
- Capped at server maximum (e.g., 30 days)
- Default expiry unchanged for pastes without the header
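The header handling could be sketched as follows. The default of 5 days is purely an assumed placeholder, the 30-day cap is the example value given above, and `resolve_expiry` is a hypothetical helper. How invalid values are reported (here a `ValueError`, which a route would presumably map to a 400 response) is likewise an assumption.

```python
from typing import Optional

DEFAULT_EXPIRY_SECONDS = 5 * 24 * 3600  # assumed default, not taken from the project
MAX_EXPIRY_SECONDS = 30 * 24 * 3600     # the "30 days" example cap


def resolve_expiry(header_value: Optional[str]) -> int:
    """Turn an optional X-Expiry header value into an expiry in seconds."""
    if header_value is None:
        return DEFAULT_EXPIRY_SECONDS  # no header: default behavior is unchanged
    try:
        requested = int(header_value.strip())
    except ValueError:
        raise ValueError("X-Expiry must be an integer number of seconds") from None
    if requested <= 0:
        raise ValueError("X-Expiry must be positive")
    # Requests above the server maximum are capped rather than rejected.
    return min(requested, MAX_EXPIRY_SECONDS)
```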
## Phase 4: Ecosystem
Focus: Integration with external systems.
```
┌───┬─────────────────────────────────────┬────────────────────────────────────┐
│ # │ Integration                         │ Purpose
├───┼─────────────────────────────────────┼────────────────────────────────────┤
│ 1 │ CLI client (fpaste)                 │ User convenience
│ 2 │ Neovim/Vim plugin                   │ Editor integration
│ 3 │ Shell aliases/functions             │ Workflow integration
│ 4 │ Webhook notifications               │ Automation triggers
└───┴─────────────────────────────────────┴────────────────────────────────────┘
```
### CLI Client
Standalone Python CLI:
- `fpaste < file.txt` - Create paste from stdin
- `fpaste file.txt` - Create paste from file
- `fpaste -g <id>` - Get paste
- `fpaste -d <id>` - Delete paste
- Config file for server URL and cert path
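A possible shape for the argument handling, using `argparse`. The long option names `--get` and `--delete`, the helper `plan_request`, and the `DELETE /<id>` route are assumptions made for the sketch; only `POST /` and `GET /<id>/raw` appear elsewhere in this roadmap. Networking and config-file loading are omitted.

```python
import argparse
from typing import Tuple


def build_parser() -> argparse.ArgumentParser:
    """Build a parser covering the four invocations listed above."""
    parser = argparse.ArgumentParser(prog="fpaste", description="FlaskPaste CLI sketch")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-g", "--get", metavar="ID", help="fetch a paste by ID")
    group.add_argument("-d", "--delete", metavar="ID", help="delete a paste by ID")
    parser.add_argument(
        "file", nargs="?", help="file to upload; stdin is read when omitted"
    )
    return parser


def plan_request(args: argparse.Namespace) -> Tuple[str, str]:
    """Map parsed arguments to an (HTTP method, path) pair."""
    if args.get:
        return ("GET", f"/{args.get}/raw")
    if args.delete:
        return ("DELETE", f"/{args.delete}")  # assumed route
    return ("POST", "/")
```

Separating argument parsing from the request plan keeps the mapping testable without a running server.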
## Non-Goals (Explicit)
These features will not be implemented:
- **Web UI** - Out of scope; use API directly
- **User accounts** - PKI handles identity
- **Syntax highlighting** - Client responsibility
- **Search/discovery** - Pastes are private by design
- **Clustering** - Scale via container orchestration
- **S3/PostgreSQL backend** - SQLite is sufficient
## Decision Log
| Date | Decision | Rationale
|------------|------------------------------------|-----------------------------------------
| 2024-11 | SQLite only | Simplicity; no external dependencies
| 2024-11 | No web UI | API-first; reduces attack surface
| 2024-11 | Client cert auth | Integrates with existing PKI
| 2024-12 | Content-hash dedup | Prevent spam without IP tracking
## Review Schedule
- **Monthly**: Review TODO.md, refine TASKLIST.md
- **Quarterly**: Evaluate roadmap phases, adjust priorities
- **Yearly**: Major version planning, scope review

61
TASKLIST.md Normal file

@@ -0,0 +1,61 @@
# Task List
Prioritized, actionable tasks. Each task is small and completable in one session.
---
## Priority 1: Pending Commit
| Status | Task
|--------|--------------------------------------------------------------
| ☐ | Commit abuse prevention feature (dedup changes in routes.py, config.py, database.py)
| ☐ | Commit documentation updates (api.md, README.md)
## Priority 2: Test Coverage
| Status | Task
|--------|--------------------------------------------------------------
| ☐ | Run test suite, verify all tests pass
| ☐ | Add test for dedup window expiry behavior
| ☐ | Add test for concurrent identical submissions
| ☐ | Add test for MIME detection edge cases (empty content, truncated headers)
| ☐ | Measure and document test coverage percentage
## Priority 3: Documentation
| Status | Task
|--------|--------------------------------------------------------------
| ☐ | Add deployment examples to documentation/deployment.md
| ☐ | Document environment variables in one canonical location
| ☐ | Add troubleshooting section to README.md
| ☐ | Create CONTRIBUTING.md with development setup
## Priority 4: Operations
| Status | Task
|--------|--------------------------------------------------------------
| ☐ | Add SQLite WAL mode for better concurrency
| ☐ | Add /metrics endpoint skeleton (for future Prometheus support)
| ☐ | Add structured logging option via environment variable
| ☐ | Optimize container build with multi-stage Containerfile
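For the WAL task, the change itself is likely small. A minimal sketch, assuming a helper named `enable_wal` (hypothetical) that is called once when the database connection is opened:

```python
import sqlite3


def enable_wal(conn: sqlite3.Connection) -> bool:
    """Switch the database to write-ahead logging.

    Returns True when SQLite reports WAL as the active journal mode.
    In-memory databases cannot use WAL, so this returns False for them.
    """
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return mode.lower() == "wal"
```

WAL mode is stored in the database file, so it persists across connections; checking the returned mode guards against environments where the switch silently does not take effect.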
## Completed
| Date | Task
|------------|--------------------------------------------------------------
| 2024-12 | Implement content-hash deduplication
| 2024-12 | Add X-Proxy-Secret validation
| 2024-12 | Add X-Request-ID tracing
| 2024-11 | Implement security headers
| 2024-11 | Add client certificate authentication
| 2024-11 | Create test suite
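For context on the `X-Proxy-Secret` entry, a validation check of this kind is commonly written with a constant-time comparison. The sketch below is an assumption about the general technique, not the code that was committed. The function name `is_trusted_proxy` is hypothetical, `headers` is a plain dict here (real header mappings are usually case-insensitive), and rejecting all requests when no secret is configured is one possible policy.

```python
import hmac
from typing import Mapping


def is_trusted_proxy(headers: Mapping[str, str], expected_secret: str) -> bool:
    """Check the shared secret that the reverse proxy is expected to attach."""
    if not expected_secret:
        return False  # assumed policy: fail closed when no secret is configured
    supplied = headers.get("X-Proxy-Secret", "")
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(
        supplied.encode("utf-8"), expected_secret.encode("utf-8")
    )
```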
---
## Task Guidelines
- Tasks should be completable in < 2 hours
- Each task results in one atomic commit
- Mark ☑ when complete, move to Completed section
- Remove tasks that become irrelevant
- Pull new tasks from TODO.md as capacity allows

50
TODO.md Normal file

@@ -0,0 +1,50 @@
# TODO
Unstructured intake buffer for ideas, issues, and observations. Items here are raw and unrefined. Actionable items should be promoted to TASKLIST.md.
---
## Ideas
- Prometheus metrics endpoint (`/metrics`) for monitoring integration
- Structured JSON logging for log aggregation compatibility
- Burn-after-read paste option
- Custom expiry header for per-paste TTL
- CLI client tool (fpaste) for easier usage
- Rate limit headers in responses (X-RateLimit-*)
- Paste compression for large text content
- Optional paste encryption with user-provided key
- ETag support for conditional requests
- HEAD method support for metadata without body
- Paste listing for authenticated users (their own pastes only)
## Observations
- Current abuse prevention uses content-hash; IP-based limiting delegated to proxy
- SQLite WAL mode could improve concurrent read performance
- Container image size could be reduced with multi-stage build
- Test coverage could include more edge cases for MIME detection
## Questions
- Should expired paste cleanup run in-process or via external cron?
- Is SQLite sufficient for the anticipated load, or should we plan for PostgreSQL?
- Should burn-after-read pastes show in metadata before burn?
- Password-protected pastes: derive key from password or store hash?
## Debt
- Dedup feature changes pending commit
- Documentation could include more deployment examples
- No integration tests for container deployment
- Missing test for concurrent paste creation
## External Dependencies
- Consider adding `python-magic` for better MIME detection (currently a built-in magic-byte check only)
- Evaluate `structlog` for structured logging when implemented
- Look into `prometheus-flask-exporter` for metrics
---
*Review weekly. Promote actionable items to TASKLIST.md. Archive or delete stale items.*