From 4cb29fa3d260fcc62d120cb41dfe774af473164b Mon Sep 17 00:00:00 2001
From: Username
Date: Sat, 20 Dec 2025 03:31:37 +0100
Subject: [PATCH] add project structure files

---
 PROJECT.md  | 104 +++++++++++++++++++++++++++++++++++++
 ROADMAP.md  | 145 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 TASKLIST.md |  61 ++++++++++++++++++++++
 TODO.md     |  50 ++++++++++++++++++
 4 files changed, 360 insertions(+)
 create mode 100644 PROJECT.md
 create mode 100644 ROADMAP.md
 create mode 100644 TASKLIST.md
 create mode 100644 TODO.md

diff --git a/PROJECT.md b/PROJECT.md
new file mode 100644
index 0000000..a729a44
--- /dev/null
+++ b/PROJECT.md
@@ -0,0 +1,104 @@
+# FlaskPaste
+
+## Purpose
+
+FlaskPaste is a lightweight, security-hardened pastebin REST API for self-hosted deployments. It provides a minimal, dependency-light alternative to public pastebin services, designed for environments where data privacy, authentication control, and operational simplicity are priorities.
+
+## Problem Statement
+
+Public pastebin services present risks:
+- Data sovereignty concerns (content stored on third-party infrastructure)
+- Limited authentication options
+- No control over retention policies
+- Abuse/spam from other users affecting service reliability
+- Feature bloat and complex UIs when only an API is needed
+
+## Solution
+
+A self-hosted pastebin API that:
+- Stores pastes locally in SQLite
+- Supports client certificate authentication via reverse proxy
+- Automatically expires content based on access patterns
+- Prevents abuse through content-hash deduplication
+- Serves text and binary content with proper MIME detection
+- Runs behind any reverse proxy (nginx, HAProxy, Caddy)
+
+## Success Criteria
+
+```
+┌────────────────────────────────┬────────────────────────────────────────────┐
+│ Criterion                      │ Metric
+├────────────────────────────────┼────────────────────────────────────────────┤
+│ Security                       │ Zero injection vulnerabilities
+│                                │ All OWASP headers implemented
+│                                │ Input validation on all endpoints
+├────────────────────────────────┼────────────────────────────────────────────┤
+│ Reliability                    │ SQLite ACID guarantees
+│                                │ Graceful degradation on errors
+│                                │ Health check endpoint for monitoring
+├────────────────────────────────┼────────────────────────────────────────────┤
+│ Simplicity                     │ Single dependency (Flask)
+│                                │ SQLite for storage (no external DB)
+│                                │ Environment-based configuration
+├────────────────────────────────┼────────────────────────────────────────────┤
+│ Operability                    │ Container-ready (Podman/Docker)
+│                                │ Gunicorn-compatible WSGI
+│                                │ Request tracing via X-Request-ID
+└────────────────────────────────┴────────────────────────────────────────────┘
+```
+
+## Scope
+
+### In Scope
+
+- REST API for paste CRUD operations
+- Text and binary content support
+- Magic-byte MIME type detection
+- Client certificate authentication (via proxy header)
+- Configurable size limits (anon vs authenticated)
+- Time-based expiry with access-touch semantics
+- Content-hash deduplication for abuse prevention
+- Security headers (HSTS, CSP, X-Frame-Options, etc.)
+- Request tracing and structured logging
+- Container deployment support
+- SQLite storage
+
+### Out of Scope
+
+- Web UI / HTML frontend
+- User registration / account management
+- Syntax highlighting
+- Paste forking / versioning
+- Public paste listing / discovery
+- Rate limiting per IP (delegated to reverse proxy)
+- Multi-node clustering / distributed storage
+- Alternative storage backends (S3, PostgreSQL)
+
+## Constraints
+
+- **Single process** - SQLite limits concurrency; scale via multiple containers
+- **Reverse proxy required** - Client cert auth requires TLS termination
+- **No web UI** - API-only; CLI tools (curl, httpie) are the interface
+- **Ephemeral by design** - Pastes expire; not for permanent storage
+
+## Assumptions
+
+- Deployment behind a TLS-terminating reverse proxy
+- Client certificates managed externally (PKI, mTLS)
+- Operators have container runtime (Podman/Docker) or Python venv
+- SQLite performance sufficient for expected load
+
+## Technical Stack
+
+```
+┌─────────────────┬──────────────────────────────────────────────────────────┐
+│ Component       │ Technology
+├─────────────────┼──────────────────────────────────────────────────────────┤
+│ Framework       │ Flask 3.x
+│ Database        │ SQLite 3 (built-in)
+│ WSGI Server     │ Gunicorn (production)
+│ Container       │ Podman / Docker
+│ Testing         │ pytest, pytest-cov
+│ Python          │ 3.11+
+└─────────────────┴──────────────────────────────────────────────────────────┘
+```
diff --git a/ROADMAP.md b/ROADMAP.md
new file mode 100644
index 0000000..744aa84
--- /dev/null
+++ b/ROADMAP.md
@@ -0,0 +1,145 @@
+# FlaskPaste Roadmap
+
+## Current State
+
+FlaskPaste v1.0 is feature-complete for its core mission: a secure, minimal pastebin API.
+
+**Implemented:**
+- Full REST API (CRUD operations)
+- Binary content support with magic-byte MIME detection
+- Client certificate authentication
+- Content-hash deduplication (abuse prevention)
+- Automatic paste expiry
+- Security headers and request tracing
+- Container deployment support
+- Comprehensive test suite
+
+## Phase 1: Hardening (Current)
+
+Focus: Production readiness and operational excellence.
+
+```
+┌───┬─────────────────────────────────────┬────────────────────────────────────┐
+│ # │ Milestone                           │ Status
+├───┼─────────────────────────────────────┼────────────────────────────────────┤
+│ 1 │ Abuse prevention (dedup)            │ Implemented (pending commit)
+│ 2 │ Security headers complete           │ Done
+│ 3 │ Request tracing (X-Request-ID)      │ Done
+│ 4 │ Proxy trust validation              │ Done
+│ 5 │ Test coverage > 90%                 │ In progress
+│ 6 │ Documentation complete              │ In progress
+└───┴─────────────────────────────────────┴────────────────────────────────────┘
+```
+
+## Phase 2: Operations
+
+Focus: Deployment, monitoring, and maintenance tooling.
+
+```
+┌───┬─────────────────────────────────────┬────────────────────────────────────┐
+│ # │ Milestone                           │ Dependencies
+├───┼─────────────────────────────────────┼────────────────────────────────────┤
+│ 1 │ Prometheus metrics endpoint         │ None
+│ 2 │ Structured JSON logging             │ None
+│ 3 │ Admin API (stats, cleanup)          │ Auth improvements
+│ 4 │ Ansible deployment role             │ None
+│ 5 │ CI/CD pipeline                      │ Container registry access
+└───┴─────────────────────────────────────┴────────────────────────────────────┘
+```
+
+### Prometheus Metrics
+
+Expose `/metrics` endpoint with:
+- `flaskpaste_pastes_total` (counter)
+- `flaskpaste_pastes_created` (counter)
+- `flaskpaste_pastes_deleted` (counter)
+- `flaskpaste_pastes_expired` (counter)
+- `flaskpaste_storage_bytes` (gauge)
+- `flaskpaste_request_duration_seconds` (histogram)
+
+### Structured Logging
+
+Replace text logs with JSON format:
+- Timestamp, level, message, request_id
+- Consistent field names across all log entries
+- Compatible with log aggregation (Loki, ELK)
+
+## Phase 3: Features
+
+Focus: User-requested enhancements within scope.
+
+```
+┌───┬─────────────────────────────────────┬────────────────────────────────────┐
+│ # │ Feature                             │ Complexity
+├───┼─────────────────────────────────────┼────────────────────────────────────┤
+│ 1 │ Paste encryption (server-side)      │ Medium
+│ 2 │ Custom expiry per paste             │ Low
+│ 3 │ Paste size in response headers      │ Low
+│ 4 │ Burn-after-read option              │ Low
+│ 5 │ Paste password protection           │ Medium
+└───┴─────────────────────────────────────┴────────────────────────────────────┘
+```
+
+### Burn-After-Read
+
+Single-access pastes that delete after first retrieval:
+- `POST /` with `X-Burn-After-Read: true` header
+- Paste deleted after first `GET //raw`
+- Metadata `GET /` does not trigger burn
+
+### Custom Expiry
+
+Allow per-paste expiry override:
+- `POST /` with `X-Expiry: 3600` header (seconds)
+- Capped at server maximum (e.g., 30 days)
+- Default unchanged for pastes without header
+
+## Phase 4: Ecosystem
+
+Focus: Integration with external systems.
+
+```
+┌───┬─────────────────────────────────────┬────────────────────────────────────┐
+│ # │ Integration                         │ Purpose
+├───┼─────────────────────────────────────┼────────────────────────────────────┤
+│ 1 │ CLI client (fpaste)                 │ User convenience
+│ 2 │ Neovim/Vim plugin                   │ Editor integration
+│ 3 │ Shell aliases/functions             │ Workflow integration
+│ 4 │ Webhook notifications               │ Automation triggers
+└───┴─────────────────────────────────────┴────────────────────────────────────┘
+```
+
+### CLI Client
+
+Standalone Python CLI:
+- `fpaste < file.txt` - Create paste from stdin
+- `fpaste file.txt` - Create paste from file
+- `fpaste -g ` - Get paste
+- `fpaste -d ` - Delete paste
+- Config file for server URL and cert path
+
+## Non-Goals (Explicit)
+
+These features will not be implemented:
+
+- **Web UI** - Out of scope; use API directly
+- **User accounts** - PKI handles identity
+- **Syntax highlighting** - Client responsibility
+- **Search/discovery** - Pastes are private by design
+- **Clustering** - Scale via container orchestration
+- **S3/PostgreSQL backend** - SQLite is sufficient
+
+## Decision Log
+
+| Date       | Decision                           | Rationale
+|------------|------------------------------------|-----------------------------------------
+| 2024-11    | SQLite only                        | Simplicity; no external dependencies
+| 2024-11    | No web UI                          | API-first; reduces attack surface
+| 2024-11    | Client cert auth                   | Integrates with existing PKI
+| 2024-12    | Content-hash dedup                 | Prevent spam without IP tracking
+
+## Review Schedule
+
+- **Monthly**: Review TODO.md, refine TASKLIST.md
+- **Quarterly**: Evaluate roadmap phases, adjust priorities
+- **Yearly**: Major version planning, scope review
diff --git a/TASKLIST.md b/TASKLIST.md
new file mode 100644
index 0000000..6b6346c
--- /dev/null
+++ b/TASKLIST.md
@@ -0,0 +1,61 @@
+# Task List
+
+Prioritized, actionable tasks. Each task is small and completable in one session.
+
+---
+
+## Priority 1: Pending Commit
+
+| Status | Task
+|--------|--------------------------------------------------------------
+| ☐      | Commit abuse prevention feature (dedup changes in routes.py, config.py, database.py)
+| ☐      | Commit documentation updates (api.md, README.md)
+
+## Priority 2: Test Coverage
+
+| Status | Task
+|--------|--------------------------------------------------------------
+| ☐      | Run test suite, verify all tests pass
+| ☐      | Add test for dedup window expiry behavior
+| ☐      | Add test for concurrent identical submissions
+| ☐      | Add test for MIME detection edge cases (empty content, truncated headers)
+| ☐      | Measure and document test coverage percentage
+
+## Priority 3: Documentation
+
+| Status | Task
+|--------|--------------------------------------------------------------
+| ☐      | Add deployment examples to documentation/deployment.md
+| ☐      | Document environment variables in one canonical location
+| ☐      | Add troubleshooting section to README.md
+| ☐      | Create CONTRIBUTING.md with development setup
+
+## Priority 4: Operations
+
+| Status | Task
+|--------|--------------------------------------------------------------
+| ☐      | Add SQLite WAL mode for better concurrency
+| ☐      | Add /metrics endpoint skeleton (for future Prometheus)
+| ☐      | Add structured logging option via environment variable
+| ☐      | Optimize container build with multi-stage Containerfile
+
+## Completed
+
+| Date       | Task
+|------------|--------------------------------------------------------------
+| 2024-12    | Implement content-hash deduplication
+| 2024-12    | Add X-Proxy-Secret validation
+| 2024-12    | Add X-Request-ID tracing
+| 2024-11    | Implement security headers
+| 2024-11    | Add client certificate authentication
+| 2024-11    | Create test suite
+
+---
+
+## Task Guidelines
+
+- Tasks should be completable in < 2 hours
+- Each task results in one atomic commit
+- Mark ☑ when complete, move to Completed section
+- Remove tasks that become irrelevant
+- Pull new tasks from TODO.md as capacity allows
diff --git a/TODO.md b/TODO.md
new file mode 100644
index 0000000..28ca724
--- /dev/null
+++ b/TODO.md
@@ -0,0 +1,50 @@
+# TODO
+
+Unstructured intake buffer for ideas, issues, and observations. Items here are raw and unrefined. Actionable items should be promoted to TASKLIST.md.
+
+---
+
+## Ideas
+
+- Prometheus metrics endpoint (`/metrics`) for monitoring integration
+- Structured JSON logging for log aggregation compatibility
+- Burn-after-read paste option
+- Custom expiry header for per-paste TTL
+- CLI client tool (fpaste) for easier usage
+- Rate limit headers in responses (X-RateLimit-*)
+- Paste compression for large text content
+- Optional paste encryption with user-provided key
+- ETag support for conditional requests
+- HEAD method support for metadata without body
+- Paste listing for authenticated users (their own pastes only)
+
+## Observations
+
+- Current abuse prevention uses content-hash; IP-based limiting delegated to proxy
+- SQLite WAL mode could improve concurrent read performance
+- Container image size could be reduced with multi-stage build
+- Test coverage could include more edge cases for MIME detection
+
+## Questions
+
+- Should expired paste cleanup run in-process or via external cron?
+- Is SQLite sufficient for anticipated load, or plan for PostgreSQL?
+- Should burn-after-read pastes show in metadata before burn?
+- Password-protected pastes: derive key from password or store hash?
+
+## Debt
+
+- Dedup feature changes pending commit
+- Documentation could include more deployment examples
+- No integration tests for container deployment
+- Missing test for concurrent paste creation
+
+## External Dependencies
+
+- Consider adding `python-magic` for better MIME detection (currently magic bytes only)
+- Evaluate `structlog` for structured logging when implemented
+- Look into `prometheus-flask-exporter` for metrics
+
+---
+
+*Review weekly. Promote actionable items to TASKLIST.md. Archive or delete stale items.*
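
The WAL observation in TODO.md (and the matching Priority 4 task in TASKLIST.md) could be approached roughly as in the sketch below. It is not part of this patch and does not reflect FlaskPaste's actual database.py: the helper names `open_wal_connection` and `current_journal_mode` are hypothetical, and only the standard-library `sqlite3` module is assumed.

```python
import sqlite3


def current_journal_mode(conn: sqlite3.Connection) -> str:
    """Return the journal mode currently in effect, lower-cased."""
    return conn.execute("PRAGMA journal_mode").fetchone()[0].lower()


def open_wal_connection(db_path: str) -> sqlite3.Connection:
    """Open a SQLite connection and request write-ahead logging (WAL).

    Hypothetical helper for illustration only.
    """
    conn = sqlite3.connect(db_path)
    # The PRAGMA reports the mode that actually took effect. WAL needs a
    # file-backed database, so ":memory:" reports "memory" instead of "wal".
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0].lower()
    if mode != "wal":
        conn.close()
        raise RuntimeError(f"WAL not enabled; journal mode is {mode!r}")
    return conn
```

Checking the mode returned by the PRAGMA, rather than assuming the request succeeded, keeps a misconfigured deployment (for example, an in-memory database) from silently running without WAL.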