ppf/.github/copilot-instructions.md
2025-12-20 16:46:44 +01:00

Copilot Instructions for Python 2 Projects

These instructions guide the development of Python 2 projects, ensuring compatibility, efficiency, and adherence to specific constraints. Follow these guidelines strictly. Note that Python 2.7 reached its official end-of-life on January 1, 2020, meaning no further official security patches or updates are provided by the Python Software Foundation. Projects must implement additional security measures, such as custom vulnerability assessments and third-party patching solutions if available, to mitigate risks.

Project Constraints

  1. Python 2 Exclusivity:

    • Code MUST be written exclusively in Python 2.7, the final version of Python 2.
    • The project CANNOT be migrated to Python 3 under any circumstances, even for security or feature enhancements.
    • Ensure full compatibility with Python 2.7 syntax and behaviors, including the print statement (without parentheses), xrange for iteration, raw_input for user input, and integer division (e.g., 5 / 2 == 2).
  2. Low-End Machine Optimization:

    • Assume operation on low-end hardware with minimal resources: limited RAM (e.g., 512 MB to 2 GB) and constrained CPU (e.g., single-core or low-frequency multi-core processors).
    • Prioritize memory and CPU efficiency in all code:
      • Use lazy sequences such as xrange instead of range to avoid allocating a full list for large loops (xrange is not a generator, but it yields values on demand).
      • Steer clear of large data structures, deep recursion, or excessive object instantiation that could lead to out-of-memory errors.
      • Favor iterative approaches over recursive ones to avoid stack overflow on limited hardware.
      • Opt for lightweight data handling, such as plain text files or simple CSV parsing, and avoid resource-intensive formats.
  3. Library Restrictions:

    • Strictly avoid any additional external libraries (e.g., no numpy, requests, pandas, or similar third-party packages).
    • Rely solely on Python 2.7's standard library modules (e.g., os, sys, threading, Queue, time, math, re, csv, logging).
    • For any required functionality, implement alternatives using standard library components or basic custom code.
    • Document all used standard library modules explicitly in code comments or project notes for transparency.
  4. Multi-Threading Efficiency:

    • Utilize the threading module for concurrency, keeping in mind Python 2.7's Global Interpreter Lock (GIL), which restricts true CPU parallelism.
    • Avoid time.sleep() calls entirely, as they introduce unnecessary delays and inefficiency on low-end systems; instead, use threading.Event for signaling, threading.Lock for synchronization, or threading.Semaphore for resource control.
    • Employ Queue.Queue for thread-safe task passing and workload distribution.
    • Limit the number of threads created to reduce overhead; aim for 2 to 4 threads maximum based on hardware assumptions.
    • Focus threads on I/O-bound operations (e.g., file reads/writes) rather than CPU-intensive tasks to maximize efficiency under the GIL.
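
As a rough illustration of the multi-threading constraints above, the following sketch runs a small, fixed set of workers fed by a Queue.Queue, with no time.sleep() calls. It targets Python 2.7; the names run_jobs, worker, NUM_WORKERS, and _STOP are hypothetical, and the import fallback exists only so the sketch can also be tried on other interpreters.

```python
import threading

try:
    import Queue as queue_mod  # module name in Python 2.7
except ImportError:  # fallback so the sketch also runs elsewhere
    import queue as queue_mod

NUM_WORKERS = 2  # assumption: stays within the 2 to 4 thread budget
_STOP = object()  # sentinel that tells a worker to exit


def worker(tasks, results, lock):
    """Run callables taken from `tasks` until the sentinel arrives."""
    while True:
        job = tasks.get()  # blocks until work arrives; no polling
        if job is _STOP:
            break
        value = job()
        with lock:  # protect the shared results list
            results.append(value)


def run_jobs(jobs, num_workers=NUM_WORKERS):
    """Run each callable in `jobs` on a fixed set of worker threads.

    Returns the list of results in completion order.
    """
    tasks = queue_mod.Queue()
    results = []
    lock = threading.Lock()
    threads = []
    for _ in range(num_workers):
        thread = threading.Thread(target=worker,
                                  args=(tasks, results, lock))
        thread.daemon = True  # do not block interpreter exit
        thread.start()
        threads.append(thread)
    for job in jobs:
        tasks.put(job)
    for _ in threads:
        tasks.put(_STOP)  # one sentinel per worker
    for thread in threads:
        thread.join(10.0)  # bounded wait instead of hanging
    return results
```

In practice each job would wrap an I/O-bound operation such as reading a file; CPU-heavy jobs gain little here because of the GIL.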

Best Practices

  1. Code Style and Readability:

    • Adhere to PEP 8 style guidelines adapted for Python 2, including 79-character line limits, consistent indentation (4 spaces), and descriptive variable/function names.
    • Provide comprehensive docstrings for functions, classes, and modules, detailing purpose, parameters, return values, and any threading or memory considerations.
    • Add inline comments for intricate sections, particularly those involving synchronization primitives or optimization techniques.
    • Use clear, semantic naming conventions (e.g., process_user_input instead of abbreviated or vague names).
  2. Error Handling and Security:

    • Implement robust try/except blocks for all potential failure points, such as I/O operations, threading issues, or user inputs; specify exception types (e.g., except IOError:) and avoid bare except: clauses.
    • Leverage the logging module from the standard library for error reporting and debugging, rather than print statements, so that log levels are configurable and messages at disabled levels cost very little at runtime.
    • Given Python 2.7's EOL status, incorporate manual security checks: validate all inputs to prevent injection attacks, use hashlib for basic hashing where needed, and regularly audit code for known CVEs in standard library components.
  3. Performance Considerations:

    • Use built-in profiling tools like cProfile from the standard library to identify and optimize bottlenecks, focusing on memory leaks or high-CPU loops.
    • When building strings in loops, collect the pieces and combine them with ''.join(), or write to a cStringIO.StringIO buffer, rather than concatenating repeatedly, which can degrade to quadratic time.
    • Process large datasets in streams or chunks using iterators to keep memory usage low; avoid loading entire files into memory.
    • Minimize object copying: pass references, modify in place where it is safe to do so, and build lists or dictionaries only when the data genuinely needs to be held in memory.
  4. Threading Guidelines:

    • Assign threads to discrete, bounded tasks (e.g., one for monitoring input, another for background processing) to simplify debugging and reduce contention.
    • Rely on threading.Event for inter-thread communication and state changes (e.g., to pause/resume or terminate threads) instead of busy-waiting or polling.
    • Always bound thread termination by calling threading.Thread.join() with a timeout, then check is_alive() to detect a thread that did not stop, to prevent hangs on resource-constrained systems.
    • Protect shared resources with threading.Lock to avoid race conditions; test for deadlocks in multi-threaded scenarios.
    • Consider a simple thread pool pattern using a fixed set of worker threads to manage load without excessive creation/destruction.
  5. File and Resource Management:

    • Use context managers (with statements) for files and other resources to guarantee automatic cleanup and prevent leaks.
    • Explicitly close non-context-managed resources (e.g., sockets via socket.close()) immediately after use.
    • Limit concurrent open files or connections to avoid exhausting system limits on low-end hardware.
    • For persistence, prefer simple formats like text files; use pickled objects (via the pickle module) only for data the project itself wrote, because unpickling untrusted input can execute arbitrary code.
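
The error-handling, streaming, and resource-management points above can be sketched together in one small function. This is a minimal example under stated assumptions, not a prescribed implementation: file_sha256 and CHUNK_SIZE are hypothetical names, and the 64 KiB chunk size is an arbitrary choice for low-memory machines.

```python
import hashlib
import logging

logger = logging.getLogger(__name__)

CHUNK_SIZE = 64 * 1024  # assumption: 64 KiB keeps memory use small


def file_sha256(path, chunk_size=CHUNK_SIZE):
    """Return the SHA-256 hex digest of a file, or None on failure.

    The file is read in fixed-size chunks, so memory use does not
    grow with file size.
    """
    digest = hashlib.sha256()
    try:
        # The with statement closes the file even if reading fails.
        with open(path, "rb") as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:  # empty string means end of file
                    break
                digest.update(chunk)
    except IOError as exc:  # specific exception, not a bare except
        logger.error("could not read %s: %s", path, exc)
        return None
    return digest.hexdigest()
```

Returning None on failure is one possible convention; re-raising after logging would suit callers that need to distinguish error causes.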

Additional Notes

  • Testing: Validate code in a Python 2.7 environment emulating low-end hardware, such as a virtual machine with capped RAM and CPU cores. Include unit tests using the unittest standard library module, with emphasis on threading scenarios and memory usage.
  • Documentation: Create a comprehensive README.md file outlining setup (e.g., Python 2.7 installation), hardware assumptions, used standard library modules, and security considerations due to EOL.
  • Debugging: Employ the pdb module for interactive debugging sessions; avoid any external debuggers or IDEs that might introduce dependencies.
  • Portability and Security: Ensure cross-platform compatibility (Linux, Windows, macOS) without external tools. For security, integrate runtime checks using standard modules like sys and os to monitor resource usage, and recommend periodic scans for vulnerabilities using community tools (without direct integration).
  • Maintenance: Since Python 2.7 receives no official support post-2020, plan for custom maintenance: track community patches if available, and isolate the project environment (e.g., a dedicated interpreter installation, virtual machine, or container; note that virtualenv is a third-party tool and Python 2.7's standard library has no venv equivalent) to contain risks.
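
For the testing note above, a threading-focused unit test might look like the following sketch. Counter and CounterThreadingTest are hypothetical stand-ins for project code, and run_tests is an illustrative helper that runs the case programmatically.

```python
import threading
import unittest


class Counter(object):
    """Hypothetical shared counter guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self):
        with self._lock:
            self.value += 1


class CounterThreadingTest(unittest.TestCase):

    def test_concurrent_increments(self):
        counter = Counter()

        def bump():
            for _ in range(1000):
                counter.increment()

        threads = [threading.Thread(target=bump) for _ in range(4)]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join(10.0)  # bounded wait
            self.assertFalse(thread.is_alive())
        # No increments are lost when the lock is used correctly.
        self.assertEqual(counter.value, 4000)


def run_tests():
    """Run the test case and return the unittest result object."""
    loader = unittest.TestLoader()
    suite = loader.loadTestsFromTestCase(CounterThreadingTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_tests() returns a result whose wasSuccessful() method reports the outcome; running the same case inside a RAM-capped virtual machine exercises it under the hardware assumptions.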

Suggested Improvements

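  1. Memory Monitoring: Integrate checks with the resource module (Unix-like systems only) to log and alert on high process memory usage during runtime. sys.getsizeof() reports only the shallow size of a single object, so use it for spot checks on individual structures rather than for total usage.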
  2. Thread Pooling: Implement a reusable pool of a small number of threads (e.g., via a manager class) to handle variable workloads efficiently without repeated thread spawning.
  3. Graceful Shutdown: Use the signal module to catch interrupts (e.g., SIGINT for Ctrl+C) in the main thread, which is where Python runs signal handlers, and propagate them to worker threads via events, ensuring clean resource release.
  4. Modularity and Reusability: Break code into small, independent modules to facilitate testing, reduce load times, and allow selective optimization for specific hardware constraints.
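  5. Security Enhancements: Add input sanitization routines using re for pattern matching, and use hashlib and hmac for hashing and integrity checks, to address unpatched vulnerabilities. Note that hashlib provides hashing only, not encryption; the standard library has no ciphers, so any encryption/decryption would have to be implemented manually if needed.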
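
The graceful-shutdown suggestion can be sketched as follows: a signal handler sets a threading.Event, and the worker waits on that event between iterations instead of sleeping. The names make_shutdown_handler, background_loop, and main are hypothetical, and the 0.5 second interval is an arbitrary assumption.

```python
import signal
import threading


def make_shutdown_handler(stop_event):
    """Return a signal handler that sets `stop_event` when invoked."""
    def handler(signum, frame):
        stop_event.set()
    return handler


def background_loop(stop_event, tick, interval=0.5):
    """Call `tick()` repeatedly until `stop_event` is set."""
    while not stop_event.is_set():
        tick()
        # Waits up to `interval` seconds but returns early on set().
        stop_event.wait(interval)


def main():
    """Illustrative wiring; call this from the main thread only."""
    stop_event = threading.Event()
    handler = make_shutdown_handler(stop_event)
    signal.signal(signal.SIGINT, handler)  # Ctrl+C sets the event
    worker = threading.Thread(target=background_loop,
                              args=(stop_event, lambda: None))
    worker.start()
    while worker.is_alive():
        # A short timeout lets the main thread run the handler.
        worker.join(1.0)
```

The join timeout in main matters on Python 2.7, where an untimed join() in the main thread can delay delivery of the interrupt.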

By strictly following these guidelines, Python 2 projects will maintain efficiency, readability, and functionality within the defined constraints, while acknowledging the inherent risks of using an unsupported language version.