Files
tuimble/reports/code-review-2026-02-24.md
2026-02-24 16:23:11 +01:00

739 lines
22 KiB
Markdown

# tuimble Code Review
**Date:** 2026-02-24
**Scope:** Full codebase review (1,750 LOC across 7 modules)
**Constraints:** No theme/layout redesign. Targeted, incremental improvements only.
---
## 1. Architecture & Design
### [CRITICAL] `app.py` is a 999-line monolith
`TuimbleApp` directly owns connection lifecycle, audio management, reconnection
logic, config reload state machine, input history, PTT wiring, and all UI
updates. This makes it hard to test, hard to reason about, and hard to extend.
**Specific extractions worth making:**
#### a) Reconnection manager
The reconnect loop, backoff logic, attempt counting, and cancellation form a
self-contained state machine currently scattered across `_reconnect_loop`,
`_log_reconnect`, `_on_reconnect_success`, `_on_reconnect_exhausted`,
`_cancel_reconnect`, plus three boolean flags (`_reconnecting`,
`_reconnect_attempt`, `_intentional_disconnect`).
```python
# src/tuimble/reconnect.py
from __future__ import annotations
import time
import logging
from dataclasses import dataclass
from typing import Callable
log = logging.getLogger(__name__)
INITIAL_DELAY = 2
MAX_DELAY = 30
MAX_RETRIES = 10
@dataclass
class ReconnectState:
active: bool = False
attempt: int = 0
intentional: bool = False
class ReconnectManager:
"""Exponential backoff reconnection with cancellation."""
def __init__(
self,
connect_fn: Callable[[], None],
on_attempt: Callable[[int, float], None],
on_success: Callable[[], None],
on_failure: Callable[[int, str], None],
on_exhausted: Callable[[], None],
):
self._connect = connect_fn
self._on_attempt = on_attempt
self._on_success = on_success
self._on_failure = on_failure
self._on_exhausted = on_exhausted
self.state = ReconnectState()
def cancel(self) -> None:
self.state.active = False
self.state.attempt = 0
@property
def delay(self) -> float:
return min(
INITIAL_DELAY * (2 ** (self.state.attempt - 1)),
MAX_DELAY,
)
def run(self) -> None:
"""Blocking reconnect loop -- run in a worker thread."""
self.state.active = True
self.state.attempt = 0
while self.state.active:
self.state.attempt += 1
d = self.delay
self._on_attempt(self.state.attempt, d)
elapsed = 0.0
while elapsed < d and self.state.active:
time.sleep(0.5)
elapsed += 0.5
if not self.state.active:
break
try:
self._connect()
self.state.active = False
self.state.attempt = 0
self._on_success()
return
except Exception as exc:
self._on_failure(self.state.attempt, str(exc))
if self.state.attempt >= MAX_RETRIES:
self.state.active = False
self._on_exhausted()
return
```
Removes ~80 lines from `app.py` and makes the backoff logic unit-testable
without a TUI.
#### b) Input history
The history navigation in `on_key` (lines 928-953) is a classic "small feature
that grew legs." Extract to a reusable class:
```python
class InputHistory:
def __init__(self):
self._entries: list[str] = []
self._idx: int = -1
self._draft: str = ""
def push(self, text: str) -> None:
self._entries.append(text)
self._idx = -1
def up(self, current: str) -> str | None:
if not self._entries:
return None
if self._idx == -1:
self._draft = current
self._idx = len(self._entries) - 1
elif self._idx > 0:
self._idx -= 1
return self._entries[self._idx]
def down(self) -> str | None:
if self._idx == -1:
return None
if self._idx < len(self._entries) - 1:
self._idx += 1
return self._entries[self._idx]
self._idx = -1
return self._draft
```
---
### [IMPORTANT] Thread safety gaps in `_reconnect_loop`
`self._reconnecting` and `self._reconnect_attempt` are plain booleans/ints
accessed from both the main thread (via `_cancel_reconnect`,
`action_reload_config`) and the worker thread (`_reconnect_loop`). Python's GIL
makes this "mostly safe" for simple booleans, but it's fragile and not
guaranteed for compound operations.
`app.py:486-488`:
```python
while self._reconnecting: # read on worker thread
self._reconnect_attempt += 1 # read-modify-write on worker thread
```
Meanwhile `_cancel_reconnect` at line 572 writes both from the main thread.
Use `threading.Event` for the cancellation flag:
```python
self._reconnect_cancel = threading.Event()
# in _reconnect_loop:
while not self._reconnect_cancel.is_set():
...
# interruptible sleep:
if self._reconnect_cancel.wait(timeout=delay):
break
# in _cancel_reconnect:
self._reconnect_cancel.set()
```
This also eliminates the 0.5s sleep polling loop (lines 496-499), making
cancellation instant instead of up to 500ms delayed.
---
### [IMPORTANT] `MumbleClient` creates `User`/`Channel` objects on every property access
`client.py:112-141` -- The `users` and `channels` properties rebuild full
dictionaries from pymumble's internal state on every call.
`_refresh_channel_tree` in `app.py:688-699` calls both, then iterates all
users. During a debounced state update, this means three full traversals.
Consider caching with a dirty flag set by callbacks:
```python
@property
def users(self) -> dict[int, User]:
if self._users_dirty or self._users_cache is None:
self._users_cache = self._build_users()
self._users_dirty = False
return self._users_cache
```
---
### [NICE] Callback wiring pattern
`client.py:79-84` uses bare `None`-able attributes for callbacks:
```python
self.on_connected = None
self.on_disconnected = None
```
This works but lacks type safety and discoverability. Consider a typed protocol
or simple typed attributes:
```python
from typing import Protocol
class ClientCallbacks(Protocol):
def on_connected(self) -> None: ...
def on_disconnected(self) -> None: ...
def on_text_message(self, sender: str, text: str) -> None: ...
```
Not urgent -- current approach is pragmatic for the project size.
---
## 2. TUI/UX Best Practices
### [IMPORTANT] `_strip_html` recompiles regex on every message
`app.py:994-999`:
```python
def _strip_html(text: str) -> str:
import re
clean = re.sub(r"<[^>]+>", "", text)
return html.unescape(clean)
```
The `re` import inside the function is fine (cached by Python), but the pattern
is recompiled on every call. Compile once at module level:
```python
import re
_HTML_TAG_RE = re.compile(r"<[^>]+>")
def _strip_html(text: str) -> str:
return html.unescape(_HTML_TAG_RE.sub("", text))
```
---
### [IMPORTANT] PTT state logging is noisy
`app.py:969-973` -- Every PTT toggle writes to the chatlog. For toggle mode
this is acceptable, but if someone later implements hold-mode (evdev),
pressing/releasing space would flood the log with "transmitting"/"stopped
transmitting" on every keypress.
Add a config option or suppress log for hold-mode, or use the status bar alone:
```python
def _on_ptt_change(self, transmitting: bool) -> None:
self._audio.capturing = transmitting
status = self.query_one("#status", StatusBar)
status.ptt_active = transmitting
# Only log for toggle mode; hold-mode updates too frequently
if self._config.ptt.mode == "toggle":
chatlog = self.query_one("#chatlog", ChatLog)
if transmitting:
chatlog.write("[#e0af68] transmitting[/]")
else:
chatlog.write("[dim] stopped transmitting[/dim]")
```
---
### [NICE] `on_resize` sets sidebar width but doesn't refresh the channel tree
`app.py:977-981` -- After resize, the sidebar width changes but
`ChannelTree.render()` uses `self.content_size.width` which may not update
until the next render pass. The tree truncation logic depends on width. Call
`tree.refresh()` explicitly:
```python
def on_resize(self, event: events.Resize) -> None:
sidebar = self.query_one("#sidebar", ChannelTree)
sidebar.styles.width = max(16, min(32, event.size.width // 4))
sidebar.refresh()
self.query_one("#status", StatusBar).refresh()
```
---
## 3. Code Quality
### [CRITICAL] `load_config` passes raw TOML dicts to dataclass constructors without validation
`config.py:62-67`:
```python
if "server" in data:
cfg.server = ServerConfig(**data["server"])
```
If the TOML file contains an unexpected key (`ServerConfig(host="x",
typo_field="y")`), this raises an opaque `TypeError`. If a value has the wrong
type (`port = "abc"`), it silently accepts it and fails later.
Add defensive loading:
```python
def _load_section(cls, data: dict, section: str):
"""Load a dataclass section, ignoring unknown keys."""
raw = data.get(section, {})
import dataclasses
valid_keys = {f.name for f in dataclasses.fields(cls)}
filtered = {k: v for k, v in raw.items() if k in valid_keys}
return cls(**filtered)
# Usage:
cfg.server = _load_section(ServerConfig, data, "server")
```
This prevents crashes from typos in config files -- critical for a user-facing
tool.
---
### [IMPORTANT] Duplicate logic for creating `MumbleClient`
`MumbleClient` is instantiated in four places with the same pattern:
1. `__init__` (line 395)
2. `_reconnect_loop` (line 507)
3. `action_reload_config` disconnected path (line 864)
4. `_apply_restart_changes` (line 824)
Extract a factory method:
```python
def _make_client(self, srv: ServerConfig | None = None) -> MumbleClient:
srv = srv or self._config.server
return MumbleClient(
host=srv.host,
port=srv.port,
username=srv.username,
password=srv.password,
certfile=srv.certfile,
keyfile=srv.keyfile,
)
```
---
### [IMPORTANT] `ChannelTree._find_root` is called twice per render
`render()` at line 223 calls `_find_root()`, and `_build_channel_order()` at
line 173 also calls `_find_root()`. Both execute during `set_state()` then
`render()`. Cache the root ID:
```python
def set_state(self, channels, users_by_channel, my_channel_id=None):
self._channels = channels
self._users_by_channel = users_by_channel
self._my_channel_id = my_channel_id
self._root_id = self._find_root() if channels else 0
self._channel_ids = self._build_channel_order()
# ...
```
---
### [IMPORTANT] `_detect_config_changes` duplicates field-level comparison
`app.py:740-780` manually compares each field. If you add a new server field,
you must remember to add it to both `_detect_config_changes` AND
`_apply_restart_changes`. Use `dataclasses.asdict()` for comparison:
```python
from dataclasses import asdict
def _detect_config_changes(self, old: Config, new: Config):
safe, restart = [], []
for attr in ("key", "mode", "backend"):
if getattr(old.ptt, attr) != getattr(new.ptt, attr):
safe.append(
f"ptt.{attr}: {getattr(old.ptt, attr)} -> "
f"{getattr(new.ptt, attr)}"
)
for attr in ("input_gain", "output_gain"):
if getattr(old.audio, attr) != getattr(new.audio, attr):
safe.append(
f"audio.{attr}: {getattr(old.audio, attr)} -> "
f"{getattr(new.audio, attr)}"
)
if asdict(old.server) != asdict(new.server):
for k in asdict(old.server):
if getattr(old.server, k) != getattr(new.server, k):
label = "password" if k == "password" else k
restart.append(f"server.{label} changed")
for attr in ("input_device", "output_device", "sample_rate"):
if getattr(old.audio, attr) != getattr(new.audio, attr):
restart.append(f"audio.{attr} changed")
return safe, restart
```
---
### [NICE] Type annotations missing on several methods
- `client.py:86` `set_dispatcher(self, fn: Callable)` -- should be
`Callable | None`
- `client.py:93` `_dispatch(self, callback, *args)` -- no return type, no
`callback` type
- `client.py:154` `connect(self)` -- no return type annotation
- `audio.py:58` `start(self)` -- no return type
Add `-> None` where appropriate. Minor but compounds into unclear interfaces.
---
## 4. Performance
### [IMPORTANT] `_apply_gain` uses Python-level sample iteration
`audio.py:22-30` -- Unpacking, scaling, and repacking every sample via a list
comprehension in Python is slow for real-time audio (960 samples = 1920 bytes
per 20ms frame):
```python
samples = struct.unpack(fmt, pcm[: n * 2])
scaled = [max(-32768, min(32767, int(s * gain))) for s in samples]
return struct.pack(fmt, *scaled)
```
Use numpy (already a transitive dependency via sounddevice):
```python
import numpy as np
def _apply_gain(pcm: bytes, gain: float) -> bytes:
if not pcm:
return pcm
samples = np.frombuffer(pcm, dtype=np.int16)
scaled = np.clip(samples * gain, -32768, 32767).astype(np.int16)
return scaled.tobytes()
```
This is ~50-100x faster for a 960-sample frame. Since sounddevice already
pulls in numpy, there's no new dependency.
---
### [IMPORTANT] `_audio_send_loop` polls with `time.sleep(0.005)`
`app.py:676-684`:
```python
while self._client.connected:
frame = self._audio.get_capture_frame()
if frame is not None:
self._client.send_audio(frame)
else:
time.sleep(0.005)
```
5ms polling means up to 5ms of added latency on every frame, plus unnecessary
CPU wake-ups when idle. Use `queue.Queue.get(timeout=...)` instead:
```python
def get_capture_frame(self, timeout: float = 0.02) -> bytes | None:
try:
return self._capture_queue.get(timeout=timeout)
except queue.Empty:
return None
```
Then the send loop becomes:
```python
while self._client.connected:
frame = self._audio.get_capture_frame(timeout=0.02)
if frame is not None:
self._client.send_audio(frame)
```
No polling, no wasted cycles, deterministic wake on data arrival.
---
### [NICE] `ChannelTree.render()` rebuilds the entire tree string every refresh
For small channel lists (typical Mumble servers have 5-30 channels), this is
fine. But `_render_tree` is recursive and allocates many string fragments. If
performance becomes an issue, cache the rendered string and invalidate on
`set_state()`.
---
## 5. Scalability & Extensibility
### [IMPORTANT] No structured event bus
All communication between components flows through Textual's message system,
which is good. But the `_cb_*` callback wrappers in `app.py:462-475` are thin
boilerplate. Consider a mapping-driven approach:
```python
def _wire_client_callbacks(self) -> None:
self._client.set_dispatcher(self.call_from_thread)
self._client.on_connected = lambda: self.post_message(ServerConnected())
self._client.on_disconnected = lambda: self.post_message(ServerDisconnected())
self._client.on_text_message = (
lambda s, t: self.post_message(TextMessageReceived(s, t))
)
self._client.on_user_update = lambda: self.post_message(ServerStateChanged())
self._client.on_channel_update = (
lambda: self.post_message(ServerStateChanged())
)
self._client.on_sound_received = (
lambda _u, pcm: self._audio.queue_playback(pcm)
)
```
This eliminates six one-line methods. The audio callback bypasses the message
system correctly (it's hot-path, no UI update needed).
---
### [NICE] `Config` is mutable
Config dataclasses are mutable, and `_apply_safe_changes` directly mutates
`self._config.ptt`. Consider frozen dataclasses with a `replace()` pattern to
prevent accidental state corruption:
```python
@dataclass(frozen=True)
class PttConfig:
key: str = "f4"
mode: str = "toggle"
backend: str = "auto"
```
Then: `self._config = dataclasses.replace(self._config, ptt=new.ptt)`
---
## 6. Testing
### [CRITICAL] No tests for `TuimbleApp` or any widget
The test suite covers `audio.py`, `client.py`, `ptt.py`, and `config.py` --
all non-UI modules. There are zero tests for:
- `StatusBar` rendering (connection states, volume bars, responsive breakpoints)
- `ChannelTree` rendering (tree structure, truncation, focus navigation)
- `TuimbleApp` message handling (connect/disconnect/state change flows)
- Config reload state machine
- Input history navigation
Textual provides `App.run_test()` for async widget testing:
```python
import pytest
from textual.testing import AppTest
@pytest.mark.asyncio
async def test_status_bar_connected():
app = TuimbleApp()
async with app.run_test(size=(80, 24)) as pilot:
status = app.query_one("#status", StatusBar)
status.connected = True
await pilot.pause()
rendered = status.render()
assert "connected" in rendered
```
**Priority tests to add:**
1. `StatusBar.render()` at each width breakpoint (< 40, < 60, >= 60)
2. `ChannelTree` keyboard navigation (up/down/enter)
3. `_strip_html` with edge cases (nested tags, malformed HTML, entities)
4. `_next_volume` wraparound behavior
5. Config reload with unknown TOML keys (crash test)
6. `_detect_config_changes` safe vs restart classification
---
### [IMPORTANT] Tests access private attributes
`test_audio.py` directly accesses `ap._capture_queue`, `ap._playback_queue`,
`ap._sample_rate`. This couples tests to implementation. Where possible, test
through public interfaces:
```python
# Instead of: ap._capture_queue.put(b"\x01\x02\x03")
# Use the actual capture callback:
ap.capturing = True
ap._capture_callback(b"\x01\x02\x03", 1, None, None)
assert ap.get_capture_frame() == b"\x01\x02\x03"
```
---
## 7. Security & Robustness
### [CRITICAL] `_strip_html` regex is insufficient for Mumble HTML
`app.py:996-999` uses `re.sub(r"<[^>]+>", "", text)` which fails on:
- Malformed tags: `<img src="x" onerror="alert(1)"` (no closing `>`)
- Nested markup: `<a href="<script>">`
- CDATA/comments: `<!-- <script> -->`
While Textual's Rich markup won't execute scripts, malformed input could break
Rich's parser or produce garbled output. Consider using `html.parser` from the
stdlib:
```python
from html.parser import HTMLParser
class _HTMLStripper(HTMLParser):
def __init__(self):
super().__init__()
self._parts: list[str] = []
def handle_data(self, data: str):
self._parts.append(data)
def get_text(self) -> str:
return "".join(self._parts)
def _strip_html(text: str) -> str:
stripper = _HTMLStripper()
stripper.feed(text)
return html.unescape(stripper.get_text())
```
---
### [IMPORTANT] `join_channel` doesn't handle missing channel IDs
`client.py:240-243`:
```python
def join_channel(self, channel_id: int):
if self._mumble and self._connected:
self._mumble.channels[channel_id].move_in()
```
If `channel_id` is stale (channel was removed between tree render and user
pressing Enter), this raises `KeyError`. Add a guard:
```python
def join_channel(self, channel_id: int):
if self._mumble and self._connected:
ch = self._mumble.channels.get(channel_id)
if ch is None:
raise ValueError(f"channel {channel_id} not found")
ch.move_in()
```
---
### [IMPORTANT] `send_text` silently fails on missing channel
`client.py:229-233`:
```python
def send_text(self, message: str):
if self._mumble and self._connected:
ch = self._mumble.channels[self._mumble.users.myself["channel_id"]]
ch.send_text_message(message)
```
If `myself` has no `channel_id` key (edge case during connection setup), this
crashes with `KeyError`. Guard with try/except or a `my_channel_id` check
first.
---
### [NICE] `audio.py` `stop()` doesn't drain queues
After `stop()`, the capture and playback queues still hold stale frames. If
`start()` is called again (audio device hot-swap), old frames leak into the new
session:
```python
def stop(self):
for stream in (self._input_stream, self._output_stream):
if stream is not None:
stream.stop()
stream.close()
self._input_stream = None
self._output_stream = None
# Drain stale frames
while not self._capture_queue.empty():
try:
self._capture_queue.get_nowait()
except queue.Empty:
break
while not self._playback_queue.empty():
try:
self._playback_queue.get_nowait()
except queue.Empty:
break
```
---
## Priority Matrix
| # | Area | Severity | Item |
|----|---------------|-----------|------------------------------------------------------|
| 1 | Architecture | Critical | Extract reconnect manager from app.py |
| 2 | Security | Critical | Replace regex HTML stripping with proper parser |
| 3 | Robustness | Critical | Validate TOML config keys before dataclass init |
| 4 | Testing | Critical | Add widget/app tests using Textual's test harness |
| 5 | Performance | Important | Use numpy for `_apply_gain` (transitive dep exists) |
| 6 | Performance | Important | Replace polling sleep with `queue.get(timeout=)` |
| 7 | Thread safety | Important | Use `threading.Event` for reconnect cancellation |
| 8 | Code quality | Important | Extract `_make_client()` factory (4 dup sites) |
| 9 | Code quality | Important | Compile `_strip_html` regex at module level |
| 10 | Robustness | Important | Guard `join_channel`/`send_text` against `KeyError` |
| 11 | UX | Important | Suppress PTT chatlog spam for hold-mode |
| 12 | Code quality | Important | Deduplicate config change detection logic |
| 13 | Architecture | Nice | Extract `InputHistory` class |
| 14 | Architecture | Nice | Inline callback wrappers with lambdas |
| 15 | Robustness | Nice | Drain audio queues on `stop()` |
| 16 | Code quality | Nice | Add return type annotations throughout |
| 17 | Scalability | Nice | Consider frozen dataclasses for Config |