docs: document voice plugin commands

Add Voice STT/TTS section covering !listen, !say, and optional
[voice] config block with whisper_url, piper_url, silence_gap.
user
2026-02-22 03:08:10 +01:00
parent 9fbf45f67d
commit 7b9359c152

@@ -1677,3 +1677,40 @@ file (natural dedup).
- Use `!kept` to list preserved files and their sizes
- Use `!kept clear` to delete all preserved files
- On cancel/error, files are not deleted (needed for `!resume`)
### Voice STT/TTS
Transcribe voice from Mumble users via Whisper STT and speak text aloud
via Piper TTS. Requires local Whisper and Piper services.
```
!listen [on|off]  Toggle voice-to-text transcription (admin)
!listen           Show current listen status
!say <text>       Speak text aloud via TTS (max 500 chars)
```
STT behavior:
- When enabled, the bot buffers incoming voice PCM per user
- After a configurable silence gap (default 1.5s), the buffer is
transcribed via Whisper and posted as an action message
- Utterances shorter than 0.5s are discarded (noise filter)
- Utterances are capped at 30s to bound memory and latency
- Transcription results are posted as: `* derp heard Alice say: hello`
- The listener survives reconnects when `!listen` is on
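
The buffering rules above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the `UtteranceBuffer` class, its method names, and the 48 kHz 16-bit mono PCM format are all assumptions, and the 30s cap is modeled here as an early flush.

```python
from dataclasses import dataclass, field

SAMPLE_RATE = 48000      # assumed: voice decoded to 48 kHz
BYTES_PER_SAMPLE = 2     # assumed: 16-bit mono PCM
MIN_SECONDS = 0.5        # shorter utterances are dropped as noise
MAX_SECONDS = 30.0       # longer utterances are flushed early
SILENCE_GAP = 1.5        # default silence gap from the docs


def duration(pcm: bytes) -> float:
    """Length of a mono 16-bit PCM buffer in seconds."""
    return len(pcm) / (SAMPLE_RATE * BYTES_PER_SAMPLE)


@dataclass
class UtteranceBuffer:
    """Hypothetical per-user buffer; not the plugin's actual class."""
    silence_gap: float = SILENCE_GAP
    chunks: dict = field(default_factory=dict)     # user -> bytearray
    last_seen: dict = field(default_factory=dict)  # user -> timestamp

    def feed(self, user: str, pcm: bytes, now: float):
        """Append audio; return the utterance early if the cap is reached."""
        buf = self.chunks.setdefault(user, bytearray())
        buf.extend(pcm)
        self.last_seen[user] = now
        if duration(buf) >= MAX_SECONDS:
            return self._flush(user)
        return None

    def poll(self, now: float):
        """Return (user, pcm) pairs for users silent for at least the gap."""
        done = []
        for user in list(self.chunks):
            if now - self.last_seen[user] >= self.silence_gap:
                pcm = self._flush(user)
                if pcm is not None:
                    done.append((user, pcm))
        return done

    def _flush(self, user: str):
        """Remove the user's buffer; return it unless it is too short."""
        pcm = bytes(self.chunks.pop(user))
        self.last_seen.pop(user, None)
        return pcm if duration(pcm) >= MIN_SECONDS else None
```

In this sketch a caller would invoke `feed()` for each incoming audio frame and `poll()` on a timer, sending every returned buffer to Whisper.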

TTS behavior:
- `!say` fetches WAV from Piper and plays it via `stream_audio()`
- Piper outputs 22050Hz WAV; ffmpeg resamples to 48kHz automatically
- TTS shares the audio output with music playback
- Text is limited to 500 characters
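
As a rough illustration of the two mechanical steps above (the length limit and the 22050Hz to 48kHz resample), the sketch below validates the text and builds an ffmpeg command line. The function names are hypothetical, and two behaviors are assumptions rather than documented facts: that over-long text is rejected (not truncated) and that the output is mono 16-bit PCM.

```python
MAX_SAY_CHARS = 500   # limit stated in the docs
OUTPUT_RATE = 48000   # target rate stated in the docs


def check_say_text(text: str) -> str:
    """Strip whitespace and enforce the 500-character limit.

    Assumption: over-long text is rejected; the plugin may truncate instead.
    """
    text = text.strip()
    if not text:
        raise ValueError("nothing to say")
    if len(text) > MAX_SAY_CHARS:
        raise ValueError(f"text too long ({len(text)} > {MAX_SAY_CHARS})")
    return text


def resample_argv(rate: int = OUTPUT_RATE) -> list[str]:
    """ffmpeg arguments: WAV on stdin -> raw 16-bit mono PCM on stdout."""
    return [
        "ffmpeg", "-loglevel", "error",
        "-i", "pipe:0",      # read the Piper WAV from stdin
        "-f", "s16le",       # raw signed 16-bit little-endian output
        "-ar", str(rate),    # resample to the target rate
        "-ac", "1",          # mono (assumption)
        "pipe:1",            # write PCM to stdout
    ]
```

With ffmpeg installed, `subprocess.run(resample_argv(), input=wav_bytes, capture_output=True, check=True).stdout` would yield the resampled PCM, where `wav_bytes` stands for the body of the Piper response.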

Configuration (optional):
```toml
[voice]
whisper_url = "http://192.168.122.1:8080/inference"
piper_url = "http://192.168.122.1:5000/"
silence_gap = 1.5
```