docs: document voice plugin commands
Add Voice STT/TTS section covering !listen, !say, and optional [voice] config block with whisper_url, piper_url, silence_gap.
@@ -1677,3 +1677,40 @@ file (natural dedup).
- Use `!kept` to list preserved files and their sizes
- Use `!kept clear` to delete all preserved files
- On cancel/error, files are not deleted (needed for `!resume`)

### Voice STT/TTS

Transcribe voice from Mumble users via Whisper STT and speak text aloud
via Piper TTS. Requires local Whisper and Piper services.

```
!listen [on|off]    Toggle voice-to-text transcription (admin)
!listen             Show current listen status
!say <text>         Speak text aloud via TTS (max 500 chars)
```

STT behavior:

- When enabled, the bot buffers incoming voice PCM per user
- After a configurable silence gap (default 1.5s), the buffer is
  transcribed via Whisper and posted as an action message
- Utterances shorter than 0.5s are discarded (noise filter)
- Utterances are capped at 30s to bound memory and latency
- Transcription results are posted as: `* derp heard Alice say: hello`
- The listener survives reconnects when `!listen` is on
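
The buffering rules above can be sketched roughly as follows. This is an illustration only, not the plugin's code: `UtteranceBuffer`, `feed()`, and `poll()` are hypothetical names, the 48 kHz 16-bit mono PCM format is an assumption, and dropping audio past the 30s cap (rather than flushing early) is one possible reading of the cap.

```python
from dataclasses import dataclass, field
from typing import Optional

SAMPLE_RATE = 48_000     # assumption: 48 kHz mono PCM
BYTES_PER_SAMPLE = 2     # assumption: 16-bit samples
MIN_SECONDS = 0.5        # shorter utterances are treated as noise
MAX_SECONDS = 30.0       # cap per utterance
BYTES_PER_SECOND = SAMPLE_RATE * BYTES_PER_SAMPLE


@dataclass
class UtteranceBuffer:
    """Hypothetical per-user buffer; one instance per speaking user."""
    silence_gap: float = 1.5
    pcm: bytearray = field(default_factory=bytearray)
    last_voice: float = 0.0

    def feed(self, chunk: bytes, now: float) -> None:
        """Append a PCM chunk, ignoring anything past the 30s cap."""
        room = int(MAX_SECONDS * BYTES_PER_SECOND) - len(self.pcm)
        if room > 0:
            self.pcm.extend(chunk[:room])
        self.last_voice = now

    def duration(self) -> float:
        """Seconds of audio currently buffered."""
        return len(self.pcm) / BYTES_PER_SECOND

    def poll(self, now: float) -> Optional[bytes]:
        """Return a finished utterance once the silence gap has elapsed.

        Returns None while the user is still speaking, when nothing is
        buffered, or when the utterance was too short to transcribe.
        """
        if not self.pcm or now - self.last_voice < self.silence_gap:
            return None
        long_enough = self.duration() >= MIN_SECONDS
        data = bytes(self.pcm)
        self.pcm.clear()
        return data if long_enough else None
```

A caller would invoke `feed()` for each incoming voice packet and `poll()` on a timer, passing any returned bytes on to the Whisper service.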

TTS behavior:

- `!say` fetches WAV from Piper and plays it via `stream_audio()`
- Piper outputs 22050Hz WAV; ffmpeg resamples to 48kHz automatically
- TTS shares the audio output with music playback
- Text is limited to 500 characters
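
As a rough sketch of the TTS steps above: the helper names below are hypothetical, rejecting over-long text (instead of truncating it) is an assumption, and so is the mono 16-bit output format. The ffmpeg options used (`-i`, `-f s16le`, `-ar`, `-ac`) are standard, but the plugin's real invocation may differ.

```python
import io
import wave

MAX_SAY_CHARS = 500      # limit stated in the docs
TARGET_RATE = 48_000     # output rate after resampling


def check_say_text(text: str) -> str:
    """Trim the text and enforce the length limit (assumed: reject)."""
    text = text.strip()
    if not text:
        raise ValueError("nothing to say")
    if len(text) > MAX_SAY_CHARS:
        raise ValueError(f"text too long ({len(text)} > {MAX_SAY_CHARS} chars)")
    return text


def wav_sample_rate(wav_bytes: bytes) -> int:
    """Read the sample rate from a WAV header, e.g. 22050 for Piper output."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getframerate()


def ffmpeg_resample_args(rate: int = TARGET_RATE) -> list[str]:
    """Build an ffmpeg command that reads WAV on stdin and writes raw PCM."""
    return [
        "ffmpeg", "-hide_banner", "-loglevel", "error",
        "-i", "pipe:0",        # WAV from the TTS service on stdin
        "-f", "s16le",         # raw signed 16-bit little-endian output
        "-ar", str(rate),      # resample to the target rate
        "-ac", "1",            # assumption: mono output
        "pipe:1",
    ]
```

The argument list is meant to be handed to `subprocess.run(..., input=wav_bytes, capture_output=True)` or an equivalent streaming call.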

Configuration (optional):

```toml
[voice]
whisper_url = "http://192.168.122.1:8080/inference"
piper_url = "http://192.168.122.1:5000/"
silence_gap = 1.5
```