Speech
Created by Kvadratni
Enables voice interaction with modern audio visualization for the Goose platform.
About
Speech adds a voice interface to the Goose platform, so users can interact with Goose by speaking rather than typing. The extension includes real-time speech recognition via faster-whisper, text-to-speech with multiple voice options through pyttsx3 and Kokoro TTS, a PyQt-based UI with audio visualization, and a simple command-line interface for voice interaction. It provides a hands-free way to control Goose.
Key Features
- Voice input using faster-whisper for speech-to-text
- Modern PyQt-based UI with audio visualization
- Voice output with over 54 voice options via Kokoro TTS
- Audio/video transcription with optional timestamps and speaker detection
- Multi-speaker narration for creating dialogues and stories
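To illustrate the transcription feature with optional timestamps, here is a minimal sketch built on faster-whisper's `WhisperModel`. It is not the extension's actual code: the `Segment` stand-in, the helper names (`format_timestamp`, `format_transcript`, `transcribe_file`), and the chosen model size and compute type are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Segment:
    """Hypothetical stand-in for a transcription segment (times in seconds)."""
    start: float
    end: float
    text: str


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS (fractions are truncated)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def format_transcript(segments: Iterable, timestamps: bool = True) -> List[str]:
    """Turn segments into text lines, optionally prefixed with a time range."""
    lines = []
    for seg in segments:
        text = seg.text.strip()
        if timestamps:
            span = f"[{format_timestamp(seg.start)} - {format_timestamp(seg.end)}]"
            lines.append(f"{span} {text}")
        else:
            lines.append(text)
    return lines


def transcribe_file(path: str, model_size: str = "base",
                    timestamps: bool = True) -> List[str]:
    """Transcribe an audio file with faster-whisper (assumed to be installed)."""
    # Imported lazily so the formatting helpers work without faster-whisper.
    from faster_whisper import WhisperModel

    # Model size, device, and compute type are illustrative choices.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(path)
    return format_transcript(segments, timestamps=timestamps)
```

Calling `transcribe_file("meeting.mp3")` (a hypothetical file name) would return one line per recognized segment. Speaker detection is not covered by this sketch.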
Use Cases
- Voice-controlled interaction with Goose
- Transcribing audio and video files
- Creating audio narrations with multiple voices
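The multi-voice narration use case can be sketched as a two-step process: split a script into speaker turns, then map each speaker to a voice. The example below is an assumption-laden illustration, not the extension's implementation. The `Speaker: text` script format, the function names, and the voice identifiers are hypothetical, and the optional playback step uses pyttsx3 rather than Kokoro TTS.

```python
import re
from typing import Dict, List, Optional, Tuple

# Assumed script format: one "Speaker: text" turn per line.
LINE_RE = re.compile(r"^\s*([^:]+?)\s*:\s*(.+?)\s*$")


def parse_script(script: str) -> List[Tuple[str, str]]:
    """Extract (speaker, text) turns; lines without a speaker label are skipped."""
    turns = []
    for raw in script.splitlines():
        match = LINE_RE.match(raw)
        if match:
            turns.append((match.group(1), match.group(2)))
    return turns


def assign_voices(turns: List[Tuple[str, str]],
                  voices: Dict[str, str],
                  default_voice: Optional[str] = None
                  ) -> List[Tuple[Optional[str], str]]:
    """Replace each speaker name with its voice id, or the default if unmapped."""
    return [(voices.get(speaker, default_voice), text) for speaker, text in turns]


def narrate(voiced_turns: List[Tuple[Optional[str], str]]) -> None:
    """Speak each turn with pyttsx3 (assumed installed; voice ids vary by platform)."""
    import pyttsx3  # imported lazily so parsing works without a TTS engine

    engine = pyttsx3.init()
    for voice_id, text in voiced_turns:
        if voice_id is not None:
            engine.setProperty("voice", voice_id)
        engine.say(text)
    engine.runAndWait()
```

For example, `assign_voices(parse_script("Narrator: Once upon a time."), {"Narrator": "voice_a"})` pairs the narrator's line with the placeholder id `voice_a`, which would need to be replaced with a voice id available on the target system.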