
Speech

Created by Kvadratni

Enables voice interaction with modern audio visualization for the Goose platform.

About

Speech adds a voice interface to the Goose platform, so users can talk to Goose instead of typing. The extension provides real-time speech recognition via faster-whisper, high-quality text-to-speech with multiple voice options through pyttsx3 and Kokoro TTS, a modern PyQt-based UI with audio visualization, and a simple command-line interface for voice interaction. Together these give Goose a hands-free control method alongside text input.

Key Features

  • Voice input using faster-whisper for speech-to-text
  • Modern PyQt-based UI with audio visualization
  • Voice output with over 54 voice options via Kokoro TTS
  • Audio/video transcription with optional timestamps and speaker detection
  • Multi-speaker narration for creating dialogues and stories

Use Cases

  • Voice-controlled interaction with Goose
  • Transcribing audio and video files
  • Creating audio narrations with multiple voices
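
To illustrate the transcription use case, here is a minimal sketch of how faster-whisper output could be turned into a transcript with optional timestamps. It is not taken from the Speech extension's source. The `Segment` class, `format_timestamp`, `render_transcript`, and `transcribe_file` are hypothetical names introduced for this example, and the `"base"` model size is an assumption. faster-whisper itself does not label speakers, so the `speaker` field stands in for whatever speaker detection step the extension applies.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Segment:
    """Hypothetical container for one transcribed span of audio."""
    start: float                   # start time in seconds
    end: float                     # end time in seconds
    text: str
    speaker: Optional[str] = None  # filled in by a separate speaker detection step, if any


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS (fractions are truncated)."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render_transcript(segments: Iterable[Segment], timestamps: bool = True) -> str:
    """Join segments into one line each, with optional timestamp and speaker prefixes."""
    lines = []
    for seg in segments:
        prefix = f"[{format_timestamp(seg.start)}] " if timestamps else ""
        speaker = f"{seg.speaker}: " if seg.speaker else ""
        lines.append(f"{prefix}{speaker}{seg.text.strip()}")
    return "\n".join(lines)


def transcribe_file(path: str, model_size: str = "base") -> List[Segment]:
    """Transcribe an audio file with faster-whisper (assumed to be installed).

    The import is deferred so the formatting helpers above work without the library.
    """
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size)
    raw_segments, _info = model.transcribe(path)
    return [Segment(s.start, s.end, s.text) for s in raw_segments]
```

Calling `render_transcript(transcribe_file("meeting.wav"))` would produce one timestamped line per segment, and passing `timestamps=False` drops the time prefixes. The file name here is a placeholder.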