This skill integrates several text-to-speech providers into the Claude Code environment, so you can generate audio files, voiceovers, and podcasts directly from the command line. It supports emotion control via Hume Octave, consistent character voices with Inworld TTS, and native multi-speaker conversation generation using Google Gemini 2.5. Whether you are creating accessible content, automated podcasts, or expressive character dialogue, the skill provides a single interface for audio synthesis with control over tone, speed, and delivery.
Key Features
01 High-quality character voices with Inworld TTS and TTS Max models
02 Native multi-speaker podcast generation using Google Gemini 2.5
03 Advanced emotion control using acting instructions and inline audio markups
04 Customizable audio output formats and direct CLI playback integration
05 Dynamic voice generation with emotional intelligence via Hume Octave