Generates and plays high-quality multilingual audio locally using the mlx-audio Kokoro model.
This skill lets Claude convert text to speech in nine languages, which makes it useful for language learning, pronunciation checks, and accessibility. It uses the mlx-audio framework and the Kokoro model to generate speech locally on your machine, with 11 voice profiles and adjustable speed. Typical uses include hearing the correct pronunciation of a complex term or creating audio snippets for educational content, with low latency and without leaving the terminal.
Key Features
01 Supports 9 languages, including English, Spanish, Japanese, and Mandarin
02 Adjustable playback speed from 0.5x to 2.0x
03 Offers 11 high-quality voice profiles with American and British accents
04 Plays audio immediately through macOS afplay
05 Manages a local server automatically for efficient audio processing
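The speed range and afplay playback described above could be wired together roughly as follows. This is a minimal sketch, not the skill's actual implementation: the helper names (`clamp_speed`, `afplay_command`, `play`) are hypothetical, and the step that generates the audio file with mlx-audio is deliberately left out because the listing does not show it. The only external dependency assumed is the `afplay` command that ships with macOS.

```python
import shutil
import subprocess

# Speed range taken from the feature list (0.5x to 2.0x).
MIN_SPEED, MAX_SPEED = 0.5, 2.0


def clamp_speed(speed: float) -> float:
    """Hypothetical helper: keep a requested speed inside the supported range."""
    return max(MIN_SPEED, min(MAX_SPEED, speed))


def afplay_command(audio_path: str) -> list[str]:
    """Hypothetical helper: build the argument list for macOS afplay."""
    return ["afplay", audio_path]


def play(audio_path: str) -> None:
    """Play an existing audio file; raises if afplay is unavailable."""
    if shutil.which("afplay") is None:
        raise RuntimeError("afplay not found; it ships with macOS only")
    subprocess.run(afplay_command(audio_path), check=True)
```

A caller would clamp the requested speed before passing it to whatever generates the audio, then hand the resulting file path to `play`.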
Use Cases
01 Verifying the pronunciation of technical terms or foreign language phrases
02 Improving accessibility by reading text-heavy documentation or code comments aloud
03 Developing audio-based language learning tools and study materials