About
This skill provides an interface to fal.ai's audio models, letting developers run audio processing directly through Claude. It covers speech-to-text with OpenAI's Whisper for transcription and translation, and text-to-speech with engines including ElevenLabs, F5-TTS, and Kokoro. Whether you are generating subtitles with timestamps, cloning voices from reference samples, or building multilingual speech pipelines, the skill supplies the endpoints, formatting patterns, and parameter guides you need.
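
As an illustration of the subtitle use case, here is a minimal sketch of turning timestamped transcription chunks into SRT text. The chunk shape used here (a `timestamp` pair in seconds plus a `text` string) is an assumption made for this example, not a confirmed fal.ai response format, and the helper names `format_srt_timestamp` and `chunks_to_srt` are hypothetical.

```python
def format_srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def chunks_to_srt(chunks: list[dict]) -> str:
    """Build SRT text from chunks.

    Assumed (hypothetical) chunk shape:
        {"timestamp": [start_seconds, end_seconds], "text": "..."}
    """
    blocks = []
    for index, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]
        blocks.append(
            f"{index}\n"
            f"{format_srt_timestamp(start)} --> {format_srt_timestamp(end)}\n"
            f"{chunk['text'].strip()}\n"
        )
    # SRT cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    sample = [
        {"timestamp": [0.0, 2.5], "text": " Hello there."},
        {"timestamp": [2.5, 4.0], "text": " Welcome."},
    ]
    print(chunks_to_srt(sample))
```

If the transcription output you actually receive uses different field names or time units, only the two lines that read `chunk["timestamp"]` and `chunk["text"]` should need to change.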