Introduction
The Audio Transcriber skill integrates WhisperX into your workflow so that you can convert speech from various media formats into text directly through Claude Code. It provides word-level alignment and automatic language detection, and it supports multiple output formats, including SRT and VTT for subtitles and JSON for structured data. Model sizes are configurable from 'tiny' to 'large-v2': smaller models run faster and larger models are generally more accurate, so you can balance processing speed against transcription accuracy to suit your hardware and quality requirements.
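To make the subtitle output concrete, here is a minimal sketch of how timed segments could be rendered as SRT. It is not the skill's actual implementation. The helper names `format_timestamp` and `to_srt` are hypothetical, and the input is assumed to be a list of dicts with `start` and `end` times in seconds plus a `text` field, which is one plausible shape for transcription segments.

```python
from typing import Dict, List, Union

# Assumed segment shape (hypothetical): {"start": float, "end": float, "text": str}
Segment = Dict[str, Union[float, str]]


def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: List[Segment]) -> str:
    """Render segments as SRT cues, numbered from 1 and separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(float(seg["start"]))
        end = format_timestamp(float(seg["end"]))
        text = str(seg["text"]).strip()
        cues.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 1.25, "text": " Hello there. "},
        {"start": 1.25, "end": 3.0, "text": "This is a test."},
    ]
    print(to_srt(demo))
```

VTT output would differ mainly in its `WEBVTT` header line and in using a period instead of a comma before the milliseconds, while JSON output would keep the segment structure as data rather than flattening it into cues.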