Overview
This skill adds local speech-to-text to the Claude environment by interfacing with a whisper.cpp server. It converts audio recordings, voice notes, and media files into accurate text using the large-v3 model. It is optimized for NVIDIA GPUs with CUDA support and covers real-time transcription, shell-based recording, and Python integration. It is aimed at developers and researchers who need private, high-performance transcription without relying on external cloud APIs.
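To make the Python integration concrete, here is a minimal sketch of sending one audio file to a locally running whisper.cpp server using only the standard library. It rests on several assumptions that the text above does not confirm: that the server listens on `127.0.0.1:8080`, that it exposes the `/inference` endpoint from the whisper.cpp server example, and that it returns a JSON object with a `text` field. The helper names `build_multipart` and `transcribe` are hypothetical and chosen for illustration.

```python
import json
import os
import urllib.request
import uuid

# Assumption: address and endpoint of a local whisper.cpp server; adjust to your setup.
SERVER_URL = "http://127.0.0.1:8080/inference"


def build_multipart(fields, filename, audio_bytes):
    """Encode text fields plus one audio file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    # The audio file goes in a form field named "file".
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
        + audio_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, url=SERVER_URL):
    """POST the audio file at `path` to the server and return the transcript text."""
    with open(path, "rb") as f:
        audio = f.read()
    body, content_type = build_multipart(
        {"response_format": "json", "temperature": "0.0"},
        os.path.basename(path),
        audio,
    )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        # Assumption: the JSON response carries the transcript under "text".
        return json.loads(response.read().decode("utf-8"))["text"]
```

Calling `transcribe("note.wav")` would then return the transcript as a string, provided the server is running and accepts the file. Depending on how the server was started, the input may need to be a WAV file, so other media formats might require conversion first.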