Fetches YouTube video transcripts, preferring official transcripts and falling back to local AI transcription with Whisper when none is available.
This tool runs as a lightweight Model Context Protocol (MCP) server for obtaining YouTube video transcripts. It first tries to fetch an existing official transcript directly from YouTube, for speed and accuracy. If no official transcript is available, the server downloads the video's audio and transcribes it locally with OpenAI's Whisper model, preferring `whisper.cpp` for speed. The transcription capability is exposed as a simple tool that any MCP-capable client can call.
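The official-first, Whisper-fallback order described above can be sketched roughly as follows. This is a minimal illustration, not the server's actual code: `get_transcript`, `TranscriptResult`, `fetch_official`, and `transcribe_locally` are hypothetical names, and the two callables stand in for whatever the server uses to query YouTube and to run Whisper.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TranscriptResult:
    text: str
    source: str  # "official" or "whisper"


def get_transcript(
    video_id: str,
    fetch_official: Callable[[str], Optional[str]],  # hypothetical: returns None if no transcript exists
    transcribe_locally: Callable[[str], str],        # hypothetical: downloads audio and runs Whisper
) -> TranscriptResult:
    """Return the official transcript if one exists, otherwise transcribe locally."""
    official = fetch_official(video_id)
    if official:
        return TranscriptResult(text=official, source="official")
    # No official transcript was found, so fall back to local transcription.
    return TranscriptResult(text=transcribe_locally(video_id), source="whisper")
```

Passing the two steps in as callables keeps the ordering logic testable without network access or a Whisper install. The expensive local step runs only when the official lookup returns nothing.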