Audio Transcriber

Transcribes audio files and live microphone input into text using OpenAI's Whisper models.

Overview

Audio Transcriber is a Python-based tool that converts spoken audio into written text. It accepts common file formats (WAV, MP4, MP3, and FLAC) and can also record directly from a microphone for real-time transcription. It uses OpenAI's Whisper models for multilingual transcription, with configurable model sizes that trade speed against accuracy. The tool also integrates with MCP Server, giving AI agents a way to drive automated audio processing workflows.
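As a rough illustration of the file-to-text path described above, the sketch below uses the open-source `openai-whisper` package directly. It is not Audio Transcriber's own code: the function names `validate_request` and `transcribe_file` are hypothetical, and the example assumes `openai-whisper` and `ffmpeg` are installed.

```python
from pathlib import Path

# Formats and model sizes as listed on this page.
SUPPORTED_EXTENSIONS = {".wav", ".mp4", ".mp3", ".flac"}
MODEL_SIZES = ("tiny", "base", "small", "medium", "large")


def validate_request(path: str, model_size: str = "base") -> Path:
    """Check the file extension and model size before loading any model."""
    audio_path = Path(path)
    if audio_path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {audio_path.suffix!r}")
    if model_size not in MODEL_SIZES:
        raise ValueError(f"unknown model size: {model_size!r}")
    return audio_path


def transcribe_file(path: str, model_size: str = "base") -> str:
    """Transcribe one file with the `openai-whisper` package (hypothetical helper)."""
    audio_path = validate_request(path, model_size)
    # Imported lazily so validation works even where Whisper is not installed.
    import whisper

    model = whisper.load_model(model_size)      # smaller sizes load and run faster
    result = model.transcribe(str(audio_path))  # dict with "text" and "segments"
    return result["text"].strip()
```

Calling `transcribe_file("interview.mp3", model_size="small")` would return the transcript as one string. Smaller models such as `tiny` generally run faster, while larger ones are generally more accurate.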

Key Features

  • Record audio directly from microphone
  • Support for OpenAI Whisper models (tiny, base, small, medium, large)
  • Agentic AI support through MCP Server for integrated automation
  • Transcribe multiple audio file formats (.wav, .mp4, .mp3, .flac)
  • Export transcriptions to TXT, SRT, and VTT formats
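
To make the SRT and VTT export feature concrete, here is a minimal sketch of how timed segments can be rendered in both subtitle formats. It assumes each segment is a dict with `start`, `end` (seconds), and `text` keys, which is the shape Whisper's `transcribe` result uses for its `segments`. The helper names are hypothetical, not the tool's actual exporter.

```python
def format_timestamp(seconds: float, separator: str = ",") -> str:
    """Render seconds as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (VTT)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{millis:03d}"


def to_srt(segments) -> str:
    """Build SRT text: numbered cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def to_vtt(segments) -> str:
    """Build WebVTT text: a WEBVTT header, then cues with '.' in timestamps."""
    lines = ["WEBVTT", ""]
    for seg in segments:
        start = format_timestamp(seg["start"], separator=".")
        end = format_timestamp(seg["end"], separator=".")
        lines += [f"{start} --> {end}", seg["text"].strip(), ""]
    return "\n".join(lines)
```

The two formats differ mainly in the `WEBVTT` header, the cue numbering that SRT requires, and the decimal separator in timestamps, so a single timestamp helper can serve both.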

Use Cases

  • Integrating real-time speech-to-text capabilities into AI agents or automated systems
  • Generating text transcripts from recorded lectures, interviews, or meetings
  • Creating captions and subtitles (SRT/VTT) for video content