Search results for "audio" - Page 5 of

Kokoro Text To Speech

Generates MP3 files from text using the Kokoro-TTS model and optionally uploads them to S3.

API Development

FFmpeg

Enables interaction with FFmpeg for common media operations via a stdio MCP server.

Developer Tools

RunwayML Luma AI

Enables video, image, and audio generation through RunwayML and Luma AI APIs using text and image prompts.

API Development

Ableton Copilot

Enables AI assistants to control Ableton Live in real-time through a standardized protocol interface.

API Development

ElevenLabs Scribe

Provides a Model Context Protocol (MCP) server implementation for real-time speech-to-text transcription using the ElevenLabs Scribe API.

API Development

Qiniu Cloud Storage

Uploads files to Qiniu Cloud Storage so that audio and image content can be referenced easily.

Developer Tools

Video Recognition

Analyzes images, audio, and videos using Google's Gemini AI.

API Development

Local STT

Provides local speech-to-text transcription using whisper.cpp, optimized for Apple Silicon.

Developer Tools

ElevenLabs Enhanced

Provides an enhanced Model Context Protocol (MCP) server for interacting with ElevenLabs' text-to-speech and audio processing APIs, specifically designed for conversational AI agents.

API Development

BiliMind

Transforms Bilibili video content into structured notes, provides intelligent Q&A, and transcribes audio.

Productivity & Workflow

AI Studio

Provides a Model Context Protocol server integrating Google AI Studio and Gemini API for multi-modal content generation, file processing, PDF-to-Markdown conversion, image analysis, and audio transcription.

API Development

Strudel

Enables AI models like Claude to directly control Strudel.cc for AI-assisted music generation and live coding.

Developer Tools

Yak

Transforms coding agents into voice-enabled companions by providing real-time audio notifications and interactions.

Developer Tools

Voice Generation

Converts text to speech using the Minimax AI API and automatically uploads generated audio files to Amazon S3.

API Development

VideoCutter

Integrates video, audio, and image processing with AI and MCP support for natural language-driven media content creation.

API Development

Deepgram

Provides an MCP server to access Deepgram's advanced speech recognition and text-to-speech functionalities.

API Development

ScreenPal Video Transcriber

Transcribes ScreenPal videos using local AI models, generating audio transcripts and visual descriptions without cloud dependencies.

Productivity & Workflow

Apple Voice Memos

Provides programmatic access to Apple Voice Memos on macOS, enabling AI assistants to interact with voice recordings.

Developer Tools

Voice Soundboard

Provides a production-ready multi-voice text-to-speech library and MCP server for AI assistants, enabling real-time streaming, SSML, emotional speech, and sound effects.

API Development

Voice Soundboard

Provides an AI-powered text-to-speech library for generating expressive, human-like voices for AI agents, featuring multi-voice synthesis, real-time streaming, SSML support, and sound effects.

API Development

Web Extract

Enables open-source web scraping and multimodal data extraction for AI model context and data gathering.

Developer Tools