Overview
This skill helps developers design and build production-grade voice AI applications, with a focus on keeping response latency low. It provides patterns for integrating the OpenAI Realtime API, Vapi for hosted agents, and custom pipelines that use Deepgram for transcription and ElevenLabs for synthesis. Whether you are building a phone-based support agent or a web-based voice interface, it walks through streaming audio, voice activity detection (VAD), and barge-in handling (letting the user interrupt the agent mid-response) so that conversations feel natural.
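To make the VAD and barge-in concepts concrete, here is a minimal sketch of an energy-based detector in plain Python. It is an illustration only: the function names (`rms`, `detect_barge_in`), the threshold of 0.05, and the three-frame debounce are assumptions chosen for the example, not values taken from the OpenAI Realtime API, Vapi, Deepgram, or ElevenLabs. Hosted services typically ship their own, more robust detection.

```python
import math
from typing import Iterable, Optional, Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one frame of samples in [-1.0, 1.0]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_barge_in(
    frames: Iterable[Sequence[float]],
    agent_speaking: bool,
    threshold: float = 0.05,      # assumed energy threshold, tune per microphone
    min_speech_frames: int = 3,   # assumed debounce: consecutive loud frames required
) -> Optional[int]:
    """Return the index of the frame at which barge-in is declared, else None.

    A frame counts as speech when its RMS energy reaches the threshold.
    Barge-in is declared only while the agent is speaking and only after
    `min_speech_frames` consecutive speech frames, so short noise bursts
    do not cut the agent off.
    """
    consecutive = 0
    for index, frame in enumerate(frames):
        if rms(frame) >= threshold:
            consecutive += 1
        else:
            consecutive = 0
        if agent_speaking and consecutive >= min_speech_frames:
            return index
    return None


if __name__ == "__main__":
    silence = [0.0] * 160   # 10 ms of silence at 16 kHz
    speech = [0.5] * 160    # 10 ms of loud, constant signal standing in for speech
    stream = [silence, speech, silence, speech, speech, speech, silence]
    print(detect_barge_in(stream, agent_speaking=True))
    print(detect_barge_in(stream, agent_speaking=False))
```

In a real pipeline, the returned frame index would be the point at which the application stops synthesized audio playback and hands the turn back to the user. The debounce trades a few tens of milliseconds of extra latency for fewer false interruptions.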