Does the skill support function calling during a voice session?

Yes. It includes detailed patterns for defining function tools and handling function call updates within the real-time event loop.

What is the Azure.AI.VoiceLive SDK?

It is a .NET library designed for building real-time voice applications using Azure AI's multimodal models with low-latency WebSocket communication.

What audio format is recommended for input/output?

The SDK typically uses 16-bit PCM (Pcm16) at a 24kHz sample rate in mono for optimal performance and compatibility.
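To make the format concrete, here is a minimal sketch of the buffer arithmetic it implies. The sample rate, bit depth, and channel count come from the answer above. The `Pcm16Format` class, its member names, and the 20 ms chunk length are hypothetical choices for illustration and are not part of the Azure.AI.VoiceLive SDK.

```csharp
using System;

// Hypothetical helper (not an SDK type): sizes raw audio buffers for
// 16-bit PCM, 24 kHz, mono as described in the answer above.
public static class Pcm16Format
{
    public const int SampleRateHz = 24_000; // 24 kHz
    public const int BytesPerSample = 2;    // 16-bit samples
    public const int Channels = 1;          // mono

    // Number of bytes needed to hold the given duration of audio.
    // Integer tick arithmetic avoids floating-point rounding.
    public static int BytesForDuration(TimeSpan duration)
    {
        long samples = SampleRateHz * duration.Ticks / TimeSpan.TicksPerSecond;
        return checked((int)(samples * BytesPerSample * Channels));
    }

    // Playback duration represented by a raw buffer of the given size.
    public static TimeSpan DurationOf(int byteCount)
    {
        long bytesPerSecond = (long)SampleRateHz * BytesPerSample * Channels;
        return TimeSpan.FromTicks(byteCount * TimeSpan.TicksPerSecond / bytesPerSecond);
    }
}

public static class Program
{
    public static void Main()
    {
        // One second of audio: 24,000 samples * 2 bytes * 1 channel.
        Console.WriteLine(Pcm16Format.BytesForDuration(TimeSpan.FromSeconds(1)));       // 48000

        // An illustrative 20 ms chunk: 480 samples * 2 bytes.
        Console.WriteLine(Pcm16Format.BytesForDuration(TimeSpan.FromMilliseconds(20))); // 960

        // Round trip: 960 bytes is 20 ms of audio.
        Console.WriteLine(Pcm16Format.DurationOf(960).TotalMilliseconds);               // 20
    }
}
```

At this format, audio flows at 48,000 bytes per second in each direction, which is a useful figure when sizing capture buffers or estimating bandwidth.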

Which models are supported by this skill?

This skill supports gpt-4o-realtime-preview, gpt-4o-mini-realtime-preview, and phi4-mm-realtime models.

Do I need an API key to use this?

While API keys are supported, the skill recommends using Microsoft Entra ID (via DefaultAzureCredential) for more secure and enterprise-ready authentication.

Azure AI Voice Live for .NET

Name: Azure AI Voice Live for .NET
Author: sickn33

GitHub stars: 24,382
Category: Data Science and ML

Enables real-time, bidirectional voice AI capabilities in .NET applications using Azure AI and WebSocket communication.

This skill provides comprehensive guidance and implementation patterns for the Azure.AI.VoiceLive SDK, allowing developers to build sophisticated voice assistants within the .NET ecosystem. It covers essential workflows including session configuration, real-time audio streaming, semantic voice activity detection (VAD), and complex function calling integration. This skill is particularly useful for developers creating low-latency, conversational AI experiences that require high-performance C# implementations and integration with Azure's latest multimodal models like GPT-4o Realtime.

Key Features

01 Managed authentication using Microsoft Entra ID and Azure Key Credentials

02 Real-time bidirectional WebSocket communication for low-latency voice interactions

03 Advanced Semantic Voice Activity Detection (VAD) for natural conversation flow

04 Seamless integration with GPT-4o and Phi-4 multimodal real-time models

05 24,382 GitHub stars

06 Support for complex function calling within live voice sessions

Use Cases

01 Creating hands-free industrial or medical voice-activated AI assistants

02 Developing interactive AI customer service voice bots and virtual agents

03 Building real-time accessibility tools and speech-to-speech translators
