Extracts structured semantic content from documents, audio, images, and video files using Azure's multimodal AI services.
This skill integrates the Azure AI Content Understanding SDK for Python into Claude Code so that structured data can be extracted from documents, images, audio, and video for RAG and automated workflows. It provides implementation patterns for prebuilt analyzers that generate Markdown for document search, transcribe audio with timing information, and analyze video frames. It also covers Azure's asynchronous long-running operations and custom field schemas, for tasks ranging from invoice processing to building search indexes.
Key Features
01 Multimodal content extraction from docs, images, audio, and video
02 Asynchronous polling patterns for long-running AI operations
03 Support for both synchronous and asynchronous Python clients
04 Custom analyzer creation with specialized field schemas
05 31,721 GitHub stars
06 Prebuilt analyzers for RAG-ready Markdown and semantic search
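The polling pattern for long-running operations mentioned in the feature list can be sketched generically as below. This is a minimal illustration, not the SDK's own API: `poll_until_done`, `make_fake_operation`, the status strings, and the shape of the returned dictionary are all assumptions made for the example, and the fake operation stands in for a real service call.

```python
import time
from typing import Any, Callable, Dict, List


def poll_until_done(
    get_status: Callable[[], Dict[str, Any]],
    interval_s: float = 1.0,
    timeout_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call get_status repeatedly until the operation reaches a terminal state.

    Hypothetical helper: the status names compared below are assumptions,
    not values confirmed by the source.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        payload = get_status()
        state = str(payload.get("status", "")).lower()
        if state == "succeeded":
            return payload
        if state in ("failed", "canceled"):
            raise RuntimeError(f"operation ended with status {payload.get('status')!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError("operation did not finish before the timeout")
        sleep(interval_s)  # wait before asking again


def make_fake_operation(states: List[str]) -> Callable[[], Dict[str, Any]]:
    """Build a stand-in for a service call that walks through the given states."""
    remaining = list(states)

    def get_status() -> Dict[str, Any]:
        # Consume states one by one; the last state repeats forever.
        state = remaining.pop(0) if len(remaining) > 1 else remaining[0]
        result = {"markdown": "# Example"} if state == "Succeeded" else None
        return {"status": state, "result": result}

    return get_status


if __name__ == "__main__":
    operation = make_fake_operation(["NotStarted", "Running", "Succeeded"])
    final = poll_until_done(operation, interval_s=0.1, timeout_s=5.0)
    print(final["status"], final["result"])
```

In real use, `get_status` would wrap whatever call reports the state of the analysis operation. The `sleep` parameter is injectable so the loop can be tested without waiting.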
Use Cases
01 Transcribing and summarizing audio/video content with timestamped metadata
02 Converting complex PDFs and documents into Markdown for RAG systems
03 Automating data extraction from invoices and structured business documents