FunASR
Provides speech processing services, including audio validation, speech transcription, and voice activity detection, using Alibaba's FunASR library.
Introduction
This server wraps Alibaba's FunASR library in the FastMCP framework to expose speech processing tools. It includes tools for validating audio files, transcribing speech asynchronously with ASR models such as Paraformer (with detailed timestamps), and detecting voice activity. It is designed to be extensible: ASR and VAD models can be loaded and switched dynamically, model parameters are configurable, and the tools can be called from MCP clients or over HTTP.
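The dynamic loading and switching of models described above could be organized around a small registry that caches loaded models by name and parameters. What follows is a minimal sketch, not the server's actual code: `ModelRegistry`, `load`, `active`, and `fake_loader` are hypothetical names, and the loader is a stand-in so the example runs without FunASR installed. In a real deployment the loader would presumably wrap FunASR's own model constructor.

```python
from typing import Any, Callable, Dict, Optional, Tuple

ModelKey = Tuple[str, Tuple[Tuple[str, Any], ...]]


class ModelRegistry:
    """Hypothetical cache of loaded models, keyed by name and parameters."""

    def __init__(self, loader: Callable[..., Any]) -> None:
        self._loader = loader
        self._cache: Dict[ModelKey, Any] = {}
        self._active_key: Optional[ModelKey] = None

    def load(self, name: str, **params: Any) -> Any:
        """Load a model on first use, reuse it afterwards, and make it active."""
        key = (name, tuple(sorted(params.items())))
        if key not in self._cache:
            self._cache[key] = self._loader(name, **params)
        self._active_key = key
        return self._cache[key]

    @property
    def active(self) -> Any:
        """Return the most recently selected model."""
        if self._active_key is None:
            raise RuntimeError("no model has been loaded yet")
        return self._cache[self._active_key]


def fake_loader(name: str, **params: Any) -> dict:
    """Stand-in loader (assumption) so the sketch runs without FunASR."""
    return {"name": name, "params": params}


registry = ModelRegistry(fake_loader)
asr = registry.load("paraformer-zh", device="cpu")  # example model name
vad = registry.load("fsmn-vad")  # switching means loading another name
```

Keying the cache on both name and parameters means that requesting the same model with different settings loads a separate instance, while repeated requests with identical settings reuse the one already in memory.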
Key Features
- Validates audio file integrity and format.
- Performs asynchronous speech-to-text transcription.
- Manages transcription tasks, allowing status queries and result retrieval.
- Detects voice activity, returning precise start and end timestamps of speech segments.
- Supports dynamic loading and configuration of ASR and VAD models.
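The asynchronous transcription and task management features follow a common submit, poll, and fetch pattern. The sketch below illustrates that pattern with `asyncio` under stated assumptions: `TranscriptionTaskManager`, `TaskRecord`, the status strings, and `fake_transcribe` are all hypothetical and stand in for whatever the server actually uses.

```python
import asyncio
import uuid
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Dict, Optional


@dataclass
class TaskRecord:
    status: str = "pending"  # pending -> running -> completed | failed
    result: Optional[Any] = None
    error: Optional[str] = None


class TranscriptionTaskManager:
    """Hypothetical in-memory task registry; not the server's actual code."""

    def __init__(self, transcribe: Callable[[str], Awaitable[Any]]) -> None:
        self._transcribe = transcribe
        self._records: Dict[str, TaskRecord] = {}
        self._jobs: Dict[str, "asyncio.Task[None]"] = {}

    def submit(self, audio_path: str) -> str:
        """Schedule a background job and return its ID immediately."""
        task_id = uuid.uuid4().hex
        self._records[task_id] = TaskRecord()
        self._jobs[task_id] = asyncio.create_task(self._run(task_id, audio_path))
        return task_id

    async def _run(self, task_id: str, audio_path: str) -> None:
        record = self._records[task_id]
        record.status = "running"
        try:
            record.result = await self._transcribe(audio_path)
            record.status = "completed"
        except Exception as exc:  # keep the failure for later inspection
            record.error = str(exc)
            record.status = "failed"

    def status(self, task_id: str) -> str:
        return self._records[task_id].status

    def result(self, task_id: str) -> Any:
        record = self._records[task_id]
        if record.status != "completed":
            raise RuntimeError(f"task is {record.status}, no result available")
        return record.result

    async def wait(self, task_id: str) -> None:
        await self._jobs[task_id]


async def fake_transcribe(audio_path: str) -> dict:
    """Stand-in for a real ASR call (assumption)."""
    await asyncio.sleep(0.01)
    return {"text": f"transcript of {audio_path}", "segments": []}


async def demo() -> dict:
    manager = TranscriptionTaskManager(fake_transcribe)
    task_id = manager.submit("meeting.wav")
    first_status = manager.status(task_id)  # job is scheduled, not yet started
    await manager.wait(task_id)
    return {
        "first": first_status,
        "final": manager.status(task_id),
        "result": manager.result(task_id),
    }
```

Running `asyncio.run(demo())` submits one job, reads its status before it starts, waits for it, and then fetches the result. Returning the task ID immediately is what lets a client start a long transcription and check back later instead of holding a request open.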
Use Cases
- Validating audio files before processing.
- Transcribing long audio files asynchronously.
- Integrating voice activity detection into speech processing pipelines.
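As a rough illustration of the first and last use cases, the sketch below validates a WAV file and then finds speech-like segments in it. It uses only the Python standard library and a simple energy threshold, which is a much cruder technique than the model-based VAD that FunASR provides. The function names, the 30 ms frame size, and the threshold value are assumptions chosen for the example.

```python
import math
import os
import struct
import tempfile
import wave
from typing import List, Tuple


def validate_wav(path: str) -> dict:
    """Return basic properties of a WAV file, or raise ValueError if unusable."""
    try:
        with wave.open(path, "rb") as wav:
            info = {
                "channels": wav.getnchannels(),
                "sample_width": wav.getsampwidth(),
                "sample_rate": wav.getframerate(),
                "frames": wav.getnframes(),
            }
    except (wave.Error, EOFError, OSError) as exc:
        raise ValueError(f"not a readable WAV file: {exc}") from exc
    if info["frames"] == 0:
        raise ValueError("file contains no audio frames")
    info["duration_s"] = info["frames"] / info["sample_rate"]
    return info


def detect_speech(
    path: str, frame_ms: int = 30, threshold: float = 500.0
) -> List[Tuple[float, float]]:
    """Energy-threshold VAD for 16-bit mono WAV; returns (start_s, end_s) pairs."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("this sketch handles 16-bit mono audio only")
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    samples = struct.unpack(f"<{len(raw) // 2}h", raw)
    frame_len = max(1, rate * frame_ms // 1000)
    segments: List[Tuple[float, float]] = []
    start = None
    for offset in range(0, len(samples), frame_len):
        frame = samples[offset:offset + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        t = offset / rate
        if rms >= threshold and start is None:
            start = t  # a loud frame opens a segment
        elif rms < threshold and start is not None:
            segments.append((start, t))  # a quiet frame closes it
            start = None
    if start is not None:
        segments.append((start, len(samples) / rate))
    return segments


def demo() -> Tuple[dict, List[Tuple[float, float]]]:
    """Write 0.3 s silence, 0.3 s of 440 Hz tone, 0.3 s silence, then analyze."""
    rate = 16000
    n = rate * 3 // 10
    tone = [int(10000 * math.sin(2 * math.pi * 440 * i / rate)) for i in range(n)]
    samples = [0] * n + tone + [0] * n
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)
    try:
        with wave.open(path, "wb") as wav:
            wav.setnchannels(1)
            wav.setsampwidth(2)
            wav.setframerate(rate)
            wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
        return validate_wav(path), detect_speech(path)
    finally:
        os.remove(path)
```

Calling `demo()` generates a synthetic file, so the example needs no external audio. Validating first means that malformed input is rejected with a clear error before any heavier processing starts.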