Kv Extractor
Extracts key-value pairs from arbitrary, noisy, or unstructured text using LLMs and provides type-safe output in JSON, YAML, or TOML formats.
About
This MCP server uses an LLM (GPT-4.1-mini) with pydantic-ai to extract key-value pairs from unstructured text, including noisy or complex input. Automatic key discovery identifies relevant fields without predefined keys, which suits diverse and unpredictable datasets. The multi-step pipeline combines multilingual preprocessing with spaCy NER, LLM-based type annotation and evaluation, and Pydantic validation, then returns type-safe output in JSON, YAML, or TOML so that downstream applications receive consistently typed data.
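To make the type annotation and validation steps concrete, here is a minimal sketch in plain Python. It is not the server's code: the server relies on Pydantic for validation, while this standard-library version only illustrates the idea. The type labels, the `COERCERS` table, and the `validate_pairs` helper are hypothetical names chosen for the example.

```python
from datetime import date

# Hypothetical type labels that an annotation step might assign to each key.
# These are assumptions for illustration, not the server's actual schema.
COERCERS = {
    "int": int,
    "float": float,
    "str": str,
    "bool": lambda s: {"true": True, "false": False}[s.strip().lower()],
    "date": date.fromisoformat,
}


def validate_pairs(raw, annotations):
    """Coerce each raw string value to its annotated type, collecting errors."""
    clean, errors = {}, {}
    for key, value in raw.items():
        label = annotations.get(key, "str")  # treat unlabeled keys as strings
        try:
            clean[key] = COERCERS[label](value)
        except (KeyError, ValueError) as exc:
            # KeyError: unknown label or unrecognized boolean text.
            # ValueError: the value does not parse as the annotated type.
            errors[key] = f"cannot read {value!r} as {label}: {exc}"
    return clean, errors


# Example input: extracted pairs arrive as strings, one of them malformed.
raw = {"name": "Aiko Tanaka", "age": "34", "joined": "2024-02-30", "active": "TRUE"}
annotations = {"name": "str", "age": "int", "joined": "date", "active": "bool"}
clean, errors = validate_pairs(raw, annotations)
```

Here `clean` holds the values that passed coercion, while the impossible date under `joined` is reported in `errors` instead of being passed downstream as an untyped string.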
Key Features
- Multi-lingual preprocessing with spaCy NER (Japanese, English, Chinese)
- Automatic key discovery from unstructured text
- Robust processing of noisy and complex inputs
- Type-safe output via Pydantic validation
- Multiple output formats: JSON, YAML, and TOML
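As an illustration of the multiple output formats, the sketch below renders one validated record as JSON, YAML, and TOML using only the standard library. It assumes a flat record of finite scalar values and quotes every key; the `render` function is a hypothetical helper, and a real implementation would more likely use dedicated YAML and TOML libraries.

```python
import json


def _scalar(value):
    """Render a str, bool, int, or float; JSON scalar syntax is also valid YAML and TOML."""
    if isinstance(value, (str, bool, int, float)):
        return json.dumps(value, ensure_ascii=False)
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def render(record, fmt):
    """Serialize a flat dict of scalars as 'json', 'yaml', or 'toml'."""
    if fmt == "json":
        return json.dumps(record, ensure_ascii=False, indent=2)
    # YAML separates key and value with ': ', TOML with ' = '.
    separators = {"yaml": ": ", "toml": " = "}
    if fmt not in separators:
        raise ValueError(f"unknown format: {fmt}")
    sep = separators[fmt]
    return "\n".join(
        f"{json.dumps(key, ensure_ascii=False)}{sep}{_scalar(value)}"
        for key, value in record.items()
    )


record = {"name": "Aiko Tanaka", "age": 34, "active": True}
as_json = render(record, "json")
as_yaml = render(record, "yaml")
as_toml = render(record, "toml")
```

Because all three renderings come from the same typed record, the choice of format affects only syntax, not the values or their types.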
Use Cases
- Extracting information from unstructured documents
- Automating data entry from noisy sources
- Parsing key-value pairs from multilingual text