Extracts key-value pairs from arbitrary, noisy, or unstructured text using LLMs and provides type-safe output in JSON, YAML, or TOML formats.
This MCP server uses an LLM (GPT-4.1-mini) and pydantic-ai to extract key-value pairs from unstructured text, even when the text is noisy or complex. Keys are discovered automatically, so no predefined key list is required, which suits diverse and unpredictable datasets. Extraction runs as a multi-step pipeline: multilingual preprocessing with spaCy NER, LLM-based type annotation and evaluation, and Pydantic validation. The result is type-safe, consistent output in JSON, YAML, or TOML for downstream applications.
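To make the validation stage concrete, here is a minimal sketch of how annotated pairs might be coerced to typed values before serialization. It is not the server's code: it uses only the Python standard library as a stand-in for Pydantic, and the type tags, the `COERCERS` table, and the `validate_pairs` and `to_json` helpers are hypothetical names chosen for illustration.

```python
import json
from datetime import date

# Hypothetical type tags that an LLM annotation step might emit.
COERCERS = {
    "int": int,
    "float": float,
    "str": str,
    "bool": lambda v: str(v).strip().lower() in {"true", "yes", "1"},
    "date": lambda v: date.fromisoformat(str(v).strip()),
}


def validate_pairs(annotated):
    """Coerce (key, raw_value, type_tag) triples.

    Returns (valid, errors): values that coerced cleanly, and a
    per-key error message for those that did not.
    """
    valid, errors = {}, {}
    for key, raw, tag in annotated:
        coerce = COERCERS.get(tag)
        if coerce is None:
            errors[key] = f"unknown type tag: {tag}"
            continue
        try:
            valid[key] = coerce(raw)
        except (TypeError, ValueError) as exc:
            errors[key] = str(exc)
    return valid, errors


def to_json(valid):
    """Serialize validated pairs; dates are written as ISO 8601 strings."""
    return json.dumps(
        valid,
        default=lambda o: o.isoformat(),
        ensure_ascii=False,
        sort_keys=True,
    )


# Illustrative input: the last pair carries a value that does not match its tag.
annotated = [
    ("name", "Tanaka", "str"),
    ("age", "42", "int"),
    ("joined", "2024-04-01", "date"),
    ("score", "n/a", "float"),
]
valid, errors = validate_pairs(annotated)
print(to_json(valid))
print(errors)
```

Collecting failures per key rather than raising on the first one keeps the clean pairs usable when part of a noisy input cannot be typed. A Pydantic model would provide the same guarantee with declared field types instead of a hand-written coercion table.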
Key Features
01 Multilingual preprocessing with spaCy NER (Japanese, English, Chinese)
02 Automatic key discovery from unstructured text
03 Robust processing of noisy and complex inputs
04 Type-safe output via Pydantic validation
05 Multiple output formats: JSON, YAML, and TOML
Use Cases
01 Extracting information from unstructured documents