01. Generates LLM-native Markdown/JSON for RAG pipelines
02. Normalizes unstructured web data into strict Pydantic models
03. 0 GitHub stars
04. Optimized for serverless deployment on Google Cloud Run
05. Advanced semantic extraction and HTML-to-Text parsing
06. High-performance Python 3.13+ runtime with FastAPI/Pydantic V2
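
To make the HTML-to-text and Markdown/JSON items above more concrete, here is a minimal sketch of that kind of pipeline. It is not the project's actual code: `PageRecord`, `MarkdownExtractor`, `html_to_record`, and `to_json` are hypothetical names, and a standard-library dataclass stands in for the strict Pydantic model so the example runs without third-party packages.

```python
import json
from dataclasses import asdict, dataclass
from html.parser import HTMLParser


@dataclass
class PageRecord:
    """Hypothetical output record; a stand-in for a strict Pydantic model."""
    title: str
    markdown: str


class MarkdownExtractor(HTMLParser):
    """Collects headings and paragraphs as Markdown, skipping script/style."""

    SKIP = {"script", "style"}
    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}
    BLOCKS = {"p", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.blocks = []        # finished Markdown blocks
        self.title = ""
        self._buf = []          # text fragments of the block being read
        self._prefix = ""       # Markdown prefix for the current block
        self._skip_depth = 0    # > 0 while inside <script> or <style>
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True
        elif tag in self.BLOCKS:
            self._buf = []
            self._prefix = self.HEADINGS.get(tag, "")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False
        elif tag in self.BLOCKS:
            # Collapse runs of whitespace left over from the HTML source.
            text = " ".join("".join(self._buf).split())
            if text:
                self.blocks.append(self._prefix + text)
            self._buf = []

    def handle_data(self, data):
        if self._skip_depth:
            return
        if self._in_title:
            self.title += data.strip()
        else:
            self._buf.append(data)


def html_to_record(html: str) -> PageRecord:
    """Parse raw HTML into a normalized record with a Markdown body."""
    parser = MarkdownExtractor()
    parser.feed(html)
    parser.close()
    return PageRecord(title=parser.title, markdown="\n\n".join(parser.blocks))


def to_json(record: PageRecord) -> str:
    """Serialize the record as JSON, e.g. for ingestion by a RAG indexer."""
    return json.dumps(asdict(record), ensure_ascii=False)
```

In a Pydantic V2 setup, the dataclass would presumably be replaced by a `BaseModel` subclass so that field types are validated at construction time; that substitution is an assumption about the design, not something the list above spells out.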