01 SSRF Prevention: Validates the URL scheme and rejects any request whose DNS resolution returns a private IP address.
02 Context Protection: Converts fetched content to structured Markdown, with character limits and selectable extraction modes (full, outline, metadata).
03 Binary Handling: Analyzes images with a server-side LLM and extracts text from PDF, DOCX, XLSX, and PPTX files.
04 Tenant Isolation: Authenticates requests with short-lived JWTs for deterministic access control.
05 High Performance: Fetches URLs in parallel through Playwright, with a cap on concurrent connections.
06 GitHub stars: 1
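The SSRF Prevention item describes two checks: a scheme allowlist and a rejection of private addresses after DNS resolution. The following is a minimal sketch of that idea, not the project's actual code. The names `check_url`, `ALLOWED_SCHEMES`, and the injectable `resolve` parameter are assumptions made for illustration, and "private" is interpreted broadly here as any address that is not globally routable.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

# Assumption: only plain web schemes are allowed; the real allowlist may differ.
ALLOWED_SCHEMES = {"http", "https"}


def default_resolve(host):
    """Resolve a hostname to a list of IP address strings using the system resolver."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    # sockaddr[0] is the address; drop any IPv6 zone suffix such as "%eth0".
    return [info[4][0].split("%")[0] for info in infos]


def check_url(url, resolve=None):
    """Return the resolved addresses if the URL looks safe to fetch.

    Raises ValueError for a disallowed scheme, a missing host, or any
    resolved address that is not globally routable (private, loopback,
    link-local, and similar ranges).
    """
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URL has no host")

    addresses = (resolve or default_resolve)(parts.hostname)
    if not addresses:
        raise ValueError("host did not resolve")

    for addr in addresses:
        # Reject if ANY resolved address is non-public, not just the first one.
        if not ipaddress.ip_address(addr).is_global:
            raise ValueError(f"non-public address: {addr}")
    return addresses
```

Checking the resolved addresses rather than the hostname string is what catches names that point at internal hosts. To stay safe against a second, different DNS answer, a caller would typically connect to one of the returned addresses instead of resolving the name again.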
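The High Performance item combines parallel fetching with a cap on concurrent connections. One common way to get both is a semaphore around each fetch, sketched below with `asyncio`. This is an illustration under stated assumptions: `fetch_all`, `fetch_one`, and `max_concurrent` are hypothetical names, and `fetch_one` stands in for whatever Playwright-based page fetch the tool actually performs.

```python
import asyncio


async def fetch_all(urls, fetch_one, max_concurrent=4):
    """Fetch every URL concurrently, with at most max_concurrent fetches in flight.

    fetch_one is any coroutine function taking a URL (hypothetical stand-in
    for a browser-based fetch). Results come back in the same order as urls.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(url):
        # Each task waits here until one of the max_concurrent slots is free.
        async with semaphore:
            return await fetch_one(url)

    # gather schedules all tasks at once and preserves input order in its result.
    return await asyncio.gather(*(guarded(url) for url in urls))


async def demo_fetch(url):
    """Placeholder fetch that only simulates latency."""
    await asyncio.sleep(0.01)
    return f"fetched {url}"


if __name__ == "__main__":
    pages = asyncio.run(fetch_all(["https://a.example", "https://b.example"], demo_fetch))
    print(pages)
```

The semaphore bounds resource use (browser pages, sockets) without serializing the work: all tasks are created up front, but only `max_concurrent` of them run their fetch at any moment.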