Processes academic papers, patents, and technical documents from various formats into a structured, searchable research knowledge base.
The ingest skill for ScholarAIO automates the pipeline that turns raw research materials (PDFs, Office documents, and patents) into AI-ready Markdown. It handles OCR via MinerU, automated metadata extraction, and deduplication by DOI or patent publication number. Whether the input is a single research paper or multi-volume conference proceedings, the skill takes documents from an unsorted inbox to a fully enriched research terminal, with dedicated workflows for academic theses, technical reports, and high-volume document sets.
Key Features
01 Customizable processing presets for ingestion, re-indexing, or full content enrichment
02 Specialized pipelines for academic papers, patents, and conference proceedings
03 Automated conversion of PDF, DOCX, XLSX, and PPTX files
04 Intelligent document segmentation and structural cleaning for complex proceedings
05 265 GitHub stars
06 Automated metadata extraction and deduplication by DOI and patent publication number
Use Cases
01 Converting technical Office documents into structured Markdown for RAG or AI analysis
02 Building a local searchable library from a folder of academic PDFs and conference papers
03 Standardizing a patent repository by extracting metadata and removing duplicates automatically
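
The identifier-based deduplication described above can be illustrated with a minimal sketch. It is not ScholarAIO's actual implementation: the `Record` class, the normalization rules, and every function name here are hypothetical, chosen only to show how keying on a normalized DOI or patent number might collapse duplicates.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Record:
    """Hypothetical metadata record; not ScholarAIO's real schema."""
    title: str
    doi: Optional[str] = None
    patent_number: Optional[str] = None


def normalize_doi(doi: str) -> str:
    # DOIs are case-insensitive; drop a leading resolver URL or "doi:" prefix.
    doi = doi.strip().lower()
    return re.sub(r"^(https?://(dx\.)?doi\.org/|doi:\s*)", "", doi)


def normalize_patent_number(number: str) -> str:
    # Assumed rule: ignore spaces and punctuation, compare in upper case.
    return re.sub(r"[^A-Za-z0-9]", "", number).upper()


def dedup_key(rec: Record) -> Optional[Tuple[str, str]]:
    # Prefer the DOI, fall back to the patent number, else no key.
    if rec.doi:
        return ("doi", normalize_doi(rec.doi))
    if rec.patent_number:
        return ("patent", normalize_patent_number(rec.patent_number))
    return None


def deduplicate(records: List[Record]) -> List[Record]:
    # Keep the first record seen for each key; records without any
    # identifier are always kept because they cannot be compared.
    seen = set()
    unique = []
    for rec in records:
        key = dedup_key(rec)
        if key is not None:
            if key in seen:
                continue
            seen.add(key)
        unique.append(rec)
    return unique
```

In this sketch, records that carry neither identifier are never treated as duplicates; a real pipeline would likely need a fallback such as title matching, which is left out here.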