About
Transform large volumes of scientific PDF literature into structured, validated databases ready for statistical analysis in Python, R, or SQLite. This skill provides a pipeline that covers the full extraction workflow: metadata organization, abstract filtering, and vision-based extraction using customizable schemas. It also includes automated JSON repair, enrichment via external databases such as GBIF and NCBI, and a validation framework that calculates precision and recall, so that outputs are reproducible and suitable for research and publication.
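To make the validation step concrete, here is a minimal sketch of how precision and recall could be computed by comparing extracted records against a hand-checked reference set. It is not the skill's actual implementation: the function name `precision_recall`, the `(paper_id, species)` record shape, and the sample data are all hypothetical and chosen only for illustration.

```python
def precision_recall(extracted, reference):
    """Compare extracted records with a manually verified reference set.

    Both arguments are iterables of hashable records. The record shape
    used below, (paper_id, species), is an assumption for this sketch.
    """
    extracted, reference = set(extracted), set(reference)
    true_positives = len(extracted & reference)
    # Precision: share of extracted records that are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of reference records that were recovered.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical sample data, not taken from any real extraction run.
extracted = [
    ("p1", "Apis mellifera"),
    ("p1", "Bombus terrestris"),
    ("p2", "Apis cerana"),
]
reference = [
    ("p1", "Apis mellifera"),
    ("p2", "Apis cerana"),
    ("p2", "Bombus impatiens"),
]

precision, recall = precision_recall(extracted, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Exact set matching is the simplest possible comparison. A real validation run would likely need field-level or fuzzy matching (for example, normalizing species names before comparing), which this sketch leaves out.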