Structured data extraction using LLM-based instructions or JSON schemas
Multi-page deep crawling with breadth-first and depth-first strategies
Robust handling of JavaScript-heavy SPAs using custom wait conditions
Pre-configured settings for common platforms like Obsidian Publish and slow sites
Advanced markdown extraction with boilerplate filtering (md-fit)
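
To make the breadth-first versus depth-first distinction in the list above concrete, here is a minimal sketch of the two visit orders over a small in-memory link graph. It is not this tool's API: `SITE`, `crawl_order`, and the `strategy` and `max_depth` parameters are hypothetical names chosen for illustration, and a real crawler would fetch pages and parse links instead of reading a dictionary.

```python
from collections import deque

# Hypothetical site map: each page lists the pages it links to.
SITE = {
    "/": ["/a", "/b"],
    "/a": ["/a/1", "/a/2"],
    "/b": ["/b/1"],
    "/a/1": ["/"],  # back-link, ignored because "/" was already seen
}


def crawl_order(links, start, strategy="bfs", max_depth=2):
    """Return pages in the order a depth-limited crawl would visit them."""
    if strategy not in ("bfs", "dfs"):
        raise ValueError("strategy must be 'bfs' or 'dfs'")

    frontier = deque([(start, 0)])
    seen = {start}
    order = []

    while frontier:
        # BFS takes the oldest queued page (queue); DFS takes the newest (stack).
        url, depth = frontier.popleft() if strategy == "bfs" else frontier.pop()
        order.append(url)

        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit

        children = links.get(url, [])
        if strategy == "dfs":
            # Push in reverse so the first listed link is explored first.
            children = list(reversed(children))

        for child in children:
            if child not in seen:
                seen.add(child)
                frontier.append((child, depth + 1))

    return order


if __name__ == "__main__":
    print(crawl_order(SITE, "/", "bfs"))
    print(crawl_order(SITE, "/", "dfs"))
```

Breadth-first visits every page at one depth before going deeper, which suits shallow site surveys. Depth-first follows one branch to the depth limit before backtracking, which can reach deeply nested pages sooner.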