Fetcher

Created by jae-jae

Fetches web page content using a Playwright headless browser, enabling JavaScript execution and intelligent content extraction.

About

Fetcher uses Playwright to fetch web page content and is designed for dynamic content and modern web applications. Unlike scrapers that only download raw HTML, it executes JavaScript, extracts the main content with a Readability algorithm, and returns the result as HTML or Markdown. It can fetch multiple URLs in parallel and blocks unnecessary resources to reduce bandwidth usage. Error handling and configurable parameters let you adjust how content is retrieved, for example by waiting for a page to load fully on sites that use anti-crawler mechanisms.

Key Features

  • Intelligent content extraction using Readability algorithm
  • 416 GitHub stars
  • JavaScript support via a Playwright headless browser
  • Resource optimization by blocking unnecessary elements
  • Parallel processing for fetching multiple URLs
  • Supports HTML and Markdown output formats
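
The JavaScript support and resource blocking listed above can be illustrated with a short sketch using Playwright's Python API. This is not Fetcher's own code: the function names, the default timeout, and the set of blocked resource types are assumptions made for illustration, and the Readability extraction and Markdown conversion steps are left out.

```python
# Assumed block list; the resource types Fetcher actually blocks may differ.
BLOCKED_RESOURCE_TYPES = frozenset({"image", "stylesheet", "font", "media"})


def should_block(resource_type: str) -> bool:
    """Return True for resource types that are not needed to read the page text."""
    return resource_type in BLOCKED_RESOURCE_TYPES


async def fetch_rendered_html(url: str, timeout_ms: int = 30_000) -> str:
    """Load url in headless Chromium and return the HTML after scripts have run."""
    # Imported lazily so should_block() can be used without Playwright installed.
    from playwright.async_api import async_playwright

    async def handle_route(route):
        # Abort requests for blocked resource types, let everything else through.
        if should_block(route.request.resource_type):
            await route.abort()
        else:
            await route.continue_()

    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        try:
            page = await browser.new_page()
            await page.route("**/*", handle_route)
            # "load" waits for the page's load event before reading the DOM.
            await page.goto(url, wait_until="load", timeout=timeout_ms)
            return await page.content()
        finally:
            await browser.close()
```

With Playwright and a Chromium build installed, `asyncio.run(fetch_rendered_html("https://example.com"))` would return the rendered HTML as a string. Scripts are deliberately not blocked, since the point is to let the page build its content before it is read.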

Use Cases

  • Automated web content extraction from dynamic websites
  • Working around anti-crawler mechanisms by waiting for pages to load completely
  • Batch retrieval of web page content for data analysis
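
Batch retrieval with parallel processing might look roughly like the sketch below, which caps the number of pages in flight and records a failure per URL instead of aborting the whole batch. The names `fetch_all`, `fetch_one`, and `max_concurrency` are hypothetical and are not taken from Fetcher; `fetch_one` stands in for whatever coroutine fetches a single page.

```python
import asyncio
from typing import Awaitable, Callable


async def fetch_all(
    urls: list[str],
    fetch_one: Callable[[str], Awaitable[str]],
    max_concurrency: int = 3,
) -> list[dict]:
    """Fetch every URL with at most max_concurrency fetches in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(url: str) -> dict:
        async with semaphore:
            try:
                return {"url": url, "ok": True, "content": await fetch_one(url)}
            except Exception as exc:
                # One failing page should not sink the rest of the batch.
                return {"url": url, "ok": False, "error": str(exc)}

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(guarded(url) for url in urls))
```

Each result is a dictionary keyed by `url`, so a data-analysis step can filter on `ok` and retry or log the failures separately.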