Gremlin Web Scraper

Scrapes and crawls visible text content from public web pages using a lightweight HTTP module designed for the VS Code Model Context Protocol.

About

Gremlin Web Scraper is a hybrid Python and JavaScript MCP runtime for extracting human-readable content from web pages. Built on Flask and BeautifulSoup, it integrates with VS Code's Model Context Protocol (MCP) system and runs scraping and crawling operations locally in response to simple JSON-based HTTP requests. It automates data extraction across various web content types and is part of the broader GremlinOS Runtime Suite.
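
To illustrate what extracting human-readable content can involve, here is a minimal sketch of visible-text extraction. It is not the project's code: it uses Python's standard-library `html.parser` as a stand-in for BeautifulSoup so that it runs without extra dependencies, and the names `VisibleTextParser` and `extract_visible_text` are hypothetical.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text nodes, skipping elements a browser never renders."""

    # Assumption: these tags are treated as non-visible for this sketch.
    SKIPPED = {"script", "style", "noscript", "template", "head"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data)

    def text(self):
        # Collapse all runs of whitespace into single spaces.
        return " ".join(" ".join(self._chunks).split())


def extract_visible_text(html):
    """Return the visible text of an HTML string (hypothetical helper)."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Calling `extract_visible_text` on a page's HTML returns a single whitespace-normalized string, which is the kind of payload a JSON response could carry.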

Key Features

  • Seamless integration with VS Code's Model Context Protocol (MCP).
  • Simple JSON-based API for single-page scraping and recursive crawling.
  • Graceful handling of slow or problematic websites with built-in timeouts and error management.
  • Support for cross-origin requests (CORS-ready).
  • Comprehensive activity logging to rotating files using `loguru`.
  • 1 GitHub star
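
The timeout and error-management feature listed above could look roughly like the sketch below. This is an illustration under stated assumptions, not the project's implementation: it uses the standard-library `urllib` rather than whatever HTTP client the tool ships with, and the function name `safe_fetch`, the 5-second default, and the shape of the returned dictionary are all hypothetical.

```python
import socket
import urllib.error
import urllib.request


def safe_fetch(url, timeout=5.0, opener=urllib.request.urlopen):
    """Fetch a URL and always return a JSON-friendly dict, never raise.

    `opener` is injectable so the error paths can be exercised offline.
    """
    try:
        with opener(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return {"ok": True, "url": url, "html": body}
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        return {"ok": False, "url": url, "error": f"HTTP {exc.code}"}
    except urllib.error.URLError as exc:
        # Connection-level failure; connect timeouts arrive wrapped here.
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return {"ok": False, "url": url, "error": "timeout"}
        return {"ok": False, "url": url, "error": f"unreachable: {exc.reason}"}
    except (socket.timeout, TimeoutError):
        # Read timeouts are raised directly rather than wrapped.
        return {"ok": False, "url": url, "error": "timeout"}
    except ValueError as exc:
        # urlopen rejects strings that are not valid URLs.
        return {"ok": False, "url": url, "error": f"bad url: {exc}"}
```

Returning a structured error instead of raising keeps a slow or broken site from crashing the caller, which then decides whether to retry, skip, or report the failure.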

Use Cases

  • Extracting readable text content from specific web pages.
  • Automating recursive crawls of same-domain links for content collection.
  • Integrating web scraping functionalities directly into VS Code-based development workflows.
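
The recursive same-domain crawl mentioned above can be sketched as a breadth-first traversal with a depth limit. This is a minimal illustration, not the project's code: the function name `crawl_same_domain`, the `max_depth` and `max_pages` parameters, and the injected `fetch` callable (which returns HTML or `None`) are assumptions made for the example, and link extraction uses the standard-library `html.parser` in place of BeautifulSoup.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl_same_domain(start_url, fetch, max_depth=2, max_pages=50):
    """Breadth-first crawl that only follows links on the start URL's host.

    `fetch(url)` must return the page HTML as a string, or None on failure.
    Returns a dict mapping each visited URL to its HTML, in visit order.
    """
    domain = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    pages = {}
    while queue and len(pages) < max_pages:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # skip pages that could not be fetched
        pages[url] = html
        if depth >= max_depth:
            continue  # do not follow links past the depth limit
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" suffixes.
            link, _fragment = urldefrag(urljoin(url, href))
            parsed = urlparse(link)
            if parsed.scheme not in ("http", "https"):
                continue
            if parsed.netloc != domain or link in seen:
                continue
            seen.add(link)
            queue.append((link, depth + 1))
    return pages
```

Passing the fetch function in as a parameter keeps the traversal logic testable without network access; in practice it would wrap a real HTTP call with a timeout.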