CleanWeb

Extracts and cleans web content, filtering ads and irrelevant elements, and converts it to a clean Markdown format for various applications.

Introduction

CleanWeb is a lightweight Model Context Protocol (MCP) server that extracts the core content of a web page and filters out advertisements, navigation, sidebars, and other distracting elements. Using Axios, Cheerio, and Readability, it converts complex HTML into clean, readable Markdown, with optional JSON output that includes metadata. Because it does not depend on a browser, it is simple and fast to deploy, which suits AI assistants and other applications that need optimized web content.
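
To make the clean-then-convert idea concrete, here is a toy, dependency-free sketch in TypeScript. It is not CleanWeb's code and does not use Axios, Cheerio, or Readability: the regular expressions are a deliberately simplified stand-in, and the names `NOISE_TAGS`, `stripNoise`, `toMarkdown`, and `cleanToMarkdown` are hypothetical.

```typescript
// Toy illustration of "clean, then convert".
// Assumption: regexes stand in for the real parsing and scoring libraries;
// they only handle simple, non-nested markup.

// Hypothetical list of elements treated as noise.
const NOISE_TAGS = ["script", "style", "nav", "aside", "header", "footer"];

// Remove each noise element together with everything inside it.
function stripNoise(html: string): string {
  let out = html;
  for (const tag of NOISE_TAGS) {
    const re = new RegExp(`<${tag}\\b[^>]*>[\\s\\S]*?</${tag}>`, "gi");
    out = out.replace(re, "");
  }
  return out;
}

// Convert a few common tags to Markdown and drop whatever markup remains.
function toMarkdown(html: string): string {
  return html
    .replace(/<h1\b[^>]*>([\s\S]*?)<\/h1>/gi, "# $1\n\n")
    .replace(/<h2\b[^>]*>([\s\S]*?)<\/h2>/gi, "## $1\n\n")
    .replace(/<p\b[^>]*>([\s\S]*?)<\/p>/gi, "$1\n\n")
    .replace(/<[^>]+>/g, "") // drop any remaining tags
    .replace(/\n{3,}/g, "\n\n") // collapse runs of blank lines
    .trim();
}

function cleanToMarkdown(html: string): string {
  return toMarkdown(stripNoise(html));
}
```

A real extractor would parse the document into a DOM and score candidate content blocks rather than pattern-match on tag names; the sketch only shows the order of the two stages.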

Key Features

  • Smart content extraction using Axios, Cheerio, and Readability algorithms
  • Intelligent content cleaning, automatically removing ads and distractions
  • Converts extracted HTML content to clean Markdown format
  • Supports multiple output formats: pure Markdown or JSON with metadata
  • Lightweight deployment with no browser dependencies
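
The two output formats mentioned above could look roughly like the following TypeScript sketch. The `ExtractedPage` shape, the `metadata` field names, and the `renderOutput` function are assumptions made for illustration; the listing does not document CleanWeb's actual JSON schema.

```typescript
// Hypothetical result shape; field names are assumptions, not CleanWeb's schema.
interface ExtractedPage {
  url: string;
  title: string;
  markdown: string;
}

type OutputFormat = "markdown" | "json";

// Return either the bare Markdown or a JSON document that wraps it with metadata.
function renderOutput(page: ExtractedPage, format: OutputFormat): string {
  if (format === "markdown") {
    return page.markdown;
  }
  return JSON.stringify(
    {
      metadata: {
        url: page.url,
        title: page.title,
        length: page.markdown.length, // character count of the Markdown body
      },
      content: page.markdown,
    },
    null,
    2,
  );
}
```

Keeping the Markdown body identical in both formats means a caller can switch formats without changing how it handles the content itself.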

Use Cases

  • Powering AI assistants with clean, summarized web content
  • Generating structured, ad-free data from web pages for analysis
  • Building lightweight web content scrapers and processors