Builds automated, AI-powered data collection agents that scrape, enrich, and store data from any public source for free.
This skill enables the creation of robust, production-ready data monitoring systems using a completely free infrastructure stack. It automates the entire lifecycle of data collection—from scraping public websites and APIs using BeautifulSoup or Playwright to enriching results with Gemini Flash for relevance scoring and summarization. By leveraging GitHub Actions for scheduling and a feedback-driven learning system, it creates an autonomous agent that improves its accuracy over time while syncing results directly to Notion, Google Sheets, or Supabase.
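The batch enrichment step with model fallback can be sketched as follows. This is a minimal illustration, not the skill's actual code: the model names, the `call_model` callable, and the result shape are all assumptions.

```python
"""Sketch of the enrichment step: group items into batches so one LLM
call scores many items, and fall back to a secondary model when the
free-tier primary is rate-limited. Model names and the call_model
helper are illustrative, not the skill's actual API."""

MODELS = ["gemini-2.0-flash", "gemini-1.5-flash"]  # primary, then fallback

def batched(items, size):
    """Yield fixed-size batches to stay within LLM rate limits."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def enrich(items, call_model, batch_size=10):
    """Score items in batches, trying each model in order."""
    results = []
    for batch in batched(items, batch_size):
        for model in MODELS:
            try:
                results.extend(call_model(model, batch))
                break
            except RuntimeError:  # e.g. a rate-limit error from the API
                continue
        else:
            # Every model failed: keep items unscored rather than drop them.
            results.extend({"item": it, "score": None} for it in batch)
    return results
```

Batching trades latency for throughput: one prompt carrying ten items consumes one request against the free-tier quota instead of ten.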
Key Features
1. Scheduled execution via GitHub Actions for 100% free hosting
2. Automated scraping for HTML, JS-rendered sites, APIs, and RSS feeds
3. AI enrichment using free-tier Gemini Flash with automatic model fallback
4. Batch processing architecture to maximize LLM rate limits and efficiency
5. Feedback-loop system that learns and improves scoring from user decisions
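The free scheduled hosting works by running the scraper as a cron-triggered GitHub Actions workflow. A minimal sketch, assuming the entry point is `scraper.py` and the Gemini key is stored as a repository secret (both file name and secret name are illustrative):

```yaml
name: scheduled-scrape
on:
  schedule:
    - cron: "0 6 * * *"   # daily at 06:00 UTC
  workflow_dispatch: {}    # allow manual runs from the Actions tab
jobs:
  scrape:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install -r requirements.txt
      - run: python scraper.py
        env:
          GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}
```

Note that GitHub runs scheduled workflows on a best-effort basis, so cron triggers can be delayed during peak load.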
Use Cases
1. Automated job board monitoring with relevance scoring based on a resume
2. Summarizing and classifying news feeds or GitHub repository updates
3. Product price tracking and competitive intelligence alerts
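The feedback-driven learning mentioned above can be illustrated with a toy keyword-weighting scheme: terms from items the user accepts gain weight, so similar items score higher later. The field names and scoring rule here are assumptions for illustration, not the skill's actual model.

```python
"""Toy sketch of a feedback loop: accepted items nudge their keywords'
weights up, rejected items nudge them down, and relevance is the sum
of learned weights over an item's keywords."""

from collections import Counter

def update_weights(weights, item_text, accepted, step=1):
    """Adjust keyword weights from one user accept/reject decision."""
    for word in set(item_text.lower().split()):
        weights[word] += step if accepted else -step
    return weights

def score(weights, item_text):
    """Relevance = sum of learned weights for the item's keywords."""
    return sum(weights[w] for w in set(item_text.lower().split()))

weights = Counter()
update_weights(weights, "remote python backend role", accepted=True)
update_weights(weights, "onsite sales role", accepted=False)
# After these two decisions, "python" items outscore "sales" items.
```

A production version would persist the weights between scheduled runs (e.g. in the same Notion or Supabase store as the results) so the agent keeps improving across executions.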