This project exposes Selenium WebDriver as an MCP (Model Context Protocol) server, so AI agents and large language models can drive a real browser. Through it, an agent can open a browser, navigate to pages, discover UI elements, click buttons, type into inputs, extract page text, and capture screenshots. It is aimed at browser automation systems, autonomous QA agents, and LLM-powered copilots, which need interactive browser control that traditional web automation tools do not offer to an AI client directly.
Key features
01 Comprehensive browser session and navigation controls
02 Page text extraction and screenshot capture
03 UI element discovery and accessibility-aware interaction
04 Built-in production-grade system prompt for AI agents
05 MCP-compatible Selenium automation server
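
As a rough illustration of how an MCP server can route an agent's tool calls to browser actions, here is a minimal sketch. It is not this project's code: the tool names (`navigate`, `click`, `type_text`), the `FakeBrowser` class, and `handle_request` are hypothetical, and an in-memory stand-in replaces Selenium so the example runs without a browser. The only parts taken from MCP itself are the JSON-RPC 2.0 `tools/call` request shape and the text-content result shape.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class FakeBrowser:
    """Hypothetical in-memory stand-in for a Selenium-driven browser."""
    url: str = "about:blank"
    fields: Dict[str, str] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)

    def navigate(self, url: str) -> str:
        self.url = url
        self.log.append(f"navigate {url}")
        return f"Opened {url}"

    def click(self, selector: str) -> str:
        self.log.append(f"click {selector}")
        return f"Clicked {selector}"

    def type_text(self, selector: str, text: str) -> str:
        self.fields[selector] = text
        self.log.append(f"type {selector}")
        return f"Typed into {selector}"


def build_tools(browser: FakeBrowser) -> Dict[str, Callable[..., str]]:
    # Map hypothetical tool names to the browser actions they trigger.
    return {
        "navigate": browser.navigate,
        "click": browser.click,
        "type_text": browser.type_text,
    }


def _error(request_id, code: int, message: str) -> str:
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id,
         "error": {"code": code, "message": message}}
    )


def handle_request(raw: str, tools: Dict[str, Callable[..., str]]) -> str:
    """Handle one JSON-RPC 2.0 message carrying an MCP-style tools/call."""
    request = json.loads(raw)
    request_id = request.get("id")
    if request.get("method") != "tools/call":
        return _error(request_id, -32601, "Method not found")
    params = request.get("params", {})
    handler = tools.get(params.get("name"))
    if handler is None:
        return _error(request_id, -32602, "Unknown tool")
    text = handler(**params.get("arguments", {}))
    # Tool results are returned as a list of content items.
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id,
         "result": {"content": [{"type": "text", "text": text}]}}
    )


if __name__ == "__main__":
    browser = FakeBrowser()
    tools = build_tools(browser)
    call = {
        "jsonrpc": "2.0", "id": 1, "method": "tools/call",
        "params": {"name": "navigate",
                   "arguments": {"url": "https://example.com"}},
    }
    print(handle_request(json.dumps(call), tools))
```

In a real server, each handler would wrap the corresponding Selenium WebDriver call, and the transport and tool listing would be handled by an MCP SDK rather than by hand.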