The Browser Agent Protocol (BAP) is a standardized, AI-optimized protocol that lets agents control web browsers. It communicates using JSON-RPC 2.0 over WebSocket and targets elements with semantic selectors, such as accessibility roles and text content, rather than brittle CSS selectors, which makes interactions more reliable and easier for AI models to interpret. BAP takes an accessibility-first approach, provides token-efficient observations for LLMs, and integrates with the Model Context Protocol (MCP). It also supports composite actions, stable element references, screenshot annotations for vision models, multi-context browser sessions, and optional human-in-the-loop approval workflows.
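To make the idea of a JSON-RPC 2.0 message carrying a semantic selector concrete, here is a minimal sketch in Python. Only the envelope fields (`jsonrpc`, `id`, `method`, `params`) come from the JSON-RPC 2.0 specification. The method name `action/click`, the selector layout, and the helper names `role_selector` and `make_request` are assumptions made for illustration and may not match BAP's actual schema.

```python
import json
from itertools import count

# Monotonic request ids; JSON-RPC 2.0 uses the id to match responses to requests.
_ids = count(1)


def role_selector(role, name=None):
    """Build a hypothetical semantic selector based on an accessibility role.

    The field names here are illustrative assumptions, not BAP's documented format.
    """
    selector = {"type": "role", "role": role}
    if name is not None:
        selector["name"] = name
    return selector


def make_request(method, params):
    """Wrap a method call in a standard JSON-RPC 2.0 request envelope."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}


# "action/click" is a placeholder method name chosen for this sketch.
request = make_request("action/click", {"selector": role_selector("button", "Submit")})

# This JSON string is what a client would send as a WebSocket text frame.
print(json.dumps(request, indent=2))
```

The selector describes the target by what it is (a button named "Submit") rather than where it sits in the DOM, which is why it tends to survive markup changes that would break a CSS selector.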
Key Features
01 AI-optimized observations with token efficiency and screenshot annotation
02 3 GitHub stars
03 Multi-context support for parallel, isolated browser sessions
04 Semantic selectors for AI comprehension (e.g., accessibility roles, text content)
05 MCP integration for agent communication
06 Composite actions for multi-step sequences
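As a rough illustration of a composite action, the sketch below bundles several steps into a single JSON-RPC 2.0 request, so the agent makes one round trip instead of one per step. The method name `action/sequence`, the `steps` parameter, the step fields, and the helpers `step` and `composite_request` are all hypothetical and chosen only for this example. BAP's real message shape may differ.

```python
import json


def step(action, selector, **extra):
    """Describe one step of a sequence (hypothetical shape, for illustration only)."""
    return {"action": action, "selector": selector, **extra}


def composite_request(request_id, steps):
    """Bundle several steps into one JSON-RPC 2.0 request.

    "action/sequence" and the "steps" parameter are assumed names, not taken from BAP.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "action/sequence",
        "params": {"steps": list(steps)},
    }


# A two-step login flow expressed with role-based selectors.
login = composite_request(1, [
    step("fill", {"type": "role", "role": "textbox", "name": "Email"},
         value="user@example.com"),
    step("click", {"type": "role", "role": "button", "name": "Sign in"}),
])

print(json.dumps(login, indent=2))
```

Grouping steps this way would also give a human-in-the-loop approval workflow a single unit to review before anything runs, though whether BAP works that way is not stated in the description above.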