The Chromium Desktop skill lets Claude interact with a live, visible web browser in a GUI desktop environment. Using system tools such as xdotool, wmctrl, and xclip, the skill can focus windows, navigate to URLs, send keyboard shortcuts, and perform coordinate-based mouse clicks. It is most useful for workflows where visual feedback is essential: it supports real-time screen capture for verification and enters multi-byte text by pasting from the clipboard. In this way it bridges the gap between headless automation and manual desktop browsing.
Key Features
01 Precise window management and focus control using wmctrl
02 Reliable multi-byte and Japanese text input using clipboard integration
03 Full browser navigation including tab management and page scrolling
04 Visual debugging and verification through automated scrot screenshots
05 Simulated human-like keyboard and mouse interactions via xdotool
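
The features listed above could be wired together roughly as follows. This is a minimal sketch, not the skill's actual implementation: the function names, the `run` parameter, and the default window title `"Chromium"` are all hypothetical, and it assumes wmctrl, xdotool, xclip, and scrot are installed and an X11 session is available.

```python
import subprocess


def focus_cmd(window_title):
    # wmctrl -a activates the first window whose title contains the string.
    return ["wmctrl", "-a", window_title]


def key_cmd(combo):
    # xdotool key sends a keystroke or chord such as "ctrl+l" or "Return".
    return ["xdotool", "key", combo]


def click_cmd(x, y, button=1):
    # xdotool commands can be chained: move the pointer, then click.
    return ["xdotool", "mousemove", str(x), str(y), "click", str(button)]


def screenshot_cmd(path):
    # scrot writes a capture of the whole screen to the given file.
    return ["scrot", path]


def paste_text(text, run=subprocess.run):
    # Typing multi-byte text key by key can be unreliable, so place it
    # on the clipboard with xclip and paste it with ctrl+v instead.
    run(["xclip", "-selection", "clipboard"],
        input=text.encode("utf-8"), check=True)
    run(key_cmd("ctrl+v"), check=True)


def navigate(url, window_title="Chromium", run=subprocess.run):
    # "Chromium" is an assumed window title; adjust it to the real one.
    run(focus_cmd(window_title), check=True)  # bring the browser forward
    run(key_cmd("ctrl+l"), check=True)        # focus the address bar
    paste_text(url, run=run)                  # paste the URL
    run(key_cmd("Return"), check=True)        # load the page
```

A call such as `navigate("https://example.com")` followed by `subprocess.run(screenshot_cmd("/tmp/page.png"), check=True)` would load a page and capture the screen for verification. Passing `run` as a parameter is a design choice made here so the command sequence can be inspected without a display.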