01 Computer Vision: Capture screenshots of screens, windows, or regions with built-in OCR for text recognition.
02 Template Matching: Find and interact with non-textual UI elements (icons, shapes) using image templates.
03 Local & Private: Executes 100% locally with no data or screenshots ever sent to external servers.
04 Input Simulation: Simulate mouse clicks, drags, scrolls, and keyboard typing for natural interaction.
05 Window Management: List open windows, find applications, and bring them into focus for targeted automation.
06 GitHub stars: 25
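
To make the template-matching feature above more concrete, here is a minimal sketch of the general idea. It is not this tool's implementation or API, which the listing does not describe. It assumes grayscale images stored as plain nested lists, and the names `find_template`, `center_of`, and `max_diff` are hypothetical. The sketch slides the template over the screenshot, scores each position by the sum of absolute pixel differences, and returns the best position if it is close enough.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # grayscale pixels, row-major (assumed representation)


def find_template(screen: Image, template: Image,
                  max_diff: int = 0) -> Optional[Tuple[int, int]]:
    """Return (x, y) of the top-left corner of the best match, or None.

    Each candidate position is scored by the sum of absolute differences
    (SAD) between template and screen pixels; lower is better. A match is
    reported only if the best score is at most max_diff.
    """
    sh, sw = len(screen), len(screen[0])
    th, tw = len(template), len(template[0])
    best_pos, best_score = None, None
    for y in range(sh - th + 1):
        for x in range(sw - tw + 1):
            score = sum(
                abs(screen[y + dy][x + dx] - template[dy][dx])
                for dy in range(th)
                for dx in range(tw)
            )
            if best_score is None or score < best_score:
                best_score, best_pos = score, (x, y)
    if best_score is not None and best_score <= max_diff:
        return best_pos
    return None


def center_of(pos: Tuple[int, int], template: Image) -> Tuple[int, int]:
    """Point inside the matched region, e.g. where a click could be aimed."""
    x, y = pos
    return (x + len(template[0]) // 2, y + len(template) // 2)


# Hypothetical 5x4 "screenshot" containing a 2x2 "icon".
SCREEN = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 9, 0],
    [0, 0, 0, 0, 0],
]
ICON = [[9, 8], [7, 9]]

match = find_template(SCREEN, ICON)
if match is not None:
    print("icon at", match, "click target", center_of(match, ICON))
```

Production tools typically use an optimized routine such as normalized cross-correlation rather than this brute-force loop, and a nonzero `max_diff` to tolerate anti-aliasing or scaling differences.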