UI-TARS Desktop
Created by bytedance
Control your computer using natural language with this GUI agent application.
About
UI-TARS Desktop is a GUI agent application powered by a Vision-Language Model that lets users control their computers with natural language commands. It supports screenshot capture and visual recognition, and provides precise mouse and keyboard control on Windows and macOS. With real-time feedback and fully local processing, UI-TARS Desktop offers a private and secure way to automate desktop tasks.
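The screenshot, recognition, and control steps described above can be pictured as a simple observe, decide, act loop. The following is only an illustrative sketch under stated assumptions, not UI-TARS Desktop's actual code: the names `Action`, `run_agent`, `capture`, `decide`, and `execute` are hypothetical, and the model and the input layer are replaced with stubs so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """A single GUI action (hypothetical schema, for illustration only)."""
    kind: str       # "click", "type", or "done"
    x: int = 0      # screen coordinates, used by "click"
    y: int = 0
    text: str = ""  # text to enter, used by "type"


def run_agent(
    instruction: str,
    capture: Callable[[], bytes],
    decide: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Repeat: take a screenshot, ask the model for the next action, perform it."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = capture()                            # observe the screen
        action = decide(instruction, screenshot, history)  # model picks a step
        if action.kind == "done":                          # model reports completion
            break
        execute(action)                                    # mouse/keyboard control
        history.append(action)
    return history


# Stubs standing in for a real screenshot grabber and a real model.
def fake_capture() -> bytes:
    return b""  # a real implementation would return image bytes


def scripted_decide(instruction: str, screenshot: bytes, history: List[Action]) -> Action:
    script = [
        Action("click", x=100, y=200),
        Action("type", text="hello"),
        Action("done"),
    ]
    return script[len(history)]


performed: List[Action] = []  # records what would have been sent to the OS
history = run_agent(
    "Type hello into the search box",
    fake_capture,
    scripted_decide,
    performed.append,
)
print([a.kind for a in history])
```

In a real agent, `capture` would grab the screen, `decide` would call the Vision-Language Model, and `execute` would drive the mouse and keyboard. The `max_steps` cap is a design choice in this sketch to keep a confused model from looping indefinitely.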
Key Features
- Natural language control powered by a Vision-Language Model
- Screenshot and visual recognition support
- Precise mouse and keyboard control
- Cross-platform support (Windows/macOS)
- Real-time feedback and status display
- 9,755 GitHub stars
Use Cases
- Interacting with web browsers using natural language
- Controlling desktop applications through voice or text commands
- Automating repetitive tasks on a computer