Introduction
Luma is a multi-model vision understanding server that adds image comprehension to AI assistants that lack native support for it. It runs as an MCP (Model Context Protocol) server, so AI tools such as Claude Desktop, Cline, and Claude Code can perform visual analysis tasks through it. Luma supports several models: GLM-4.5V for strong Chinese-language understanding, DeepSeek-OCR for robust, free OCR, and Qwen3-VL-Flash for fast, cost-effective processing. It handles scenarios including code screenshots, UI analysis, error diagnosis, and general OCR text recognition, which makes vision capabilities available to a wider range of AI applications.