Enables on-device and cloud language models to collaborate, reducing cloud costs while maintaining quality by reading long contexts locally.
Calista Research LLMs (formerly Minions) enables cost-efficient collaboration between on-device and cloud language models through a communication protocol in which small local models work alongside powerful cloud-based models. Because the local model reads the long context, only a small amount of data needs to be sent to the cloud, which reduces cloud processing and its associated costs. The repository includes a demonstration of the protocol and examples showing how to set up and use the system with various local and remote LLM clients.
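The division of labor described above can be sketched in plain Python. This is a hypothetical illustration, not the repository's actual API: the functions `local_extract`, `cloud_answer`, and `minion_protocol` are stand-ins invented here, with simple keyword matching playing the role of the small local model.

```python
# Hypothetical sketch of the local/cloud collaboration idea (assumed names,
# not the repository's real API): the small on-device model reads the full
# long context and extracts only the relevant snippets; the cloud model
# then answers from that short extract, so the expensive model never has
# to process the entire context.

def local_extract(context: str, question: str, max_chars: int = 200) -> str:
    """Stand-in for a small local model: keep only the context lines
    that share a keyword with the question."""
    keywords = {w.lower().strip(".,?") for w in question.split() if len(w) > 3}
    relevant = [
        line for line in context.splitlines()
        if keywords & {w.lower().strip(".,?") for w in line.split()}
    ]
    return "\n".join(relevant)[:max_chars]

def cloud_answer(extract: str, question: str) -> str:
    """Stand-in for the cloud model: it sees only the short extract."""
    return f"Answer derived from {len(extract)} chars of extracted context."

def minion_protocol(context: str, question: str) -> tuple[str, int]:
    """Run one round of the protocol; returns the answer and the number
    of context characters the cloud model never had to read."""
    extract = local_extract(context, question)
    answer = cloud_answer(extract, question)
    chars_saved = len(context) - len(extract)  # cloud cost scales with extract
    return answer, chars_saved
```

In this toy version the "savings" are just character counts, but the design choice it illustrates matches the description: cloud-side work scales with the extracted snippet rather than with the full long context.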