- Reduces cloud costs by processing long contexts locally
- Offers streaming output with custom callback functions
- Supports OpenAI and TogetherAI API keys for cloud models
- Compatible with Ollama and Tokasaurus for running local models
- Supports collaboration between on-device and cloud LLMs
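The streaming-with-callback feature can be sketched as follows. This is a minimal illustration of the pattern, not the project's actual API: the function and parameter names (`stream_generate`, `on_token`) are assumptions, and the chunk list stands in for the token stream a local or cloud model would yield.

```python
from typing import Callable, Iterable

def stream_generate(chunks: Iterable[str], on_token: Callable[[str], None]) -> str:
    """Feed each generated chunk to a user-supplied callback, then return the full text.

    `chunks` stands in for a model's token stream; names here are illustrative,
    not the project's actual API.
    """
    parts = []
    for chunk in chunks:
        on_token(chunk)      # custom callback, e.g. render tokens in a UI as they arrive
        parts.append(chunk)
    return "".join(parts)

# Usage: collect tokens via a callback while assembling the full response
received = []
full = stream_generate(["Hello", ", ", "world"], received.append)
```

Passing the callback as a parameter keeps the generation loop decoupled from how output is displayed, so the same loop can drive a terminal printer, a web socket, or a test harness.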