01. Multi-modal capabilities for Flux image generation and Deepgram audio processing.
02. 91 GitHub stars
03. AI Gateway configuration for request caching, logging, and granular cost tracking.
04. Support for 2025 models including Llama 4, GPT-OSS 120B, and Mistral 3.1 Vision.
05. Optimized RAG implementation using high-speed BGE and Gemma embeddings.
06. Native streaming integration to avoid Worker timeouts and improve time to first token (TTFT).
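
To make the RAG item above more concrete, here is a minimal sketch of the retrieval step: given a query embedding and a set of pre-embedded chunks, rank the chunks by cosine similarity and keep the top few. It assumes the embeddings have already been produced by some embedding model (BGE or Gemma, for example). The `Chunk` type, the `topK` helper, and the three-dimensional toy vectors are hypothetical and chosen only for illustration; they are not taken from the project itself.

```typescript
// Hypothetical shape for an indexed chunk; field names are illustrative.
interface Chunk {
  text: string;
  embedding: number[];
}

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query embedding, best first.
function topK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Toy 3-dimensional vectors standing in for real embeddings.
const chunks: Chunk[] = [
  { text: "Workers run at the edge", embedding: [1, 0, 0] },
  { text: "Embeddings map text to vectors", embedding: [0, 1, 0] },
  { text: "Streaming lowers time to first token", embedding: [0.9, 0.1, 0] },
];

const results = topK([1, 0, 0], chunks, 2);
console.log(results.map((c) => c.text));
```

In a real deployment the brute-force scan would likely be replaced by a vector index, and the retrieved chunk texts would be placed into the prompt sent to the generation model. The ranking logic stays the same either way.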