01 Intelligent model routing based on task complexity and token thresholds
02 Narrow retry logic that targets transient errors while failing fast on bad requests
03 Standardized pipeline composition for multi-model architectures
04 Native prompt caching implementation for reduced latency and cost
05 Immutable budget tracking using frozen dataclasses for reliable auditing
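
Three of the features listed above (threshold-based routing, narrow retries, and frozen-dataclass budget tracking) can be illustrated with a minimal Python sketch. Every name here is an assumption made for illustration and is not taken from the project: `route_model`, `call_with_retry`, `Budget`, the two exception classes, the model names, and the 4000-token threshold are all hypothetical.

```python
from __future__ import annotations

import time
from dataclasses import dataclass, replace
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical: a retryable failure such as a timeout or rate limit."""


class BadRequestError(Exception):
    """Hypothetical: a malformed request that retrying cannot fix."""


def route_model(prompt_tokens: int, complex_task: bool, threshold: int = 4000) -> str:
    """Pick a model tier; the model names and threshold are placeholders."""
    if complex_task or prompt_tokens > threshold:
        return "large-model"
    return "small-model"


def call_with_retry(fn: Callable[[], T], attempts: int = 3, base_delay: float = 0.5) -> T:
    """Retry only TransientError; any other exception propagates immediately."""
    for attempt in range(attempts):
        try:
            return fn()
        except TransientError:
            # Out of attempts: surface the last transient failure to the caller.
            if attempt == attempts - 1:
                raise
            # Exponential backoff between attempts.
            time.sleep(base_delay * (2 ** attempt))
    raise ValueError("attempts must be at least 1")


@dataclass(frozen=True)
class Budget:
    """Immutable spend record; each charge yields a new instance."""

    limit_usd: float
    spent_usd: float = 0.0

    def charge(self, cost_usd: float) -> Budget:
        """Return a new Budget; the original instance is never mutated."""
        new_spent = self.spent_usd + cost_usd
        if new_spent > self.limit_usd:
            raise ValueError("budget exceeded")
        return replace(self, spent_usd=new_spent)
```

Because `Budget` is frozen, each `charge` call produces a new value and leaves earlier ones intact, so a sequence of budgets can serve as an audit trail. `call_with_retry` catches only `TransientError`, so a `BadRequestError` reaches the caller on the first attempt instead of being retried.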