Supports 10 compression strategies with auto-selection and progressive delivery to save tokens.
Proactively surfaces relevant memories from a memtomem LTM server, gated by relevance and rate limits.
Provides observability through Langfuse tracing, RPS, latency percentiles, and per-tool metrics.
Implements response caching with configurable TTL and eviction, reapplying surfacing on cache hits.
Enables horizontal scaling through a `PendingStore` protocol, supporting InMemory or SQLite-shared backends.
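
The caching behavior listed above (configurable TTL, eviction, and surfacing reapplied on cache hits) can be illustrated with a minimal sketch. This is not the project's actual code: `TTLCache`, `cached_call`, `fetch`, and `surface` are hypothetical names, and least-recently-used eviction is an assumption, since the list does not say which eviction policy is used.

```python
import time
from collections import OrderedDict
from typing import Callable, Optional


class TTLCache:
    """Hypothetical response cache: TTL expiry plus LRU eviction (assumed policy)."""

    def __init__(self, ttl_seconds: float, max_entries: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._max = max_entries
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: OrderedDict = OrderedDict()  # key -> (expires_at, value)

    def get(self, key: str) -> Optional[str]:
        item = self._entries.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            # Entry outlived its TTL: drop it and report a miss.
            del self._entries[key]
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key: str, value: str) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)  # evict least recently used


def cached_call(cache: TTLCache, key: str,
                fetch: Callable[[], str],
                surface: Callable[[str], str]) -> str:
    """Return the response for `key`, calling `fetch` only on a cache miss."""
    response = cache.get(key)
    if response is None:
        response = fetch()
        cache.put(key, response)
    # The raw response is what gets cached; surfacing runs on hits and
    # misses alike, so a cached response still receives current memories.
    return surface(response)
```

Caching the raw response and applying `surface` afterwards is one way to keep surfaced memories fresh while still skipping the upstream call; whether the project stores responses before or after surfacing is not stated in the list.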