- Prompt compression using LLMLingua to reduce token usage by 50-80%
- Semantic and code-aware chunking for high-fidelity RAG
- Native Anthropic token counting for precise context tracking
- Integration patterns for vector databases like ChromaDB
- Priority-based context window management for long conversations
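To illustrate the last item, here is a minimal, self-contained sketch of priority-based context trimming. It uses only the standard library and a rough 4-characters-per-token estimate; the `Message` type, priority scheme, and heuristic are illustrative assumptions, not the library's actual API (in practice you would use exact token counts, e.g. from Anthropic's token-counting endpoint).

```python
from dataclasses import dataclass

@dataclass
class Message:
    role: str
    content: str
    priority: int  # higher = more important to keep (assumed convention)

def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    return max(1, len(text) // 4)

def fit_context(messages: list[Message], budget: int) -> list[Message]:
    """Evict lowest-priority messages (oldest first within a priority
    tier) until the estimated total fits the token budget, then return
    the survivors in their original chronological order."""
    indexed = list(enumerate(messages))
    # Eviction order: lowest priority first; among equals, oldest first.
    evict_order = sorted(indexed, key=lambda pair: (pair[1].priority, pair[0]))
    total = sum(estimate_tokens(m.content) for m in messages)
    dropped: set[int] = set()
    for idx, msg in evict_order:
        if total <= budget:
            break
        total -= estimate_tokens(msg.content)
        dropped.add(idx)
    return [m for i, m in indexed if i not in dropped]
```

With a tight budget, low-priority chit-chat is dropped while the system prompt and the latest high-priority question survive, keeping long conversations inside the model's context window.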