Cross-service signaling protocol for polite model unload requests
Automatic model unloading for idle services to proactively free up VRAM
Robust OOM exception handling with configurable retry logic and backoff delays
Implementation templates for PyTorch, Transformers, and shell scripts
Pre-configured optimization settings for Ollama, ComfyUI, and Flux
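
To illustrate the OOM retry item in the list above, here is a minimal Python sketch of retry logic with exponential backoff. It is not the project's actual template: the helper name `retry_on_oom` and all of its parameters are hypothetical, and it catches the built-in `MemoryError` by default so that it runs without a GPU. With PyTorch one would presumably pass `oom_errors=(torch.cuda.OutOfMemoryError,)` and `on_oom=torch.cuda.empty_cache` instead.

```python
import time
from typing import Callable, Optional, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_on_oom(
    fn: Callable[[], T],
    oom_errors: Tuple[Type[BaseException], ...] = (MemoryError,),
    max_retries: int = 3,
    base_delay: float = 1.0,
    backoff: float = 2.0,
    on_oom: Optional[Callable[[], None]] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on out-of-memory errors with exponential backoff.

    Hypothetical helper for illustration only; names and defaults are assumptions.
    """
    if max_retries < 0:
        raise ValueError("max_retries must be >= 0")
    delay = base_delay
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except oom_errors:
            if attempt == max_retries:
                raise  # retries exhausted: surface the original error
            if on_oom is not None:
                on_oom()  # e.g. free cached memory, or ask another service to unload
            sleep(delay)  # wait before the next attempt
            delay *= backoff  # grow the wait geometrically
    raise AssertionError("unreachable")
```

A call such as `retry_on_oom(run_inference, max_retries=5, base_delay=2.0)` would wait 2, 4, 8, 16, and 32 seconds between attempts before giving up, where `run_inference` stands for any zero-argument callable. The `on_oom` hook is also where a polite unload request to another service could be sent before retrying.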