Overview
The GPU monitoring skill provides specialized tools for tracking and optimizing hardware performance when running Ollama inference. It lets developers monitor NVIDIA and AMD GPU status, VRAM usage, and key inference metrics such as tokens per second and evaluation duration. With automated health checks and troubleshooting guides for common issues such as CPU fallback and Out-of-Memory (OOM) errors, the skill helps ensure that local AI models run at peak efficiency on the available hardware.
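As a minimal sketch of the throughput metric mentioned above: Ollama's `/api/generate` response includes `eval_count` (tokens generated) and `eval_duration` (in nanoseconds), from which tokens per second can be derived. The helper function name below is an illustration, not part of the skill's actual API.

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Derive tokens/s from Ollama's eval metrics.

    Ollama reports eval_duration in nanoseconds, so we scale by 1e9.
    """
    if eval_duration_ns <= 0:
        return 0.0
    return eval_count / eval_duration_ns * 1e9

# Example: 128 tokens generated in 2 seconds -> 64 tokens/s.
print(tokens_per_second(128, 2_000_000_000))
```

In practice these fields come from the JSON body of a non-streaming `/api/generate` call; a sustained drop in this number is often the first sign that inference has fallen back to the CPU.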