Complete reference for 170+ non-deprecated llama.cpp API functions
4 GitHub stars
Detailed implementation patterns for text generation, embeddings, and chat
Optimized workflows for GGUF model loading and GPU/CPU performance tuning
Guidance for KV cache management, state saving/loading, and LoRA adapters
Documentation for over 25 sampling strategies including adaptive-p and XTC