01 Advanced prompt engineering techniques including ReAct, CoT, and DSPy
02 Strict-mode function calling and parallel tool execution patterns
03 Real-time streaming via SSE with backpressure and partial JSON handling
04 Local inference optimization for Ollama and Apple Silicon hardware
05 116 GitHub stars
06 Fine-tuning workflows using LoRA/QLoRA and synthetic data generation