01OpenAI API compatible server for local LLMs
02Python API for easy integration into Python applications
03CLI for LLM prompting, accuracy measurement, benchmarking, and profiling
04NPU acceleration
05Supports PyTorch, ONNX, and GGUF frameworks
0636 GitHub stars