01 1 GitHub star
02 Comprehensive performance metrics (tokens/sec, TTFT, memory usage)
03 Quality evaluation across 6 categories (reasoning, coding, math, etc.)
04 Global score (0-100) and verdict based on hardware fit and quality
05 One-click sharing of results to a public leaderboard with rank
06 Integration with CLI, MCP server, and IDE plugins (Claude Code, Cursor)
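
As a rough illustration of two of the performance metrics named in the list, the sketch below shows one common way to derive time to first token (TTFT) and tokens/sec from a streaming token source. It is not the tool's actual implementation: `measure_stream`, its `clock` parameter, and the returned keys are hypothetical names chosen for this example, and memory usage is not covered.

```python
import time
from typing import Callable, Iterable, Optional


def measure_stream(tokens: Iterable[str],
                   clock: Callable[[], float] = time.perf_counter) -> dict:
    """Consume a token stream and report TTFT and decode throughput.

    `tokens` is any iterable that yields tokens as they are generated
    (a hypothetical stand-in for a model's streaming output).
    """
    start = clock()
    first: Optional[float] = None
    last = start
    count = 0
    for _ in tokens:
        last = clock()
        if first is None:
            first = last  # moment the first token arrived
        count += 1

    if first is None:
        # Empty stream: nothing to measure.
        return {"tokens": 0, "ttft_s": None, "tokens_per_sec": None}

    ttft = first - start
    decode_time = last - first
    # Throughput over the decode phase only: tokens after the first,
    # divided by the time elapsed since the first token.
    tps = (count - 1) / decode_time if decode_time > 0 else None
    return {"tokens": count, "ttft_s": ttft, "tokens_per_sec": tps}


if __name__ == "__main__":
    def slow_tokens(n: int, delay: float = 0.01):
        # Simulated generator standing in for a real model stream.
        for i in range(n):
            time.sleep(delay)
            yield f"tok{i}"

    print(measure_stream(slow_tokens(20)))
```

Excluding the first token from the throughput figure keeps prompt-processing latency (captured by TTFT) separate from steady-state generation speed. Other tools divide the total token count by total wall time instead, so numbers from different sources are only comparable when the definition matches.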