Introduction
This skill provides an interface to the Taiga evaluation platform API, tailored for Claude Code. It lets developers query job statuses, aggregate problem pass rates, and retrieve detailed execution transcripts directly through Python-based API requests. It also enforces two best practices: submissions use the Claude 4.5 Opus model, and requests authenticate with cookies. Together, these streamline the workflow for evaluating AI model performance across environments and problem sets, and help keep data collection and analysis reliable.
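As a rough illustration of the workflow described above, the sketch below builds a cookie-authenticated request and aggregates pass rates per problem. It is a minimal sketch under stated assumptions, not the skill's actual implementation: the base URL, the `session` cookie name, the helper names `build_request` and `aggregate_pass_rates`, and the `problem_id` and `passed` record fields are all hypothetical, since this section does not specify the Taiga API's endpoints or response schema.

```python
from collections import defaultdict
from urllib.request import Request

# Hypothetical placeholder; the real Taiga API address is not given here.
BASE_URL = "https://taiga.example.invalid/api"


def build_request(path, session_cookie):
    """Build (but do not send) a GET request that carries a session cookie."""
    # The cookie name "session" is an assumption made for illustration.
    return Request(
        f"{BASE_URL}{path}",
        headers={
            "Cookie": f"session={session_cookie}",
            "Accept": "application/json",
        },
        method="GET",
    )


def aggregate_pass_rates(runs):
    """Return {problem_id: fraction of that problem's runs that passed}."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for run in runs:
        # Field names "problem_id" and "passed" are assumed, not documented.
        problem_id = run["problem_id"]
        totals[problem_id] += 1
        if run["passed"]:
            passes[problem_id] += 1
    return {pid: passes[pid] / totals[pid] for pid in totals}


# Hand-written sample records, not real Taiga output.
sample_runs = [
    {"problem_id": "p1", "passed": True},
    {"problem_id": "p1", "passed": False},
    {"problem_id": "p2", "passed": True},
]
rates = aggregate_pass_rates(sample_runs)
```

Keeping request construction separate from aggregation means the pass-rate logic can be checked against hand-written records without network access or valid credentials.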