Introduction
This skill equips Claude to act as a reinforcement learning assistant specializing in the Stable Baselines3 (SB3) library. It provides guidance on designing custom Gymnasium environments, choosing an RL algorithm such as PPO, SAC, or DQN, and managing training workflows. Whether you are building autonomous agents for robotics or optimizing decision-making processes, the skill steers implementations toward best practices for model persistence, performance monitoring, and parallelized training via vectorized environments.