01 Scalable distributed training configurations for models up to 671B parameters
02 Specialized workflows for math reasoning and vision-language model training
03 Support for advanced RL algorithms including GRPO, PPO, RLOO, and REINFORCE++
04 3,983 GitHub stars
05 Flexible backend integration with FSDP, Megatron-LM, vLLM, and SGLang
06 Support for multi-turn rollout and agentic tool-calling reinforcement learning