Integrated SFT regularization to maintain model capabilities and prevent forgetting
Customizable loss functions including sigmoid and hinge with adjustable target margins
Memory-efficient workflows optimized for 7B, 8B, and 70B parameter models
Reference-free preference optimization requiring no baseline model during training
Superior alignment performance with +6.4 point gains on AlpacaEval 2.0 over DPO
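The sigmoid and hinge losses with a target margin can be sketched as follows. This is a minimal illustration, not the library's actual API: the function name `preference_loss` and the parameter defaults are assumptions, and the inputs are taken to be length-normalized average log-probabilities of the chosen and rejected responses under the policy alone, consistent with the reference-free setup above.

```python
import math

def preference_loss(logp_chosen: float, logp_rejected: float,
                    beta: float = 2.0, gamma: float = 0.5,
                    loss_type: str = "sigmoid") -> float:
    """Illustrative reference-free preference loss (names/defaults assumed).

    logp_chosen / logp_rejected: length-normalized average log-probabilities
    of the preferred and dispreferred responses under the policy model only
    (no reference/baseline model is needed).
    beta scales the reward margin; gamma is the target margin.
    """
    # Scaled reward difference minus the target margin.
    margin = beta * logp_chosen - beta * logp_rejected - gamma
    if loss_type == "sigmoid":
        # -log(sigmoid(margin)), written via log1p for numerical stability.
        return math.log1p(math.exp(-margin))
    elif loss_type == "hinge":
        # Penalize only when the margin falls below 1.
        return max(0.0, 1.0 - margin)
    raise ValueError(f"unknown loss_type: {loss_type}")
```

Both variants push the policy to assign a higher (length-normalized) likelihood to the chosen response than to the rejected one by at least the target margin, which is what the adjustable-margin knob above controls.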