Multi-component reward weight rebalancing for PPO training
Automated calibration of slippage and transaction cost penalties
Introduction of trading incentives to balance asymmetric payoffs
Directional threshold optimization to ensure profit exceeds costs
Dynamic exploration bonus scaling based on model uncertainty
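
As a rough illustration of how the reward weighting, cost penalty, trading incentive, and directional threshold items above might fit together, here is a minimal Python sketch. It is not taken from the source: the names `RewardWeights`, `step_reward`, and `min_profitable_move`, the default weight values, and the assumption that fees and slippage are proportional to traded notional are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class RewardWeights:
    # Hypothetical weights; none of these names or values come from the source.
    pnl: float = 1.0
    transaction_cost: float = 1.0
    slippage: float = 1.0
    trade_incentive: float = 0.01


def min_profitable_move(fee_rate: float, slippage_rate: float) -> float:
    """Smallest fractional price move that covers a round trip (entry plus exit).

    Assumes fees and slippage are both proportional to traded notional.
    """
    return 2.0 * (fee_rate + slippage_rate)


def step_reward(
    pnl: float,
    notional_traded: float,
    fee_rate: float,
    slippage_rate: float,
    w: RewardWeights,
) -> float:
    """Weighted sum of reward components for a single environment step."""
    cost = fee_rate * notional_traded        # transaction cost penalty term
    slip = slippage_rate * notional_traded   # slippage penalty term
    traded = 1.0 if notional_traded > 0 else 0.0  # flat bonus for acting at all
    return (
        w.pnl * pnl
        - w.transaction_cost * cost
        - w.slippage * slip
        + w.trade_incentive * traded
    )
```

Under these assumptions, rebalancing amounts to changing the fields of `RewardWeights` between training runs, and a directional signal would only be acted on when the predicted move exceeds `min_profitable_move(...)`. The exploration bonus item is not covered here, since the source does not say how uncertainty is measured.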