Rethinking Entropy Interventions in RLVR: An Entropy Change Perspective
Zhezheng Hao*, Hong Wang*, Haoyang Liu, Jian Luo, Jiarui Yu, Hande Dong, Qiang Lin, Can Wang, and Jiawei Chen
In ACL main (Oral, top 5% of accepted papers), 2026
We rethink entropy interventions in reinforcement learning with verifiable rewards (RLVR), proposing an entropy change perspective for better training stability.