Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning • Paper • 2506.01939 • Published 24 days ago • 161
BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs • Paper • 2505.13529 • Published May 18 • 11
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines • Paper • 2502.14739 • Published Feb 20 • 104
The Lessons of Developing Process Reward Models in Mathematical Reasoning • Paper • 2501.07301 • Published Jan 13 • 99
ProcessBench: Identifying Process Errors in Mathematical Reasoning • Paper • 2412.06559 • Published Dec 9, 2024 • 84
Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks • Paper • 2407.02855 • Published Jul 3, 2024 • 13
Prompt-Driven LLM Safeguarding via Directed Representation Optimization • Paper • 2401.18018 • Published Jan 31, 2024 • 1
CASE: Aligning Coarse-to-Fine Cognition and Affection for Empathetic Response Generation • Paper • 2208.08845 • Published Aug 18, 2022
PsyQA: A Chinese Dataset for Generating Long Counseling Text for Mental Health Support • Paper • 2106.01702 • Published Jun 3, 2021
On Large Language Models' Selection Bias in Multi-Choice Questions • Paper • 2309.03882 • Published Sep 7, 2023
Exploring Prompt-based Few-shot Learning for Grounded Dialog Generation • Paper • 2109.06513 • Published Sep 14, 2021
Click: Controllable Text Generation with Sequence Likelihood Contrastive Learning • Paper • 2306.03350 • Published Jun 6, 2023
EVA: An Open-Domain Chinese Dialogue System with Large-Scale Generative Pre-Training • Paper • 2108.01547 • Published Aug 3, 2021
EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with Large-Scale Pre-Training • Paper • 2203.09313 • Published Mar 17, 2022