- Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs (arXiv:2506.14245, published June 2025)
- Think Only When You Need with Large Hybrid-Reasoning Models (arXiv:2505.14631, published May 20, 2025)
- TimeHC-RL: Temporal-aware Hierarchical Cognitive Reinforcement Learning for Enhancing LLMs' Social Intelligence (arXiv:2505.24500, published May 2025)
- Closed-Form Bounds for DP-SGD against Record-level Inference (arXiv:2402.14397, published Feb 22, 2024)
- Analyzing Leakage of Personally Identifiable Information in Language Models (arXiv:2302.00539, published Feb 1, 2023)
- How does Alignment Enhance LLMs' Multilingual Capabilities? A Language Neurons Perspective (arXiv:2505.21505, published May 2025)
- ViewSpatial-Bench: Evaluating Multi-perspective Spatial Localization in Vision-Language Models (arXiv:2505.21500, published May 2025)
- Mitigating Spurious Correlations in Multi-modal Models during Fine-tuning (arXiv:2304.03916, published Apr 8, 2023)
- Diversity of Thought Improves Reasoning Abilities of Large Language Models (arXiv:2310.07088, published Oct 11, 2023)
- Elephants Never Forget: Memorization and Learning of Tabular Data in Large Language Models (arXiv:2404.06209, published Apr 9, 2024)
- Eureka: Evaluating and Understanding Large Foundation Models (arXiv:2409.10566, published Sep 13, 2024)
- BENCHAGENTS: Automated Benchmark Creation with Agent Interaction (arXiv:2410.22584, published Oct 29, 2024)
- Let LLMs Break Free from Overthinking via Self-Braking Tuning (arXiv:2505.14604, published May 20, 2025)
- Mind the Gap: Bridging Thought Leap for Improved Chain-of-Thought Tuning (arXiv:2505.14684, published May 20, 2025)
- VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models (arXiv:2505.15801, published May 21, 2025)