A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning Paper • 2510.15444 • Published Oct 17 • 145
FormalML: A Benchmark for Evaluating Formal Subgoal Completion in Machine Learning Theory Paper • 2510.02335 • Published Sep 26 • 2
LawGPT: Knowledge-Guided Data Generation and Its Application to Legal LLM Paper • 2502.06572 • Published Feb 10 • 1
ChinaTravel: A Real-World Benchmark for Language Agents in Chinese Travel Planning Paper • 2412.13682 • Published Dec 18, 2024 • 7
Step Back to Leap Forward: Self-Backtracking for Boosting Reasoning of Language Models Paper • 2502.04404 • Published Feb 6 • 25
LawGPT: A Chinese Legal Knowledge-Enhanced Large Language Model Paper • 2406.04614 • Published Jun 7, 2024 • 2