ReasoningTrap/MATH500
  Fine-grained evaluation of Large Reasoning Models that fail because of reasoning rigidity.
  ConditionedMath (AIME & MATH500) · PuzzleTrivial · Zero-shot pipelines
Current RL-tuned Reasoning LLMs excel at producing answers but often ignore explicit user constraints.
ReasoningTrap surfaces these failure modes with carefully crafted, conditioned problems.