V2P-Bench: Evaluating Video-Language Understanding with Visual Prompts for Better Human-Model Interaction • Paper • 2503.17736 • Published Mar 22 • 1
Agentic Jigsaw Interaction Learning for Enhancing Visual Perception and Reasoning in Vision-Language Models • Paper • 2510.01304 • Published Oct 1 • 10
CRITICTOOL: Evaluating Self-Critique Capabilities of Large Language Models in Tool-Calling Error Scenarios • Paper • 2506.13977 • Published Jun 11 • 10
VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning • Paper • 2505.22019 • Published May 28 • 11
VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning • Paper • 2504.07956 • Published Apr 10 • 47