ParGo: Bridging Vision-Language with Partial and Global Views Paper • 2408.12928 • Published Aug 23, 2024
WildDoc: How Far Are We from Achieving Comprehensive and Robust Document Understanding in the Wild? Paper • 2505.11015 • Published May 16
Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting Paper • 2505.14059 • Published May 20 • 2
Visionary-R1: Mitigating Shortcuts in Visual Reasoning with Reinforcement Learning Paper • 2505.14677 • Published May 20 • 15