Article Gaia2 Leaderboard Update: New Models and New Observations By meta-agents-research-environments and 3 others • Oct 2 • 10
Article ViDoRe V3: a comprehensive evaluation of retrieval for enterprise use-cases By QuentinJG and 4 others • 2 days ago • 33
Article ModernVBERT: Towards Smaller Visual Document Retrievers By paultltc and 4 others • Oct 3 • 43
BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent Paper • 2508.06600 • Published Aug 8 • 40
MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents Paper • 2508.13186 • Published Aug 14 • 18