MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers Paper • 2508.20453 • Published Aug 28 • 63
view article Article Mitigating False Negatives in Multiple Negatives Ranking Loss for Retriever Training By dragonkue • May 25 • 20
EXAONE-Deep Collection EXAONE reasoning model series of 2.4B, 7.8B, and 32B, optimized for reasoning tasks including math and coding • 10 items • Updated Jul 7 • 94
R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization Paper • 2503.10615 • Published Mar 13 • 17
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 129
EXAONE-3.5 Collection EXAONE 3.5 language model series including instruction-tuned models of 2.4B, 7.8B, and 32B • 11 items • Updated Jul 7 • 119
LLaVA-OneVision Collection a model good at arbitrary types of visual input • 17 items • Updated Sep 17 • 31