World in a Frame: Understanding Culture Mixing as a New Challenge for Vision-Language Models Paper • 2511.22787 • Published 11 days ago • 8
Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models Paper • 2511.18890 • Published 15 days ago • 29
GR-RL: Going Dexterous and Precise for Long-Horizon Robotic Manipulation Paper • 2512.01801 • Published 7 days ago • 23
CaptionQA: Is Your Caption as Useful as the Image Itself? Paper • 2511.21025 • Published 13 days ago • 25
VisPlay: Self-Evolving Vision-Language Models from Images Paper • 2511.15661 • Published 19 days ago • 42
WMPO: World Model-based Policy Optimization for Vision-Language-Action Models Paper • 2511.09515 • Published 26 days ago • 17
The Path Not Taken: RLVR Provably Learns Off the Principals Paper • 2511.08567 • Published 27 days ago • 31
A Vision-Language-Action-Critic Model for Robotic Real-World Reinforcement Learning Paper • 2509.15937 • Published Sep 19 • 20
F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions Paper • 2509.06951 • Published Sep 8 • 31
A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers Paper • 2508.21148 • Published Aug 28 • 140
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published Aug 25 • 208
From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels Article • Aug 18 • 88
Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning Paper • 2508.08221 • Published Aug 11 • 49