Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models Paper • 2508.00819 • Published 5 days ago • 51
SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction Paper • 2507.15852 • Published 16 days ago • 37
Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • 29 days ago • 611
Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs Paper • 2506.19290 • Published Jun 24 • 50
ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing Paper • 2506.19848 • Published Jun 24 • 26
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs Paper • 2506.14429 • Published Jun 17 • 45
VideoRoPE: What Makes for Good Video Rotary Position Embeddi Collection A storage repo for VideoRoPE. • 6 items • Updated Jun 17 • 3
Article Introducing Pivotal Token Search (PTS): Targeting Critical Decision Points in LLM Training By codelion • May 17 • 5
Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models Paper • 2406.13542 • Published Jun 19, 2024 • 17
MM-IFEngine Collection [ICCV 2025] Official Implementation of "MM-IFEngine: Towards Multimodal Instruction Following" • 2 items • Updated 21 days ago • 5
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published Apr 11 • 129
MM-IFEngine: Towards Multimodal Instruction Following Paper • 2504.07957 • Published Apr 10 • 34
Inference-Time Scaling for Generalist Reward Modeling Paper • 2504.02495 • Published Apr 3 • 57
Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback Paper • 2503.22230 • Published Mar 28 • 46
Unified Reward Model for Multimodal Understanding and Generation Paper • 2503.05236 • Published Mar 7 • 124