-
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
Paper • 2508.07629 • Published • 41 -
Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning
Paper • 2508.07101 • Published • 13 -
Compressing Chain-of-Thought in LLMs via Step Entropy
Paper • 2508.03346 • Published • 7 -
Train Long, Think Short: Curriculum Learning for Efficient Reasoning
Paper • 2508.08940 • Published • 26
Collections
Discover the best community collections!
Collections including paper arXiv:2510.25741
-
The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain
Paper • 2509.26507 • Published • 526 -
Less is More: Recursive Reasoning with Tiny Networks
Paper • 2510.04871 • Published • 468 -
A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code
Paper • 2508.18106 • Published • 342 -
Intern-S1: A Scientific Multimodal Foundation Model
Paper • 2508.15763 • Published • 255
-
Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models
Paper • 2508.10751 • Published • 28 -
Reinforcement Pre-Training
Paper • 2506.08007 • Published • 262 -
MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers
Paper • 2508.14704 • Published • 42 -
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
Paper • 2508.16153 • Published • 154
-
Contrastive Learning for Many-to-many Multilingual Neural Machine Translation
Paper • 2105.09501 • Published -
Cross-modal Contrastive Learning for Speech Translation
Paper • 2205.02444 • Published -
ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs
Paper • 2210.03052 • Published -
Diffusion Glancing Transformer for Parallel Sequence to Sequence Learning
Paper • 2212.10240 • Published • 1
-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 216 • 98 -
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 36 -
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 97 -
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 88
-
Emu3.5: Native Multimodal Models are World Learners
Paper • 2510.26583 • Published • 102 -
RECALL: REpresentation-aligned Catastrophic-forgetting ALLeviation via Hierarchical Model Merging
Paper • 2510.20479 • Published • 10 -
A Definition of AGI
Paper • 2510.18212 • Published • 33 -
Video-As-Prompt: Unified Semantic Control for Video Generation
Paper • 2510.20888 • Published • 44
-
Language Models Model Language
Paper • 2510.12766 • Published • 23 -
The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain
Paper • 2509.26507 • Published • 526 -
Less is More: Recursive Reasoning with Tiny Networks
Paper • 2510.04871 • Published • 468 -
Scaling Latent Reasoning via Looped Language Models
Paper • 2510.25741 • Published • 202
-
GUI-G^2: Gaussian Reward Modeling for GUI Grounding
Paper • 2507.15846 • Published • 132 -
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent
Paper • 2508.05748 • Published • 137 -
Mobile-Agent-v3: Foundamental Agents for GUI Automation
Paper • 2508.15144 • Published • 64 -
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
Paper • 2508.16153 • Published • 154
-
Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models
Paper • 2504.02821 • Published • 9 -
TimeChat-Online: 80% Visual Tokens are Naturally Redundant in Streaming Videos
Paper • 2504.17343 • Published • 13 -
ViSMaP: Unsupervised Hour-long Video Summarisation by Meta-Prompting
Paper • 2504.15921 • Published • 7 -
Causal-Copilot: An Autonomous Causal Analysis Agent
Paper • 2504.13263 • Published • 7
-
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
Paper • 2508.07629 • Published • 41 -
Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning
Paper • 2508.07101 • Published • 13 -
Compressing Chain-of-Thought in LLMs via Step Entropy
Paper • 2508.03346 • Published • 7 -
Train Long, Think Short: Curriculum Learning for Efficient Reasoning
Paper • 2508.08940 • Published • 26
-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 216 • 98 -
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 36 -
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 97 -
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 88
-
Emu3.5: Native Multimodal Models are World Learners
Paper • 2510.26583 • Published • 102 -
RECALL: REpresentation-aligned Catastrophic-forgetting ALLeviation via Hierarchical Model Merging
Paper • 2510.20479 • Published • 10 -
A Definition of AGI
Paper • 2510.18212 • Published • 33 -
Video-As-Prompt: Unified Semantic Control for Video Generation
Paper • 2510.20888 • Published • 44
-
The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain
Paper • 2509.26507 • Published • 526 -
Less is More: Recursive Reasoning with Tiny Networks
Paper • 2510.04871 • Published • 468 -
A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code
Paper • 2508.18106 • Published • 342 -
Intern-S1: A Scientific Multimodal Foundation Model
Paper • 2508.15763 • Published • 255
-
Language Models Model Language
Paper • 2510.12766 • Published • 23 -
The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain
Paper • 2509.26507 • Published • 526 -
Less is More: Recursive Reasoning with Tiny Networks
Paper • 2510.04871 • Published • 468 -
Scaling Latent Reasoning via Looped Language Models
Paper • 2510.25741 • Published • 202
-
Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models
Paper • 2508.10751 • Published • 28 -
Reinforcement Pre-Training
Paper • 2506.08007 • Published • 262 -
MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers
Paper • 2508.14704 • Published • 42 -
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
Paper • 2508.16153 • Published • 154
-
GUI-G^2: Gaussian Reward Modeling for GUI Grounding
Paper • 2507.15846 • Published • 132 -
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent
Paper • 2508.05748 • Published • 137 -
Mobile-Agent-v3: Foundamental Agents for GUI Automation
Paper • 2508.15144 • Published • 64 -
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
Paper • 2508.16153 • Published • 154
-
Contrastive Learning for Many-to-many Multilingual Neural Machine Translation
Paper • 2105.09501 • Published -
Cross-modal Contrastive Learning for Speech Translation
Paper • 2205.02444 • Published -
ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs
Paper • 2210.03052 • Published -
Diffusion Glancing Transformer for Parallel Sequence to Sequence Learning
Paper • 2212.10240 • Published • 1
-
Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models
Paper • 2504.02821 • Published • 9 -
TimeChat-Online: 80% Visual Tokens are Naturally Redundant in Streaming Videos
Paper • 2504.17343 • Published • 13 -
ViSMaP: Unsupervised Hour-long Video Summarisation by Meta-Prompting
Paper • 2504.15921 • Published • 7 -
Causal-Copilot: An Autonomous Causal Analysis Agent
Paper • 2504.13263 • Published • 7