Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2510.26697

Chat with truly end-to-end LLMs with AutoDeco heads

zacks917/AutoDeco-Llama-Nemotron-8B

Updated 9 days ago • 40
zacks917/AutoDeco-R1-Distill-Qwen-7B

1.84M • Updated 9 days ago • 22 • 1
zacks917/AutoDeco-Qwen3-30B-A3B-Instruct-2507

1.05M • Updated 9 days ago • 27 • 1
zacks917/AutoDeco-Qwen3-235B-A22B-Thinking-2507

Updated 9 days ago • 18

about 7 hours ago

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 28
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 14
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 44
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 23

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 473
The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Paper • 2509.26507 • Published Sep 30 • 529
LightMem: Lightweight and Efficient Memory-Augmented Generation

Paper • 2510.18866 • Published 23 days ago • 108
The End of Manual Decoding: Towards Truly End-to-End Language Models

Paper • 2510.26697 • Published 14 days ago • 113

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 473
SpikingBrain Technical Report: Spiking Brain-inspired Large Models

Paper • 2509.05276 • Published Sep 5 • 3
Self-Adapting Language Models

Paper • 2506.10943 • Published Jun 12 • 6
The Art of Scaling Reinforcement Learning Compute for LLMs

Paper • 2510.13786 • Published 29 days ago • 30

Representation & Optimization

Understanding about representation sheds light on optimization

Nuclear Norm Regularization for Deep Learning

Paper • 2405.14544 • Published May 23, 2024 • 1
Token embeddings violate the manifold hypothesis

Paper • 2504.01002 • Published Apr 1 • 1
Approximate Nullspace Augmented Finetuning for Robust Vision Transformers

Paper • 2403.10476 • Published Mar 15, 2024 • 1
ElaLoRA: Elastic & Learnable Low-Rank Adaptation for Efficient Model Fine-Tuning

Paper • 2504.00254 • Published Mar 31 • 1

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8 • 167 • 98
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12 • 36
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23 • 88

The Era of Agentic Organization: Learning to Organize with Language Models

Paper • 2510.26658 • Published 14 days ago • 24
Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning

Paper • 2510.25992 • Published 15 days ago • 41
The End of Manual Decoding: Towards Truly End-to-End Language Models

Paper • 2510.26697 • Published 14 days ago • 113

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 473
Cache-to-Cache: Direct Semantic Communication Between Large Language Models

Paper • 2510.03215 • Published Oct 3 • 96
When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs

Paper • 2510.07499 • Published Oct 8 • 48
StreamingVLM: Real-Time Understanding for Infinite Video Streams

Paper • 2510.09608 • Published Oct 10 • 50

Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations

Paper • 2508.09789 • Published Aug 13 • 5
MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents

Paper • 2508.13186 • Published Aug 14 • 18
ZARA: Zero-shot Motion Time-Series Analysis via Knowledge and Retrieval Driven LLM Agents

Paper • 2508.04038 • Published Aug 6 • 1
Prompt Orchestration Markup Language

Paper • 2508.13948 • Published Aug 19 • 48

UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models

Paper • 2410.14059 • Published Oct 17, 2024 • 61
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching

Paper • 2503.05179 • Published Mar 7 • 46
Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published Mar 6 • 96
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing

Paper • 2503.10639 • Published Mar 13 • 53

Chat with truly end-to-end LLMs with AutoDeco heads

zacks917/AutoDeco-Llama-Nemotron-8B

Updated 9 days ago • 40
zacks917/AutoDeco-R1-Distill-Qwen-7B

1.84M • Updated 9 days ago • 22 • 1
zacks917/AutoDeco-Qwen3-30B-A3B-Instruct-2507

1.05M • Updated 9 days ago • 27 • 1
zacks917/AutoDeco-Qwen3-235B-A22B-Thinking-2507

Updated 9 days ago • 18

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8 • 167 • 98
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12 • 36
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23 • 88

about 7 hours ago

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 28
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 14
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 44
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 23

The Era of Agentic Organization: Learning to Organize with Language Models

Paper • 2510.26658 • Published 14 days ago • 24
Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning

Paper • 2510.25992 • Published 15 days ago • 41
The End of Manual Decoding: Towards Truly End-to-End Language Models

Paper • 2510.26697 • Published 14 days ago • 113

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 473
The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Paper • 2509.26507 • Published Sep 30 • 529
LightMem: Lightweight and Efficient Memory-Augmented Generation

Paper • 2510.18866 • Published 23 days ago • 108
The End of Manual Decoding: Towards Truly End-to-End Language Models

Paper • 2510.26697 • Published 14 days ago • 113

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 473
Cache-to-Cache: Direct Semantic Communication Between Large Language Models

Paper • 2510.03215 • Published Oct 3 • 96
When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs

Paper • 2510.07499 • Published Oct 8 • 48
StreamingVLM: Real-Time Understanding for Infinite Video Streams

Paper • 2510.09608 • Published Oct 10 • 50

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 473
SpikingBrain Technical Report: Spiking Brain-inspired Large Models

Paper • 2509.05276 • Published Sep 5 • 3
Self-Adapting Language Models

Paper • 2506.10943 • Published Jun 12 • 6
The Art of Scaling Reinforcement Learning Compute for LLMs

Paper • 2510.13786 • Published 29 days ago • 30

Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations

Paper • 2508.09789 • Published Aug 13 • 5
MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents

Paper • 2508.13186 • Published Aug 14 • 18
ZARA: Zero-shot Motion Time-Series Analysis via Knowledge and Retrieval Driven LLM Agents

Paper • 2508.04038 • Published Aug 6 • 1
Prompt Orchestration Markup Language

Paper • 2508.13948 • Published Aug 19 • 48

Representation & Optimization

Understanding about representation sheds light on optimization

Nuclear Norm Regularization for Deep Learning

Paper • 2405.14544 • Published May 23, 2024 • 1
Token embeddings violate the manifold hypothesis

Paper • 2504.01002 • Published Apr 1 • 1
Approximate Nullspace Augmented Finetuning for Robust Vision Transformers

Paper • 2403.10476 • Published Mar 15, 2024 • 1
ElaLoRA: Elastic & Learnable Low-Rank Adaptation for Efficient Model Fine-Tuning

Paper • 2504.00254 • Published Mar 31 • 1

UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models

Paper • 2410.14059 • Published Oct 17, 2024 • 61
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching

Paper • 2503.05179 • Published Mar 7 • 46
Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published Mar 6 • 96
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing

Paper • 2503.10639 • Published Mar 13 • 53

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs