NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model Paper • 2508.14444 • Published Aug 20, 2025 • 37
Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models Paper • 2504.03624 • Published Apr 4, 2025 • 15
Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning Paper • 2504.11409 • Published Apr 15, 2025 • 9
PHI-S: Distribution Balancing for Label-Free Multi-Teacher Distillation Paper • 2410.01680 • Published Oct 2, 2024 • 35
EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation Paper • 2410.21271 • Published Oct 28, 2024 • 7
Hymba: A Hybrid-head Architecture for Small Language Models Paper • 2411.13676 • Published Nov 20, 2024 • 45
RADIO Amplified: Improved Baselines for Agglomerative Vision Foundation Models Paper • 2412.07679 • Published Dec 10, 2024
VILA-M3: Enhancing Vision-Language Models with Medical Expert Knowledge Paper • 2411.12915 • Published Nov 19, 2024
CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training Paper • 2504.13161 • Published Apr 17, 2025 • 94
LLM Pruning and Distillation in Practice: The Minitron Approach Paper • 2408.11796 • Published Aug 21, 2024 • 57
HelpSteer2-Preference: Complementing Ratings with Preferences Paper • 2410.01257 • Published Oct 2, 2024 • 25
LongVILA: Scaling Long-Context Visual Language Models for Long Videos Paper • 2408.10188 • Published Aug 19, 2024 • 52