SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity
Abstract
SparseLoRA reduces computational cost and speeds up fine-tuning of LLMs by dynamically selecting a sparse subset of weights for loss and gradient computation.
Fine-tuning LLMs is both compute- and memory-intensive. While parameter-efficient fine-tuning methods, such as QLoRA and DoRA, reduce the number of trainable parameters and lower memory usage, they do not decrease computational cost; in some cases, they may even slow down fine-tuning. In this paper, we introduce SparseLoRA, a method that accelerates LLM fine-tuning through contextual sparsity. We propose a lightweight, training-free SVD sparsity estimator that dynamically selects a sparse subset of weights for loss and gradient computation. We also systematically analyze and address sensitivity across layers, tokens, and training steps. Our experimental results show that SparseLoRA reduces computational cost by up to 2.2× and delivers a measured speedup of up to 1.6× while maintaining accuracy across various downstream tasks, including commonsense and arithmetic reasoning, code generation, and instruction following.
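To make the core idea concrete, here is a minimal PyTorch sketch of a low-rank (SVD-based) channel selector feeding a sparse forward pass over the frozen base weight, with the LoRA branch kept dense. The helper names, the magnitude-based scoring heuristic, and the `keep_ratio` parameter are illustrative assumptions, not the paper's actual implementation:

```python
import torch

def svd_estimator(W, rank=8):
    """Build a low-rank proxy of the weight matrix W (out x in).
    Applying U_r @ V_r costs O(rank * (in + out)) per token instead of
    O(in * out), so channel selection is cheap. (Illustrative sketch.)"""
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * S[:rank]          # (out, rank)
    V_r = Vh[:rank, :]                    # (rank, in)
    return U_r, V_r

def select_channels(x, U_r, V_r, keep_ratio=0.5):
    """Score output channels for input x via the low-rank proxy and keep
    the top fraction. Magnitude scoring is an assumed heuristic."""
    scores = (x @ V_r.T @ U_r.T).abs().mean(dim=0)  # per-channel proxy activation
    k = max(1, int(keep_ratio * scores.numel()))
    return scores.topk(k).indices

def sparse_lora_forward(x, W, A, B, U_r, V_r, keep_ratio=0.5):
    """Compute only the selected rows of the frozen base weight W;
    the dense low-rank LoRA update (B @ A) is applied in full."""
    idx = select_channels(x, U_r, V_r, keep_ratio)
    y = torch.zeros(x.shape[0], W.shape[0], device=x.device, dtype=x.dtype)
    y[:, idx] = x @ W[idx].T              # sparse base-weight branch
    y += (x @ A.T) @ B.T                  # dense LoRA branch
    return y
```

The key property this sketch captures is that deciding *which* channels to compute costs only a rank-`r` matrix product per token, so the estimator's overhead stays small relative to the dense projection it replaces.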
Community
This paper introduces SparseLoRA, a method that uses contextual sparsity to accelerate LLM fine-tuning, cutting compute by up to 2.2× and achieving a measured speedup of up to 1.6× while maintaining model accuracy on reasoning, coding, and instruction-following tasks.
Learn more at z-lab.ai/projects/sparselora
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- MLorc: Momentum Low-rank Compression for Large Language Model Adaptation (2025)
- Leave it to the Specialist: Repair Sparse LLMs with Sparse Fine-Tuning via Sparsity Evolution (2025)
- Polar Sparsity: High Throughput Batched LLM Inferencing with Scalable Contextual Sparsity (2025)
- UORA: Uniform Orthogonal Reinitialization Adaptation in Parameter-Efficient Fine-Tuning of Large Models (2025)
- Quaff: Quantized Parameter-Efficient Fine-Tuning under Outlier Spatial Stability Hypothesis (2025)
- DenseLoRA: Dense Low-Rank Adaptation of Large Language Models (2025)
- Parameter-Efficient Fine-Tuning with Column Space Projection (2025)