HuggingFaceFW/fineweb-edu • 3.5B • 297k • 790
Datasets used in SmolLM3 pretraining
Note Stage 1 datasets: 85% Web, 12% Code, 3% Math
Note Stage 2 new datasets
Note Stage 3 (decay) new datasets
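
The Stage 1 proportions noted above (85% Web, 12% Code, 3% Math) can be read as sampling weights over data sources. The following is a minimal sketch of that reading, not the SmolLM3 training code: the names `STAGE1_MIX`, `sample_sources`, and `token_budget` are hypothetical, and the 1,000,000-token total is an illustrative figure rather than the real budget.

```python
import random
from collections import Counter

# Stage 1 proportions as stated in the note; the dict name is hypothetical.
STAGE1_MIX = {"web": 0.85, "code": 0.12, "math": 0.03}

def sample_sources(mix, n, seed=0):
    """Draw n source labels with probability proportional to the mixture weights."""
    rng = random.Random(seed)
    names = list(mix)
    weights = [mix[name] for name in names]
    return rng.choices(names, weights=weights, k=n)

def token_budget(mix, total_tokens):
    """Split a total token budget across sources according to the mixture."""
    return {name: round(total_tokens * share) for name, share in mix.items()}

# Illustrative usage: 10,000 draws and an assumed 1,000,000-token budget.
draws = sample_sources(STAGE1_MIX, 10_000)
counts = Counter(draws)
budget = token_budget(STAGE1_MIX, 1_000_000)
print(counts)
print(budget)
```

With enough draws, the empirical counts approach the stated shares. The same weights could be passed to a dataset-interleaving utility in place of the hand-rolled sampler.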