Fix incorrect vocab_size in Qwen3-8B config.json
#25 by Parveshiiii - opened
While fine-tuning Qwen3-8B, I encountered a mismatch between the vocab_size specified in config.json (151936) and the tokenizer's actual vocabulary size (151669) as reported by Qwen2TokenizerFast. This discrepancy can lead to shape mismatch errors when resizing the embedding layer or when loading the model for training.
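For reference, here is a minimal sketch of how the mismatch shows up, assuming the checkpoint is loaded by its Hub id Qwen/Qwen3-8B (the repo id is illustrative; substitute a local path if needed):

```python
from transformers import AutoConfig, AutoTokenizer

# Repo id assumed for illustration; substitute your local checkpoint path if needed.
repo = "Qwen/Qwen3-8B"

config = AutoConfig.from_pretrained(repo)
tokenizer = AutoTokenizer.from_pretrained(repo)

# config.vocab_size is whatever config.json declares; len(tokenizer) counts
# the entries the tokenizer actually knows about, including added tokens.
print("config.json vocab_size:", config.vocab_size)  # 151936 before this PR
print("tokenizer size:        ", len(tokenizer))     # 151669

if config.vocab_size != len(tokenizer):
    print("Mismatch: embedding resizes keyed off either value will disagree.")
```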
This PR updates the vocab_size field in config.json to reflect the correct tokenizer size of 151669, ensuring consistency across model weights, tokenizer, and configuration.
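Concretely, the edit is a single field in config.json (values taken from the description above):

```diff
-  "vocab_size": 151936,
+  "vocab_size": 151669,
```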
No architectural changes were made; only the vocabulary alignment was corrected.
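For anyone hitting the shape mismatch before this change lands, a common workaround (sketched below; not part of this PR, and the repo id is again assumed) is to resize the embedding matrix to the tokenizer length at load time:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "Qwen/Qwen3-8B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo)

# Resize the input (and tied output) embeddings so their first dimension
# matches the tokenizer; this also rewrites model.config.vocab_size.
model.resize_token_embeddings(len(tokenizer))
assert model.config.vocab_size == len(tokenizer)
```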