ff670 committed on
Commit c87501f · verified · 1 Parent(s): 97c2ca1

Update README.md

Files changed (1)
  1. README.md +6 -2
README.md CHANGED
@@ -4,9 +4,13 @@ base_model:
  - Qwen/Qwen3-32B
  ---

- The missing "base model" of Qwen3-32B. This model is not intended for direct inference.
+ The missing "base model" of Qwen3-32B. This model serves as the foundation for our R1-0528 distillation work.

  This model is the result of continued pre-training on Qwen3-32B, using a multilingual dataset of mixed code and text.

- The purpose of training this model is to provide a model that is close to a "pre-trained" state, reducing the influence of the original Qwen3's linguistic style on subsequent fine-tuning efforts. This model serves as the foundation for our R1-0528 distillation work.
+ The purpose of training this model is to provide a model that is close to a "pre-trained" state, reducing the influence of the original Qwen3's linguistic style on subsequent fine-tuning efforts.
+
+ We are providing this model to the community to serve as a base model for further SFT, this model is not intended for direct inference.
+
+
