ldwang committed
Commit f3dfd70 · verified · 1 Parent(s): 30b43c7

Update README.md

Files changed (1): README.md (+2 -1)
README.md CHANGED
@@ -4,7 +4,8 @@
 We sampled 100 billion tokens from the CCI4.0 dataset and trained a 1.4B-parameter MoE model with 0.4B active parameters. This model, along with the dataset, is open-sourced as a baseline for future experiments in areas such as dataset construction, algorithmic strategies, and parallel training frameworks. The model architecture is the same as that of the OpenSeek-Small-v1 model.
 
 ## Training Data
-**Total Volume**: 100B high-quality pretraining data
+The ratio for each domain is as follows:
+
 | Name | Ratio |
 |-------------------------------------------|---------|
 | Nemotron-CC-high-actual-actual-high | 1.1068 |
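
The ratios in the table above are most naturally read as relative sampling weights over data domains (a value above 1 means a domain is oversampled relative to a uniform mix). The commit does not state the normalization, so the following is a minimal sketch of one plausible interpretation, assuming the weights are normalized into sampling probabilities; the `ratios` dict reflects the single table row shown in the diff, and `"other-domain"` is a hypothetical placeholder for the remaining, unshown entries.

```python
import random

# Relative sampling weights per domain, as in the README table.
# Only the first row appears in the diff above; "other-domain" is a
# hypothetical placeholder standing in for the unshown rows.
ratios = {
    "Nemotron-CC-high-actual-actual-high": 1.1068,
    "other-domain": 0.8932,  # hypothetical value
}

# Normalize the relative weights into sampling probabilities.
total = sum(ratios.values())
probs = {name: r / total for name, r in ratios.items()}

def sample_domain(rng: random.Random) -> str:
    """Pick a domain with probability proportional to its ratio."""
    names = list(probs)
    weights = [probs[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]

# Quick check: empirical draw frequencies should track the ratios.
rng = random.Random(0)
counts = {name: 0 for name in ratios}
for _ in range(100_000):
    counts[sample_domain(rng)] += 1
print(counts)
```

Under this reading, assembling the 100B-token mix amounts to drawing each training document's domain from these normalized probabilities; treat the sketch as an illustration of the weighting scheme, not as the project's actual sampling pipeline.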