ldwang committed
Commit f3dfd70 · verified · 1 Parent(s): 30b43c7

Update README.md

Files changed (1): README.md (+2 -1)
README.md CHANGED
@@ -4,7 +4,8 @@
 We sampled 100 billion tokens from the CCI4.0 dataset and trained a 1.4B-parameter MoE model with 0.4B active parameters. This model, along with the dataset, is open-sourced as a baseline for future experiments in areas such as dataset construction, algorithmic strategies, and parallel training frameworks. The model architecture is the same as that of the OpenSeek-Small-v1 model.
 
 ## Training Data
-**Total Volume**: 100B high-quality pretraining data
+The ratio for each domain is as follows:
+
 | Name | Ratio |
 |-------------------------------------------|---------|
 | Nemotron-CC-high-actual-actual-high | 1.1068 |
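
The ratios in the table above are most naturally read as relative sampling weights over data domains (a value above 1 means a domain is oversampled relative to a uniform mix). The commit does not state the normalization, so the following is a minimal sketch of one plausible interpretation, assuming the weights are normalized into sampling probabilities; the `ratios` dict reflects the single table row shown in the diff, and `"other-domain"` is a hypothetical placeholder for the remaining, unshown entries.

```python
import random

# Relative sampling weights per domain, as in the README table.
# Only the first row appears in the diff above; "other-domain" is a
# hypothetical placeholder standing in for the unshown rows.
ratios = {
    "Nemotron-CC-high-actual-actual-high": 1.1068,
    "other-domain": 0.8932,  # hypothetical value
}

# Normalize the relative weights into sampling probabilities.
total = sum(ratios.values())
probs = {name: r / total for name, r in ratios.items()}

def sample_domain(rng: random.Random) -> str:
    """Pick a domain with probability proportional to its ratio."""
    names = list(probs)
    weights = [probs[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]

# Quick check: empirical draw frequencies should track the ratios.
rng = random.Random(0)
counts = {name: 0 for name in ratios}
for _ in range(100_000):
    counts[sample_domain(rng)] += 1
print(counts)
```

Under this reading, assembling the 100B-token mix amounts to drawing each training document's domain from these normalized probabilities; treat the sketch as an illustration of the weighting scheme, not as the project's actual sampling pipeline.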