Update README.md
We sampled 100 billion tokens from the CCI4.0 dataset and trained a 1.4B-parameter MoE model with 0.4B active parameters. This model, along with the dataset, is open-sourced as a baseline for future experiments in areas such as dataset construction, algorithmic strategies, and parallel training frameworks. The model architecture is the same as that of the OpenSeek-Small-v1 model.

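As a rough intuition for the total-vs-active distinction: in an MoE layer, a router activates only a top-k subset of experts per token, so the parameters used per token are far fewer than the parameters stored. The sketch below illustrates this accounting; all numbers are hypothetical placeholders chosen only to reproduce the stated 1.4B/0.4B split, not OpenSeek-Small-v1's actual hyperparameters.

```python
# Back-of-the-envelope MoE parameter accounting. All values are
# hypothetical placeholders, not the real OpenSeek-Small-v1 config.
shared = 0.2e9        # embeddings, attention, routers: always active
per_expert = 0.1e9    # parameters in one expert FFN (summed over layers)
num_experts = 12      # experts stored per token position
top_k = 2             # experts the router activates per token

total = shared + num_experts * per_expert   # 1.4e9 parameters stored
active = shared + top_k * per_expert        # 0.4e9 parameters used per token
print(f"total={total/1e9:.1f}B, active={active/1e9:.1f}B")
```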
## Training Data
The ratio for each domain is as follows:

| Name                                | Ratio  |
|-------------------------------------|--------|
| Nemotron-CC-high-actual-actual-high | 1.1068 |
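A natural reading of these ratios is as relative sampling weights over source domains (they need not sum to 1; they can be normalized at sampling time). The sketch below illustrates that reading under stated assumptions; `DOMAIN_RATIOS` and `sample_domains` are hypothetical names for illustration, and only the table row visible in this excerpt is filled in.

```python
import random

# Relative sampling weights from the table above. Only the row visible in
# this excerpt is included; extend with the remaining domains as listed.
DOMAIN_RATIOS = {
    "Nemotron-CC-high-actual-actual-high": 1.1068,
}

def sample_domains(n_docs: int, seed: int = 0) -> list[str]:
    """Draw a domain label per document, proportional to the ratios."""
    rng = random.Random(seed)
    names = list(DOMAIN_RATIOS)
    weights = list(DOMAIN_RATIOS.values())  # normalized internally by choices()
    return rng.choices(names, weights=weights, k=n_docs)

print(sample_domains(3))
```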