Jack Li
committed on
Upload README.md with huggingface_hub
README.md
CHANGED
@@ -27,13 +27,13 @@ This model is part of the [StepLaw-N_1.0B-D_19.0B](https://huggingface.co/collec
 
 ### Training Parameters
 - **Learning rate (lr)**: 3.453e-04
-- **Batch size (bs)**:
+- **Batch size (bs)**: 65536
 - **Training iterations**: 305175
 - **Training tokens (D)**: 20.0B
 
 ## Model Description
 
-StepLaw models are trained with various hyperparameter settings to enable research on scaling laws and hyperparameter optimization. This specific model was trained with learning rate 3.453e-04 and batch size
+StepLaw models are trained with various hyperparameter settings to enable research on scaling laws and hyperparameter optimization. This specific model was trained with learning rate 3.453e-04 and batch size 65536 for 305175 iterations, using a total of 20.0B training tokens.
 
 ## Usage Example
 
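
As a quick consistency check on the training parameters in this diff, the sketch below multiplies the batch size by the number of iterations and compares the result with the stated 20.0B training tokens. It assumes the batch size of 65536 is measured in tokens per step, which the README does not state explicitly; the variable names are illustrative only.

```python
# Values taken from the updated README lines in this diff.
batch_size_tokens = 65536       # assumption: batch size is counted in tokens per step
training_iterations = 305175
stated_tokens_billions = 20.0   # "Training tokens (D)" as listed in the README

# Total tokens seen during training under the assumption above.
total_tokens = batch_size_tokens * training_iterations
total_tokens_billions = round(total_tokens / 1e9, 1)

print(f"total tokens: {total_tokens:,}")
print(f"in billions:  {total_tokens_billions}")
print(f"matches README: {total_tokens_billions == stated_tokens_billions}")
```

Under that assumption the product rounds to the 20.0B figure given in the README, which suggests the newly filled-in batch size is consistent with the other parameters.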