Cloned transformer

Created by stacking layers, trained on Habr+Rulm.
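
The card does not say how the layers were stacked, so the following is only a minimal sketch of the general idea, assuming the whole block stack of a smaller model is copied and appended to itself to form a deeper one. The function name `stack_layers`, the `repeats` parameter, and the toy dictionary "layers" are all hypothetical and are not taken from the model's actual training code.

```python
import copy


def stack_layers(layers, repeats=2):
    """Build a deeper layer list by repeating the whole stack `repeats` times.

    Assumption: each copy starts from the weights of the original layer
    and is an independent object, so copies can diverge during training.
    """
    stacked = []
    for _ in range(repeats):
        # deepcopy so that no two layers in the result share weights
        stacked.extend(copy.deepcopy(layer) for layer in layers)
    return stacked


# Toy stand-ins for transformer blocks (hypothetical, for illustration only).
base = [{"name": f"block_{i}", "weight": [float(i)] * 4} for i in range(3)]

deep = stack_layers(base, repeats=2)
print(len(base), "->", len(deep), "layers")
```

In a real setting the same pattern would apply to a list of transformer blocks rather than dictionaries, and the deeper model would then be trained further, since copied layers only provide an initialization.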
| dataset | rugpt 760m large | AlexWortega/ruClonedGPT_1.4B |
| --- | --- | --- |
| xnliru | 0.34 | 0.36 |
| xwinograd | 0.65 | 0.68 |
| danetqa | 0.62 | 0.65 |
| muserc | 0.72 | 0.74 |
| parus | 0.584 | 0.61 |
| rcb | 0.417 | 0.45 |
| rucos | 0.21 | 0.25 |
| russe | 0.647 | 0.66 |
| ruterra | 0.654 | 0.67 |
| rwsd | 0.636 | 0.339 |