Organization Card
Swallow LLM
Research and development of large language models, conducted mainly by members of the Okazaki Laboratory and the Yokota Laboratory at the Institute of Science Tokyo (formerly Tokyo Institute of Technology).
- From Okazaki Laboratory, Institute of Science Tokyo, the following members:
- Naoaki Okazaki
- Sakae Mizuki
- Youmi Ma
- Sangwhan Moon
- Koki Maeda
- Masanari Ohi
- Hinari Shimada
- Taihei Shiotani
- Koshiro Saito
- Tatsuya Ichinose
- Naoya Matsushita
- Sora Miyamoto
- Nguyen Tien Dung
- Yuta Katayama
- From Yokota Laboratory, Institute of Science Tokyo, the following members:
- Rio Yokota
- Kazuki Fujii
- Taishi Nakamura
- Takumi Okamoto
- Ishida Shigeki
- Masaki Kawamura
- Yukito Tajima
- From Artificial Intelligence Research Center, AIST, Japan, the following members:
Collections
12
Rewriting Pre-Training Data Boosts LLM Performance in Math and Code
- tokyotech-llm/swallow-math
  Viewer • Updated • 4.33M • 1.95k • 26
- tokyotech-llm/Llama-3.1-8B-math-ablation-exp2-LR2.5e-5-WD0.1-iter0002500
  Updated • 9
- tokyotech-llm/Llama-3.1-8B-math-ablation-exp2-LR2.5e-5-WD0.1-iter0005000
  Updated • 8
- tokyotech-llm/Llama-3.1-8B-math-ablation-exp2-LR2.5e-5-WD0.1-iter0007500
  Updated • 9
Models
116

tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.5
Text Generation • Updated • 35 • 3

tokyotech-llm/Llama-3.1-Swallow-8B-v0.5
Updated • 120 • 2

tokyotech-llm/Llama-3.3-Swallow-70B-v0.4
Text Generation • Updated • 1.95k • 3

tokyotech-llm/Gemma-2-Llama-Swallow-27b-it-v0.1
Text Generation • Updated • 1.94k • 1

tokyotech-llm/Gemma-2-Llama-Swallow-9b-it-v0.1
Text Generation • Updated • 2.48k • 2

tokyotech-llm/Gemma-2-Llama-Swallow-2b-it-v0.1
Text Generation • Updated • 4.98k • 2

tokyotech-llm/Gemma-2-Llama-Swallow-2b-pt-v0.1
Text Generation • Updated • 4.77k

tokyotech-llm/Gemma-2-Llama-Swallow-27b-pt-v0.1
Text Generation • Updated • 118

tokyotech-llm/Gemma-2-Llama-Swallow-9b-pt-v0.1
Text Generation • Updated • 171 • 1

tokyotech-llm/Llama-3.1-8B-code-ablation-exp3-LR2.5e-5-MINLR2.5E-6-WD0.1-iter0012500
Updated • 54 • 1
Datasets
8
tokyotech-llm/lmsys-chat-1m-synth
Updated • 1.44k • 10
tokyotech-llm/dclm-baseline
Preview • Updated • 305
tokyotech-llm/swallow-code
Viewer • Updated • 129M • 3.25k • 43
tokyotech-llm/swallow-math
Viewer • Updated • 4.33M • 1.95k • 26
tokyotech-llm/swallow-magpie-ultra-v0.1
Viewer • Updated • 85.1k • 190 • 3
tokyotech-llm/swallow-gemma-magpie-v0.1
Viewer • Updated • 132k • 114 • 1
tokyotech-llm/swallow-code-v0.1
Updated • 60 • 1
tokyotech-llm/Swallow-Instruct-v0.1
Viewer • Updated • 47.5k • 98 • 10