tokyotech-llm/Llama-3.3-Swallow-70B-Instruct-v0.4

@@ -216,7 +216,7 @@ The following datasets were used for the instruction tuning.
   - A Japanese synthetic instruction tuning dataset from scratch, generated by [gemma-2-27b-it](https://huggingface.co/google/gemma-2-27b-it). User instructions were created with prompts specific to each topic, and assistant responses were generated for these instructions.
   - The conversations were heuristically filtered for quality and length. Then, [gemma-2-27b-it](https://huggingface.co/google/gemma-2-27b-it) was applied to score the quality of each conversation on a scale of 1-10. Conversations with scores <= 7 were rejected.
 - Swallow-Code-v0.3-Instruct-style
-  - Single-turn English and code synthetic instruction dataset restructured into instruct-style from Swallow Code v0.3 with [Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct).
+  - A synthetic instruction dataset for code generation in English, restructured into an instruction-following format from Swallow Code v0.3 using [Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct).
 ## Risks and Limitations