s-mizuki-nlp committed on
Commit
8ce855e
·
verified ·
1 Parent(s): 594b59d

Improved wording of the Swallow-Code-v0.3-Instruct-style dataset description.

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -216,7 +216,7 @@ The following datasets were used for the instruction tuning.
  - A Japanese synthetic instruction tuning dataset from scratch, generated by [gemma-2-27b-it](https://huggingface.co/google/gemma-2-27b-it). User instructions were created with prompts specific to each topic, and assistant responses were generated for these instructions.
  - The conversations were heuristically filtered for quality and length. Then, [gemma-2-27b-it](https://huggingface.co/google/gemma-2-27b-it) was applied to score the quality of each of the conversation with a range of 1-10. Conversations with scores <= 7 were rejected.
  - Swallow-Code-v0.3-Instruct-style
- - Single-turn English and code synthetic instruction dataset restructured into instruct-style from Swallow Code v0.3 with [Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct).
+ - A synthetic instruction dataset for code generation in English, restructured into an instruction-following format from Swallow Code v0.3 using [Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct).
 
  ## Risks and Limitations