Improved wording of the Swallow-Code-v0.3-Instruct-style dataset description.
README.md
CHANGED
@@ -216,7 +216,7 @@ The following datasets were used for the instruction tuning.
 - A Japanese synthetic instruction tuning dataset from scratch, generated by [gemma-2-27b-it](https://huggingface.co/google/gemma-2-27b-it). User instructions were created with prompts specific to each topic, and assistant responses were generated for these instructions.
 - The conversations were heuristically filtered for quality and length. Then, [gemma-2-27b-it](https://huggingface.co/google/gemma-2-27b-it) was applied to score the quality of each of the conversation with a range of 1-10. Conversations with scores <= 7 were rejected.
 - Swallow-Code-v0.3-Instruct-style
--
+- A synthetic instruction dataset for code generation in English, restructured into an instruction-following format from Swallow Code v0.3 using [Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct).
 
 ## Risks and Limitations
 