
Expanding context window via YARN

#33
by sixxio - opened

Can I expand the context window for this model via YARN by modifying config.json as shown below?
Has anyone tried this approach?
Is there any information about the context length used in the training dataset?
I've seen VRAM requirements for 3M context in the deployment guides, but nothing about YARN usage in the README or the guides.
{
  ...,
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 196608
  }
}
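
For reference, here is a minimal Python sketch of what the snippet above implies. It only does the arithmetic (factor times original window) and shows one way to merge the entry into a config dict. The helper names `yarn_target_length` and `with_yarn_scaling` are hypothetical, and the sample `base_config` stands in for the real config.json. Whether this model's custom code actually reads `rope_scaling` is an assumption that is not confirmed anywhere in this thread.

```python
import json


def yarn_target_length(rope_scaling: dict) -> int:
    """Context length implied by a rope_scaling entry: factor * original window."""
    return int(rope_scaling["factor"] * rope_scaling["original_max_position_embeddings"])


def with_yarn_scaling(config: dict, factor: float, original_max: int) -> dict:
    """Return a copy of config with a YARN rope_scaling entry added (hypothetical helper)."""
    patched = dict(config)  # shallow copy, the input dict is left untouched
    patched["rope_scaling"] = {
        "rope_type": "yarn",
        "factor": factor,
        "original_max_position_embeddings": original_max,
    }
    return patched


# Stand-in for the contents of config.json; the real file has many more keys.
base_config = {"model_type": "minimax_m2"}

patched = with_yarn_scaling(base_config, factor=4.0, original_max=196608)
print(json.dumps(patched, indent=2))
print("implied context length:", yarn_target_length(patched["rope_scaling"]))
```

With the values from the question, the implied window is 4.0 × 196608 = 786432 tokens. That number is only the scaling target; it says nothing about output quality at that length or about memory requirements.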
