Expanding context window via YARN

#33

by sixxio - opened 3 days ago


Can I expand the context window for this model via YARN by modifying config.json like this?
Has anyone tried this approach?
Is there any info about the context lengths used in the training dataset?
I've seen VRAM requirements for a 3M context in the deployment guides, but nothing about YARN usage in the README or the guides.
{
  ...,
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 196608
  }
}
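For reference, here is a minimal Python sketch of the arithmetic behind that config, assuming the usual convention that the extended window is roughly `factor * original_max_position_embeddings`. The helper names `yarn_rope_scaling` and `apply_to_config` are hypothetical, the 196608 value is taken from the snippet above, and nothing here confirms that this model or its serving stack actually supports YARN.

```python
import json


def yarn_rope_scaling(original_max_pos: int, target_context: int) -> dict:
    """Build a rope_scaling block for a desired context length.

    Assumption: the extended window is about factor * original_max_pos,
    so factor is the ratio of the two (never below 1.0).
    """
    factor = max(1.0, target_context / original_max_pos)
    return {
        "rope_type": "yarn",
        "factor": float(factor),
        "original_max_position_embeddings": original_max_pos,
    }


def apply_to_config(config: dict, original_max_pos: int, target_context: int) -> dict:
    """Return a copy of a config dict with rope_scaling set; other keys are untouched."""
    updated = dict(config)
    updated["rope_scaling"] = yarn_rope_scaling(original_max_pos, target_context)
    return updated


if __name__ == "__main__":
    # Hypothetical stand-in for the contents of config.json.
    cfg = {"max_position_embeddings": 196608}
    new_cfg = apply_to_config(cfg, original_max_pos=196608, target_context=786432)
    print(json.dumps(new_cfg["rope_scaling"], indent=2))
```

With the numbers from my config, a factor of 4.0 corresponds to a target of 786432 tokens. Whether `max_position_embeddings` also needs to change depends on the framework, so I left it alone in the sketch.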
