minimax-m1
#14
by
cheaptoner2016
- opened
Do you think you could convert the model MiniMax-M1?
It does look like an interesting model, coming in at ~456B total parameters with ~46B active. It would probably have slower token generation speeds than
DeepSeek-R1 (only ~37B active), but it would be easier to fit all of the quantized weights into RAM.
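A rough sketch of the back-of-envelope math, if it helps (the bits-per-weight figures are approximate and ignore per-tensor overhead and the tensors that usually stay at higher precision):

```python
# Rough GGUF size estimate: params x bits-per-weight / 8.
# bpw values are approximate averages for each quant type.
total_params = 456e9   # MiniMax-M1 total
active_params = 46e9   # active per token (what drives generation speed)

for quant, bpw in [("Q8_0", 8.5), ("Q4_K_M", 4.8), ("IQ2_XXS", 2.1)]:
    total_gib = total_params * bpw / 8 / 2**30
    active_gib = active_params * bpw / 8 / 2**30
    print(f"{quant:>8}: ~{total_gib:.0f} GiB of weights, "
          f"~{active_gib:.0f} GiB touched per token")
```

So something around Q4 would land in the ~250 GiB range for the full weights, while each generated token only has to stream the ~46B active parameters, versus ~37B for R1.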
I don't think the architecture is supported on ik_llama.cpp yet, but huh, is it even in llama.cpp yet?
Usually someone will port the transformers implementation for a new architecture into llama.cpp, and then someone else may try to port that to ik_llama.cpp.
But yeah, I don't see any GGUFs of it yet; I figured that would have been done already if it were supported. I don't see a closed PR on llama.cpp that adds the model either. Do you know if there is support?
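One quick way to check: look at the `architectures` field in the model's `config.json` on Hugging Face and grep for that class name in llama.cpp's `convert_hf_to_gguf.py`; as far as I understand, the converter keys off that name, so if it isn't registered there the model can't be converted yet. A minimal sketch (the repo id is my guess at the official upload):

```python
import json
from huggingface_hub import hf_hub_download

# Repo id assumed; adjust if the official upload lives elsewhere.
repo_id = "MiniMaxAI/MiniMax-M1-80k"

config_path = hf_hub_download(repo_id=repo_id, filename="config.json")
with open(config_path) as f:
    config = json.load(f)

# The architecture class name is what convert_hf_to_gguf.py would need
# to have registered for conversion to work.
print("architectures:", config.get("architectures"))
print("model_type:", config.get("model_type"))
```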