minimax-m1

#14
by cheaptoner2016

Do you think you could convert the model minimax-m1?

@cheaptoner2016

It does look like an interesting model, coming in at ~456B total parameters with ~46B active. Token generation would probably be slower than
DeepSeek-R1 (671B total, only ~37B active), but it would be easier to fit all the quantized weights into RAM.
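Back-of-envelope, assuming ~4.5 bits per weight for a mid-size quant (roughly Q4_K_M territory) and that per-token compute scales with active parameters:

```python
# Rough MoE sizing math -- illustrative assumptions, not measurements.
# Quantized weight footprint scales with TOTAL parameters;
# per-token generation compute scales roughly with ACTIVE parameters.

BITS_PER_WEIGHT = 4.5  # assumed average bits/weight for a mid-size quant

def quant_size_gb(total_params_billions: float) -> float:
    """Approximate size of the quantized weights in GB."""
    return total_params_billions * BITS_PER_WEIGHT / 8

for name, total_b, active_b in [("MiniMax-M1", 456, 46),
                                ("DeepSeek-R1", 671, 37)]:
    print(f"{name}: ~{quant_size_gb(total_b):.0f} GB quantized, "
          f"~{active_b}B params touched per token")

# MiniMax-M1: ~256 GB quantized, ~46B params touched per token
# DeepSeek-R1: ~377 GB quantized, ~37B params touched per token
```

So MiniMax-M1 would take meaningfully less RAM overall, at the cost of more compute per generated token.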

I don't think the architecture is supported on ik_llama.cpp yet, but huh, is it even in llama.cpp yet?

Usually someone will port the transformers implementation for a new architecture into llama.cpp, and then someone else may try to port that to ik_llama.cpp.
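One quick way to check is to read the `architectures` field from the model's config.json on the Hub and then grep llama.cpp's convert_hf_to_gguf.py for that class name; if it isn't registered there, the converter won't recognize the model. A minimal sketch, where the repo id is my guess at the actual HF repo:

```python
import json
import urllib.request

# Assumed repo id -- adjust to whatever the real MiniMax-M1 repo is.
REPO = "MiniMaxAI/MiniMax-M1-80k"
URL = f"https://huggingface.co/{REPO}/resolve/main/config.json"

# Public repos serve config.json without auth.
with urllib.request.urlopen(URL) as resp:
    config = json.load(resp)

# The class name(s) listed here are what the converter keys on, e.g.:
#   grep -rn "<that class name>" convert_hf_to_gguf.py
print(config.get("architectures"))
```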

But yeah, I don't see any GGUFs of it yet; I figured that would have been done already. I don't see a closed PR on llama.cpp adding the model either. Do you know if there is support?
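If you'd rather check for GGUF uploads programmatically than browse the Hub, a quick sketch with the huggingface_hub client (`pip install huggingface_hub`):

```python
from huggingface_hub import HfApi

# Search the Hub for any existing GGUF uploads of the model.
api = HfApi()
for model in api.list_models(search="minimax-m1 gguf", limit=20):
    print(model.id)
```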
