
GGUF support

#17
by geboh67859 - opened

GGUF format will make your great work accessible to more users!

The mainline llama.cpp PR is here: https://github.com/ggml-org/llama.cpp/pull/16831

I got @DevQuasar's Q8_0 quant working with the above PR; the command to run it is posted in the GGUF repo's discussions: https://huggingface.co/DevQuasar/MiniMaxAI.MiniMax-M2-GGUF/discussions/1

It seems to be working okay, though be mindful of the model's unique interleaved thinking tags in chat threads.
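For anyone who wants a starting point before checking the linked discussion, a minimal sketch of running a local GGUF with llama.cpp's CLI looks like this (the model path and prompt here are placeholders, not the exact command from the linked thread):

```shell
# Build llama.cpp from the PR branch first, then run interactively.
# --jinja applies the chat template embedded in the GGUF, which matters
# for MiniMax-M2's interleaved thinking-tag format.
./llama-cli \
  -m MiniMax-M2-Q8_0.gguf \   # placeholder path to the downloaded quant
  --jinja \
  -p "Hello" \
  -n 256
```

See the DevQuasar discussion linked above for the command actually verified to work with this quant.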
