mtp or other speculative decoding method?
#34 opened about 10 hours ago by CHNtentes

Expanding context window via YARN
#33 opened about 19 hours ago by sixxio

Fix chat_template
#32 opened 1 day ago by rogeryoungh

Latest comprehensive hands-on evaluation of MiniMax-M2 is out (300+ dimensions); welcome to join the group to discuss
#31 opened 1 day ago by JEIN

Can it run on NVIDIA A100 80G * 8 ?
#30 opened 3 days ago by jetto98

Interleaved Thinking, minimax:tool_call parsing
#29 opened 3 days ago by 0xSero

mixed results
#28 opened 3 days ago by kingriel

Missing official chat_template / unclear initialization of <think> interleaving in MiniMax-M2 ?
#27 opened 4 days ago by Serveurperso

Request: DOI
#26 opened 5 days ago by sreeshanthpeddi

Request for 2000 Samples from training data for NVFP4 QUANTIZATION
#25 opened 5 days ago by jasonface

Prepare support transformers
#24 opened 5 days ago by rogeryoungh

what data and its volume were used to train the model?
#21 opened 8 days ago by Pep0pi

230B vs 235B: Why no comparison against Qwen3-235B-A22B-Thinking-2507 ?
#20 opened 8 days ago by rtzurtz

AWQ Please
#18 opened 9 days ago by darkstar3537

GGUF support
#17 opened 9 days ago by geboh67859

Why does it keep trying to connect to huggingface?
#16 opened 9 days ago by surak

Was the training done with FP8 or BF16?
#14 opened 9 days ago by mindkrypted

About the LCB evaluation
#13 opened 9 days ago by sayhitoday

YES!!
#12 opened 9 days ago by CyborgPaloma

When will transformers support Minimax-M2?
#11 opened 9 days ago by zx-modelcloud

Speculative decoding
#9 opened 9 days ago by adsfdgfhgjhk11

No lightning attention?
#8 opened 9 days ago by djuna