when qwen_next? (i just tested it, and both models did well)
qwen_next is the best model; it's even better than gpt 120b imo
cuda+cpu
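For reference, a minimal sketch of what a hybrid cuda+cpu run looks like with llama.cpp's -ngl flag, offloading part of the model to the GPU and keeping the rest on CPU (the model path and layer count here are just placeholders):

# hypothetical hybrid run: put 20 layers on the GPU, keep the rest on CPU
./build/bin/llama-cli -m ./models/qwen-next.gguf -ngl 20 -p "Hello"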
It is close to being merged into mainline: https://github.com/ggml-org/llama.cpp/pull/16252
If it gets ported to ik then I might take a look!
Seeing MiniMax-M2 coming out as experimental for mainline too: https://huggingface.co/cturan/MiniMax-M2-GGUF https://github.com/ggml-org/llama.cpp/pull/16831
I cannot wait for minimax m2!
By the way guys, I have a new toy to play with:
https://youtu.be/HliRC6qCkqk
I was able to compile llama.cpp on it just fine, but ik_llama.cpp won't compile there.
@DevQuasar just released some mainline llama.cpp quants here: https://huggingface.co/DevQuasar/MiniMaxAI.MiniMax-M2-GGUF for testing pwilkin's PR linked above
oh interesting, i would think ik_llama.cpp would compile for the NVIDIA DGX Spark. you might need to specify that 121 arch explicitly; otherwise, what error are you getting?
# maybe something like this?
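# 121 should match the GB10 in the DGX Spark (compute capability 12.1, i.e. sm_121)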
cmake -B build -DCMAKE_BUILD_TYPE=Release -DGGML_CUDA=ON -DGGML_VULKAN=OFF -DGGML_RPC=OFF -DGGML_BLAS=OFF -DCMAKE_CUDA_ARCHITECTURES="121"
cmake --build build --config Release -j $(nproc)
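If the build still fails with that, it may be worth confirming what compute capability the card actually reports before pinning the arch (assuming a recent enough driver for this query), and pasting the first compile error here:

# print the GPU's compute capability, e.g. 12.1 on the DGX Spark's GB10
nvidia-smi --query-gpu=compute_cap --format=csv,noheader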