(solved non-issue, City96 GGUF) llama-cpp-python compatibility?

#3
by gnivler - opened
QuantStack org

The model is intended to be used with ComfyUI. I don't even know if there's a llama.cpp fork with support for it?

Yeah, I've got it running in ComfyUI and I'm troubleshooting horrid inference speed with low GPU utilization (as if it were running on the CPU, but neither is saturated), so I tried the CLI to verify CUDA support in the library; that seems to check out (rough sketch of the check below).
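
For anyone searching this later, a minimal sketch of the kind of library-side check I mean, assuming llama-cpp-python is installed and that the binding exposes llama_supports_gpu_offload from the llama.cpp C API (the model path below is just a placeholder):

```python
# Rough check that the installed llama-cpp-python build has GPU offload
# (e.g. CUDA) compiled in; a CPU-only wheel should report False here.
import llama_cpp

print("llama-cpp-python version:", llama_cpp.__version__)
print("GPU offload supported:", llama_cpp.llama_supports_gpu_offload())

# If it reports True, loading with n_gpu_layers=-1 should offload all layers
# to the GPU (placeholder path; verbose=True prints the offload summary):
# llm = llama_cpp.Llama(model_path="model.gguf", n_gpu_layers=-1, verbose=True)
```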

I may be mistakenly assuming that llama-cpp-python is used under the hood by all the GGUF model loaders?

edit - yep, it's not llama-cpp-python; that was a total dead-end red herring. I just looked at the loader module ComfyUI is actually using: https://github.com/city96/ComfyUI-GGUF

thanks for the reply

gnivler changed discussion title from llama-cpp-python compatibility? to (solved non-issue, City96 GGUF) llama-cpp-python compatibility?
