John Leimgruber III

ubergarm

https://www.paypal.com/donate/?hosted_button_id=HU59345BZVSUA

AI & ML interests

Open LLMs and Astrophotography image processing.

Recent Activity

updated a model 14 minutes ago

ubergarm/DeepSeek-R1T-Chimera-GGUF

reacted to bartowski's post with 👍 about 16 hours ago

Was going to post this on /r/LocalLLaMa, but apparently it's without moderation at this time :')

https://huggingface.co/bartowski/mistralai_Mistral-Small-3.2-24B-Instruct-2506-GGUF

Was able to use previous mistral chat templates, some hints from Qwen templates, and Claude to piece together a seemingly working chat template, tested it with llama.cpp server and got perfect results, though lmstudio still seems to be struggling for some reason (don't know how to specify a jinja file there)

Outlined the details of the script and results in my llama.cpp PR to add the jinja template: https://github.com/ggml-org/llama.cpp/pull/14349

Start server with a command like this:

```
./llama-server -m /models/mistralai_Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M.gguf --jinja --chat-template-file /models/Mistral-Small-3.2-24B-Instruct-2506.jinja
```

and it should be perfect! Hoping it'll work for ALL tools if lmstudio gets an update or something, not just llama.cpp, but very happy to see it works flawlessly in llama.cpp

In the meantime, will try to open a PR to minja to make the strftime work, but no promises :)

liked a model about 16 hours ago

ddh0/tensor-type-testing

Organizations

None yet

upvoted a collection 16 days ago

YAQA

YAQA hessians (Sketch B) and models with the QTIP quantizer. See https://github.com/Cornell-RelaxML/yaqa/tree/main for more details. • 9 items • Updated 19 days ago • 2

upvoted a collection about 1 month ago

EXL3 models

23 items • Updated 12 days ago • 26

upvoted a collection about 2 months ago

Qwen3

72 items • Updated 11 days ago • 802

upvoted 3 collections 2 months ago

SkyReels-V2

Infinite-length Film Generative Model • 17 items • Updated 12 days ago • 44

Gemma 3 QAT

Quantization Aware Trained (QAT) Gemma 3 checkpoints. The model preserves similar quality as half precision while using 3x less memory • 15 items • Updated 27 days ago • 201

GLM-4-0414

GLM-4-0414 series model • 8 items • Updated Apr 15 • 127

upvoted an article 2 months ago

Article

Introduction to ggml

By … and 2 others • Aug 13, 2024 • 213

upvoted an article 3 months ago

Article

Comparing sub 50GB Llama 4 Scout quants (KLD/Top P)

Apr 9 • 41

upvoted a collection 4 months ago

FP8 LLMs for vLLM

Accurate FP8 quantized models by Neural Magic, ready for use with vLLM! • 44 items • Updated Oct 17, 2024 • 74

upvoted 2 articles 5 months ago

Article

Open-R1: a fully open reproduction of DeepSeek-R1

By … and 2 others • Jan 28 • 868

Article

The SOTA Text-to-speech and Zero Shot Voice cloning model that no one knows about...

Jan 20 • 68

upvoted 2 collections 5 months ago

Qwen2.5-1M

The long-context version of Qwen2.5, supporting 1M-token context lengths • 3 items • Updated Apr 28 • 119

Qwen2.5-VL

Vision-language model series based on Qwen2.5 • 11 items • Updated Apr 28 • 496

upvoted 3 collections 9 months ago

Llama 3.2 3B & 1B GGUF Quants

Llama.cpp compatible quants for Llama 3.2 3B and 1B Instruct models. • 4 items • Updated Sep 26, 2024 • 46

Llama 3.1 GPTQ, AWQ, and BNB Quants

Optimised Quants for high-throughput deployments! Compatible with Transformers, TGI & VLLM 🤗 • 9 items • Updated Sep 26, 2024 • 56

Qwen2-VL

Vision-language model series based on Qwen2 • 16 items • Updated Apr 28 • 219

upvoted a collection about 1 year ago

abliterated-v3

Latest gen of the abliterated models I've produced • 17 items • Updated Jun 3, 2024 • 123