Request for an IQ5 Quant

#19
by binahz - opened

Hey, I know this model is kinda old news now, but imo it's still one of the best for intelligence, longer-context performance, and a nice writing style. Could you please create an IQ5 quant for it, similar to the ones you made for Deepseek v3.1? That would make it perfect for 768GB systems with 24GB VRAM. The only other one that comes close, by anikifoss, seems to have been quanted in such a way that moving any of the full layers to GPU heavily hampers inference performance...
