Great Work!
Great work, as usual!
How do you find this model so far?
I haven't done too much with it other than basic testing. It seems like it can code okay, and rumors I hear are:
From the OpenRouter Discord, it sounds like it could be good for coding and STEM knowledge, but is slightly broken (possibly the model itself, but more likely some backend bug).
Looks like they are trying to fix it: https://huggingface.co/inclusionAI/Ling-1T/discussions/7#68efdf17ea21691631f93c54
I have run it up to around 40k context depth over 6 turns or so, having it process a journal article, then asking for pseudocode, and finally for a C++ diff patch. The patch only partially applied and of course wasn't right, but there were no "catastrophic" failures and it stayed cogent throughout... probably needs more testing though.
It has a really low absolute perplexity, which suggests it may have memorized the English wiki.test.raw, I suppose; kind of interesting either way. The Q8_0 comes in under 2.0 perplexity, fwiw.
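For anyone curious what that sub-2.0 number actually implies: perplexity is just the exponential of the mean per-token negative log-likelihood over the test file, so under 2.0 means the model is assigning a geometric-mean probability better than 1/2 to each next token of wiki.test.raw. A minimal sketch of the arithmetic (a generic helper for illustration, not the actual llama-perplexity code):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood per token.

    token_logprobs: natural-log probabilities the model assigned to each
    observed token (e.g. collected while scoring wiki.test.raw).
    """
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Toy example: a model giving ~0.6 probability to every next token
# lands at 1/0.6 ~= 1.67 perplexity, i.e. comfortably under 2.0.
print(perplexity([math.log(0.6)] * 100))
```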
Had some discussion on Discord about it too, and also about the related bailing_moe models like Ling-1T and the flash version: https://www.reddit.com/r/LocalLLaMA/comments/1o9g4if/comment/nk2ieak/
Curious to hear what others think about its coding and role-play abilities, etc.
Oh, check this out: it's doing well on some benchmarks and working with coding agent stuff: https://huggingface.co/ubergarm2/Ling-1T-GGUF/discussions/1#68f311b69a34602b1a496f39
Thanks for sharing the bleeding edge updates, this model appears to be more than just hype! I'm excited to see if it can best DeepSeek at coding.
I'm hoping to release some larger ik-specific quants soon and get some perplexity data and a chart going. I'll likely keep the attn/first 4 dense layers/shexp at full q8_0 for some of the bigger iq5_k type quants, if I make that big boi.
I'll wait for your quants... not enough SSD space on my setup to efficiently deal with 2TB-sized FP16 GGUFs.
The big one is almost finished uploading, and the perplexity data graph is now up too. I know you often prefer no imatrix, but this particular one has imatrix data applied to all tensors. It should at least give a good feel for the model!
And right, I'm almost out of disk space on a couple of big RAID arrays already!
@ubergarm Thank you so much John! Without your help, I wouldn't be able to run models like Ling-1T. I am running the Ling-1T smol-IQ4_KSS (TG speed around 12 t/s). I need to spend more time with the model, but so far it seems to be very smart.
I asked, "In Han Kang's "Vegetarian", how many narrators are there in the Novel?". Not many models get this right, but Ling gave me a correct answer.
I also asked, "Tell me about the temporal summation in the context of neuroscience and concentration.". When I ask this to Kimi K2, Kimi K2 answers my question as if I am a subject matter expert. When I ask this to DeepSeek, DeepSeek answers as if I know nothing about the subject. Ling's answer is in the middle. Ling's answer is somewhat technical, but not as extreme as Kimi K2. Kimi K2 is quirky, and that is why it's charming, but for reliable answers I think I will use DeepSeek or Ling.