“The doom lies in yourself, not in your name.”
Continuation of Wur doomed!.
For longer text chunks or stories, https://pastebin.com works great and helps prevent the thread from slowing down!
🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧🟧
🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛⬛🟧
🟧🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧🟧
⬜🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛🟧⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧⬛🟧⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧⬛⬛⬛⬛🟧⬛⬛⬛⬛🟧🟧⬛⬛⬛🟧⬛⬛⬛🟧🟧⬛⬛⬛⬛🟧⬛⬛⬛🟧🟧🟧⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧⬛⬛⬛⬛🟧🟧🟧⬛⬛⬛⬛⬛⬛⬛⬛🟧⬛⬛⬛⬛⬛⬛⬛⬛🟧🟧🟧⬛⬛🟧⬜🟧⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛⬛⬛⬛⬛🟧🟧⬜🟧🟧⬛⬛⬛⬛⬛⬛🟧🟧🟧⬛⬛⬛⬛⬛⬛🟧🟧⬜🟧⬛⬛🟧⬜🟧⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛⬛⬛⬛🟧🟧⬜⬜⬜🟧🟧⬛⬛⬛⬛🟧🟧⬜🟧🟧⬛⬛⬛⬛🟧🟧⬜⬜🟧🟧⬛🟧⬜🟧⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛⬛⬛🟧🟧⬜⬜⬜⬜⬜🟧🟧⬛⬛🟧🟧⬜⬜⬜🟧🟧⬛⬛🟧🟧⬜⬜⬜⬜🟧🟧🟧⬜🟧⬛⬛⬛🟧⬜
⬜🟧⬛⬛⬛⬛🟧🟧⬜⬜⬜⬜⬜⬜⬜🟧🟧🟧🟧⬜⬜⬜⬜⬜🟧🟧🟧🟧⬜⬜⬜⬜⬜⬜⬜⬜⬜🟧🟧⬛⬛🟧⬜
⬜🟧⬛⬛⬛🟧🟧⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜🟧⬛⬛🟧⬜
⬜🟧⬛⬛🟧🟧⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜🟧🟧⬛🟧⬜
⬜🟧⬛🟧🟧⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜🟧⬛🟧⬜
⬜🟧🟧🟧⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜⬜🟧🟧🟧⬜
The doom is still buried within Command-A for sure.
A step 601 preview - all with temperature = 0:
- It's still messing up some ends of lines, but I can live with that if it works... It can likely be fixed later using the new class 0 random data if it turns out to be a problem.
- The Grimdark story was noticeably (much!) better compared to the inverse.
- The Battlestar Galactica story showed that even though Q8_0, F16 and BF16 all diverge slightly from F32, the divergence isn't clearly making them any worse (I actually liked the Q8_0 story best!).
| Size | Name |
|---|---|
| 287M | command-a-03-2025-lora-Q8_0.gguf |
| 541M | command-a-03-2025-lora-F16.gguf |
| 541M | command-a-03-2025-lora-BF16.gguf |
| 1.1G | command-a-03-2025-lora-F32.gguf |
It still has a way to go before it starts to converge, but I would think by step 1000 it will be pretty close:
566 responses in the previous thread! At this rate we may be the reason HF staff implement a multi-page view for discussions.
This was posted on Hacker News today:
Absolutely fascinating!
That was really cool. Thanks for sharing!
Yeah, and llama-3.1:405b doing so well was quite a surprise too (and it makes you a bit sad that everything seems to be moving away from large dense models).
Oh they didn't tokenize think tags? That's... weird. I'm surprised I didn't come across mangled think tags yet. Thankfully, they tokenized the rest of ChatML.
I haven't tested it a lot yet, but does the CoT get worse after multiple turns?
Not sure I got the question right, it was presented as a quote. I never keep the CoT in my chat history, so it shouldn't have reasons to. If you're asking if the CoT gets longer or more deteriorated in long context, that's actually an interesting theory, I didn't think of that. I can't really confirm, but I feel there's probably some truth to it. It was more compact when I ran the model initially on a blank assistant, yeah. Most of my attempts were in the 16-20K context range, within existing long form scenarios and dialogs I keep on the side for occasions like this. I blamed it on the fact it was rp/creative stuff, and not on the length itself. I'll have to check.
That said, my main concern is with open-ended narrative, when the user role doesn't ask for a specific action but for something like "oh please, (charname), you can surely do better than that". In my still limited experience with the model (I'll look at it in more depth in the coming days), that was a surefire way to get 4K worth of indecisive CoT: "I could do this, but I could also do this, and what about that system message from 20 messages ago, oh wait, I think it's resolved, maybe then I could do this. But wait... Oh, but he asked for "better than that", is it really better? And what about my system prompt?" but stretched over paragraphs.
Also, does anyone have any idea what that "environment" role in their jinja template is? First time I've seen it.
Most of my attempts were in the 16-20K context range
According to the config, this is an interleaved sliding window attention model (like Gemma), stretched to long context with YaRN.
…So yeah, there's a huge risk of it falling apart in the 16K+ range.
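For intuition, interleaved sliding-window attention can be sketched as a per-layer attention mask. The window size (4) and the alternating layer pattern below are illustrative assumptions, not the model's actual configuration:

```python
# Sketch of an interleaved sliding-window causal attention mask.
# Window size and layer pattern here are made up for illustration.
import numpy as np

def causal_mask(seq_len, window=None):
    """True where query position i may attend to key position j."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    mask = j <= i                 # causal: never attend to future tokens
    if window is not None:
        mask &= (i - j) < window  # keep only the most recent `window` keys
    return mask

# Interleave: even layers get full causal attention, odd layers a small window.
layer_masks = [causal_mask(8, window=None if layer % 2 == 0 else 4)
               for layer in range(4)]
```

Only the full-attention layers can directly connect tokens more than `window` positions apart, so once the context stretches past the effectively trained range, quality can drop off the way described above.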
https://habr.com/en/companies/sberdevices/articles/968904/
Info from GigaChat devs:
For Ultra we substantially updated the pretraining corpus: it has grown to 14 trillion tokens. We added 10 languages, from Chinese and Arabic to Uzbek and Kazakh, and expanded the set of sources: books, academic data, code and math datasets. All data goes through deduplication, language filtering, and automated quality checks using heuristics and classifiers.
Synthetic data made a key contribution to quality: we generated about 5.5 trillion tokens of synthetic data. The corpus includes question-answer pairs over texts, reverse-prompt chains for structuring data, LLM notes with the model's commentary embedded in texts, and millions of synthetic math and competitive-programming problems with solutions (and synthetic tests) based on PromptCot. Generation runs on high-throughput pipelines with full GPU utilization; without this, such volumes would be impossible. In internal experiments, synthetic data significantly improved all the metrics that matter, such as MMLU and MMLU-PRO.
Out of 14T tokens 5.5T are synthetic, almost 40%!
Improved chat template
We present an updated dialogue template with key improvements: Role hierarchy. Our new template uses a system of roles to manage the dialogue. Roles are listed in descending order of priority: instructions from roles higher in the list carry more weight.
The list of roles:
- developer system (developer instruction): sets global, non-negotiable rules (role descriptions, communication and function-calling policy); has the highest priority.
- system (system instruction): describes the specific task or response style for the given dialogue.
- user (user request): a request or message from the user.
- function descriptions: available tools, described in TypeScript format.
- function result: the response received after a tool call.
- user memory: up-to-date long-term facts about the user, in JSON format.
- added files: metadata of files available for use in the dialogue.
Assistant roles:
- assistant (assistant response): the model's replies to the user.
- function call: the model's decision to call a tool, and the arguments for it.
They added a whole bunch of new tags besides user and assistant, but how well trained are those?
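As a rough illustration of what such a role hierarchy could look like when flattened into a prompt: the role names below come from the article, but the tag syntax and the `priority` helper are purely my own invention, not GigaChat's actual template.

```python
# Hypothetical sketch of a role-hierarchy chat template.
# Role names are from the GigaChat article; the <|role|> tag format
# and the priority lookup are invented here for illustration.
ROLE_PRIORITY = [
    "developer system",       # global, non-negotiable rules
    "system",                 # per-dialogue task / style
    "user",
    "function descriptions",
    "function result",
    "user memory",
    "added files",
]
RANK = {role: i for i, role in enumerate(ROLE_PRIORITY)}

def priority(role):
    """Smaller = higher priority; assistant-side roles sort last."""
    return RANK.get(role, len(ROLE_PRIORITY))

def render(messages):
    """Flatten messages into a tagged prompt string."""
    return "".join(f"<|{m['role']}|>{m['content']}" for m in messages)
```

The interesting question, as noted above, is how much training the rarer roles (user memory, added files) actually received compared to the classic system/user/assistant trio.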
https://github.com/aomukai/Writingway2
Saw this posted on Reddit today. Hopefully less bloat than the first version!
Got very bad vibes from GigaChat very quickly. It started out nice, but the quality went to shit within 5 messages. It did not follow the scenario and started repeating itself. Increasing DRY did not help.
Traditional slop being "traditional slop-style phrases", meaning "eyes welling up", tapestries, shivers/whispers.
Big-Picture AI patterns being Not-X Y, Parroting, etc.
Is that right?
It turns out I was wrong: Olmo 3 might also have issues with slop. I call this the "Bustling Village" test. You see what happens with only the prompt "One day, in the bustling village of".
OLMo 2 32B @ 6T tokens pre-training (stage 1)
One day, in the bustling village of Upton-On-Severn, a young girl caught sight of his eyes, and instantly fell madly in love with him. He reciprocated her feelings, but alas, her family did not approve of the match; a farmer's boy was no match for their daughter, and so they banned her from going anywhere near the handsome man with the strange eyes.Heartbroken, but unwilling to be parted from each other, he found her one evening, and they met on the green. "I do love you," he said, "and if I must, I will marry you and we will live out our lives in poverty together, but if you cannot face this life, then we shall end it for both of us, together."
He drew a small knife from his pocket, and she reached towards it, her mind not made up. As her hand strayed towards the blade, he dropped the knife. "The Gods have spoken," he said "we cannot be joined."
Heartbroken, he returned to his farm, and sat and stared at his hands, trying to comprehend what had happened, unable to. As he did so, his thumb began to lengthen, to elongate until it was longer than the other fingers, and then it withered and blackened and turned to stone, just as his heart had done. His despair overcame him, and he died, there in his chair.
There, upon his farm, his body remained until his family buried him, and they never noticed that his hands were any different to normal; and they never noticed that the plants watered with the water from the well had grown just a little more green, and a little more rich.
When next the girl came, mourning for the man she might have loved, and drew water from the well, she could not know what it was that happened next. The fingers reached up through the sodden earth and touched the wellhead, and when she drew the water, her fingers touched his and her eyes went wide.
And then, from her house, a scream so fell and full of despair was heard, and everyone rushed to see what had happened, but all they found was an empty bucket, and a withered corpse by the well.
Now, to this very day, in a village just a short drive from where you are, in the very shadow of the Severn Bridge itself, there lies an old well in the cellar of a house; and every so often, people go missing, and their bodies are found, bloated and drowned; and even after they are buried, occasionally they are dug up, just to make sure, and found to be missing; and some folk talk of the ghost of the well, but you can tell them, for you have been told the truth of the matter now, that they weren't drowned, but that their lips met those of the boy with the long, stony fingers, and their lips were kissed away, and their life-essence sucked out of them. And if you visit that village, and Uley Court as it is known these days, keep your wits about you, and do not look into that well, for his fingers are still there, and he is so very hungry for love.
Coutesy of the creative mind of Telenia (c) 2004
The Boy with the Long Stone Fingers
Olmo 3 32B @ 5T tokens pre-training (stage 1)
One day, in the bustling village of Eldergrove, the residents were troubled by a series of thefts. The thief remained unknown, leaving no clues behind. But all that changed when a mysterious figure appeared in the village square. It was Toma, an enigmatic young woman with a haunted past. She possessed the extraordinary ability to read people's thoughts and emotions, a talent passed down through her family for centuries.As the villagers gathered around, Toma approached the stolen items, holding them delicately in her hands. Closing her eyes, she chanted ancient incantations, tapping into the power of her lineage. Instantly, a vision came to her mind, revealing the identity of the thief. The shock in the crowd was palpable as they witnessed Toma naming the thief, who immediately confessed his crime.
Word of Toma's uncanny ability quickly spread throughout the land, and soon she was sought out by kings and queens, noble families, and the common folk alike. Toma's reputation grew, and she became a respected figure known for her unique gift.
In her journey, Toma met a kind-hearted blacksmith named Eli. They formed a deep bond, united by their shared desire to help those in need. Together, they used Toma's powers to uncover injustices, solve mysteries, and restore balance to the world.
But Toma's powers came with a great price. Each time she delved into someone's mind, she carried a part of their emotions and memories with her. Over time, this began to weigh heavily on her spirit, draining her energy and causing turmoil within herself.
Determined to protect those around her, Toma embarked on a quest to find an ancient artifact, said to hold the key to unlocking the full potential of her powers. Guided by cryptic clues and legends, Toma and Eli ventured into treacherous territories, encountering mythical creatures and facing formidable challenges.
As they journeyed, Toma discovered that her powers were not simply a gift but a responsibility. She learned to control them, filtering out unnecessary emotions and focusing on her mission. Through perseverance and sacrifice, she and Eli found the artifact, an amulet that enhanced Toma's abilities.
With newfound strength, Toma continued her work, using her powers to bring justice and peace to a troubled land. She became a symbol of hope, inspiring others to harness their own hidden talents and stand up against injustice.
Toma and Eli's adventures took them to far-off lands, where they encountered individuals who possessed extraordinary gifts of their own. They formed a group known as "The Gifted" and dedicated themselves to protecting the innocent and fighting for the greater good.
Together, they faced powerful sorcerers, unraveled ancient curses, and battled against dark forces that threatened the realm. With Toma's unique ability to read minds and their combined strengths, they became unstoppable.
As the years passed, Toma and Eli's bond grew stronger, and their love blossomed. They became a beacon of light, their presence instilling faith in the hearts of the people. Toma's powers evolved further, allowing her to see beyond what was visible, to understand the deeper truths that lay beneath the surface.
In their final battle against a malevolent sorcerer seeking to unleash chaos upon the world, Toma and her companions fought with all their might. The clash of powers shook the land, and the fate of the realm hung in the balance.
In a moment of triumph, Toma tapped into her full potential, harnessing the strength of her ancestors and the power of the amulet. With a surge of energy, she banished the sorcerer and his darkness, ensuring peace and harmony for the people.
But Toma's victory came at a great cost. The amulet shattered, rendering her powers dormant. The group mourned the loss, but they knew that Toma's legacy would live on through their memories and the impact she had made.
Eli, with unwavering love and dedication, became Toma's anchor, guiding her through the challenges of life without her extraordinary powers. Together, they retired to a peaceful village, where they lived out their days as respected elders, sharing the wisdom they had gained from their adventures.
Toma's story became a legend, passed down from generation to generation. Her name was whispered with reverence, reminding people of the transformative power of compassion, bravery, and the discovery of one's true potential.
Though Toma may have lost her supernatural abilities, her journey had forever changed her. She had come to understand that her strength lay not in the extraordinary but in the ordinary acts of kindness, resilience, and the power of the human spirit.
And so, the tale of Toma and her adventures with the Gifted continues to inspire new generations, reminding us all that within each of us lies a unique gift waiting to be discovered and wielded for the greater good.
You run this query many times to see what happens. I usually use 0.01 min-p, 0.99 top-p, "warmer than usual" temperature, like 1.1. OLMo 2 32B Base (at 6T tokens, first stage of pre-training) is quite unslopped. Olmo 3 32B Base (at 5T tokens, first stage of pre-training) is unfortunately riddled with AI-slop language.




