🏗️ Building on HF

PhysiQuanty PRO

PhysiQuanty

AI & ML interests

Theoretical Physics, Invariant Tokenization, Standard Model of Particle Physics Applied ML 🇫🇷

Recent Activity

upvoted an article 38 minutes ago

🔁 Teaching a 15M French LLM to think deeper — and to know when to stop 🇫🇷

reacted to RDTvlokip's post with 🚀 40 minutes ago

I finally changed the architecture of my 15M French LLM. It worked. Then I almost fooled myself about how much, and catching that was the real win.
After proving last time that architecture is a threshold, not a lever, I got stubborn: could I change how the model learns? Four honest attempts: Lion, a sharper AdamW β2, multi-token prediction, LayerScale. Four failures. The bottleneck wasn't the learning rule either.
So I changed the shape of the computation instead: loop the same transformer blocks 4×, deeper reasoning, zero added parameters. It beat the baseline on perplexity, the first thing in the whole project to move that number. Then I added my own twist: let each token decide how deep to think, halting on its own entropy.
My first evaluation was spectacular. Coherence up 65%. Hallucinated names down 62%. It was noise. Eight prompts, one seed. I re-ran on 50 prompts × 200 tokens and watched the gains shrink to "modest", and on out-of-domain prompts, recurrence actually made things worse. No universal winner.
And none of it is new: it's Adaptive Computation Time (2016), the Universal Transformer (2018), and LoopViT (2026), recombined and measured honestly.
The real lesson: A number from 8 prompts is a rumor. The eval harness that kills your own best result is worth more than the result it kills. Cite your lineage. Stay preliminary until multiple seeds say otherwise.
The three models are live. The write-up is honest about every caveat 👇 🔗 https://huggingface.co/blog/RDTvlokip/teaching-a-15m-french-llm-to-think-deeper
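
To make the mechanism in the post concrete, here is a minimal pure-Python sketch of weight-tied looping with per-token entropy halting. It is not RDTvlokip's implementation: the names `looped_forward`, `block`, `readout`, `max_loops` and `entropy_threshold` are hypothetical, and the toy `block` acts on each token independently, whereas a real transformer block would also mix tokens through attention.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def looped_forward(hidden, block, readout, max_loops=4, entropy_threshold=0.5):
    """Apply the SAME block up to max_loops times per token (no new parameters).

    A token stops looping once the entropy of its predicted distribution
    drops below entropy_threshold. All names and defaults are illustrative.
    Returns (final states, number of loops used per token).
    """
    states = [list(h) for h in hidden]
    loops_used = [0] * len(states)
    active = [True] * len(states)
    for _ in range(max_loops):
        if not any(active):
            break
        for i, h in enumerate(states):
            if not active[i]:
                continue
            states[i] = block(h)          # shared weights on every pass
            loops_used[i] += 1
            probs = softmax(readout(states[i]))
            if entropy(probs) < entropy_threshold:
                active[i] = False         # confident enough: halt this token
    return states, loops_used

# Toy stand-ins (assumptions): the block doubles the state, which sharpens
# the logits; the readout is the identity.
toy_block = lambda h: [2.0 * x for x in h]
toy_readout = lambda h: h

states, loops = looped_forward([[2.0, 0.0, 0.0], [0.1, 0.0, 0.0]],
                               toy_block, toy_readout)
print(loops)
```

In this toy setup the first token becomes confident after a single pass, while the second stays uncertain and uses the full budget of four loops, which is the "each token decides how deep to think" behaviour the post describes.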

reacted to RDTvlokip's post with 🔥 40 minutes ago

Posts 5

Post

4786

🌐 We crawled the entirety of Hugging Face to help the community! Huge thanks to the Hugging Face API 🌐
🤖 2.91M model repos (file names included), 📚 1.02M dataset repos, 🚀 1.31M Space repos
🤗 617,501 committers (datasets and models). We’ll share Hugging Face statistics with you in the coming days.

We also identified 61,398 users with “AI/ML Interests”, and NOW we can find each other through our “AI/ML Interests”🤗
HF-Collab-Center/Searching-For-HuggingFace-Users
HF-Collab-Center/All-Model-Repos
HF-Collab-Center/All-Dataset-Repos
HF-Collab-Center/All-Space-Repos

HF-Collab-Center/HF-Users
HF-Collab-Center/HF-Users-with-last-seen
HF-Collab-Center/HF-Users-With-AI-ML-Interests-Only

Made By @QuantaSparkLabs and @PhysiQuanty
It's French, well... in English... but it's French ;)

Post

4623

🧬 You can now find out whether your cognitive soulmate has already existed among 50k anonymized profiles ✨

SpiceeChat/Check-If-Your-Soulmate-Has-Already-Existed
SpiceeChat/OkCupid-59k-Anonymized-Profiles
https://dating-fatigue.com/

You seek them: 79.7% | They may seek you: 84.1% (coming soon)

🔥 Powered by open source and too much coffee 🔥

spaces 4

Recherche d'Utilisateur HF FRANCAIS

Search Hugging Face users by AI/ML interests and activity

Searching For HuggingFace Users

Search Hugging Face profiles by AI/ML interests

Binary-LLM-POC (base2)

Chat with a binary‑encoded language model

Patent-Test-AutoTokenizer-SFT (base 65536)

Chat with a fine‑tuned language model using custom encoding

models 13

PhysiQuanty/Self-Predicting-Gradient-Descent

Updated 4 days ago

PhysiQuanty/Binary-LLM-POC

Text Generation • 10.7M • Updated May 2 • 23 • 10

PhysiQuanty/Wiki-Test2

79.7M • Updated Mar 4 • 4

PhysiQuanty/Wiki-Test

79.7M • Updated Mar 3 • 2 • 1

PhysiQuanty/Patenty-Test2-Radix-65536

79.7M • Updated Mar 2 • 2

PhysiQuanty/VOCAB-4294967296-FOUR-LOGITS-256

13.7M • Updated Feb 25 • 2 • 2

PhysiQuanty/Patent-Dual-Cross-Entropie

46.5M • Updated Feb 24 • 3

PhysiQuanty/Patent-Test-Radix-65536-AutoTokenizer_FineTune

79.7M • Updated Feb 23 • 3

PhysiQuanty/Patenty-0.1

79.7M • Updated Feb 22 • 2

PhysiQuanty/Patenty1-0.1B

79.7M • Updated Feb 19 • 2

datasets 32

PhysiQuanty/LIGO-VIRGO-O4b-16KHZ

Viewer • Updated 9 days ago • 1.32B • 196 • 1

PhysiQuanty/FRENCH-ONLY-Common-Crawl-2026-25

Viewer • Updated 9 days ago • 8.04M • 571 • 2

PhysiQuanty/Supervision-PLF-PLR-Prevision-Execution-2019

Viewer • Updated 14 days ago • 3.07k • 27

PhysiQuanty/Supervision-Projet-Loi-Finance-Prevision-Realisation

Updated 16 days ago • 28

PhysiQuanty/Catalogue-Data-Sud

Viewer • Updated 20 days ago • 2.29k • 35

PhysiQuanty/merge_brevet

Updated 25 days ago • 20

PhysiQuanty/DVF-2014-Region-Sud

Viewer • Updated about 1 month ago • 194k • 11

PhysiQuanty/DVF-2019-Region-Sud

Viewer • Updated about 1 month ago • 282k • 13

PhysiQuanty/BaseN_Merge_Tok

Updated Mar 8 • 12

PhysiQuanty/BaseN_Merge

Viewer • Updated Mar 8 • 150k • 17
