zzqsmall committed (verified)
Commit 1883aba · 1 Parent(s): aad4a41

Update README.md

Files changed (1): README.md +1 -7
README.md CHANGED
@@ -21,7 +21,6 @@ Built on the Ling 2.0 architecture, Ling-1T is designed to push the limits of *e
  Pre-trained on **20 trillion+ high-quality, reasoning-dense tokens**, Ling-1T-base supports up to **128 K context length** and adopts an **evolutionary chain-of-thought (Evo-CoT)** process across mid-training and post-training.
  This curriculum greatly enhances the model’s efficiency and reasoning depth, allowing Ling-1T to achieve **state-of-the-art performance** on multiple complex reasoning benchmarks—balancing **accuracy** and **efficiency**.
 
- ---
 
  ### Flagship-Level Efficient Reasoning
 
@@ -30,7 +29,6 @@ Across code generation, software development, competition-level mathematics, pro
 
  In the **AIME 25** benchmark, Ling-1T extends the **Pareto frontier** of reasoning accuracy vs. reasoning length, showcasing its strength in **“efficient thinking and precise reasoning.”**
 
- ---
 
  ### Aesthetic Understanding and Front-End Generation
 
@@ -38,7 +36,6 @@ Ling-1T excels in visual reasoning and front-end code generation tasks, combinin
  We introduce a hybrid *Syntax–Function–Aesthetics* reward mechanism, enabling the model to not only generate correct and functional code but also demonstrate a refined sense of **visual aesthetics**.
  On **ArtifactsBench**, Ling-1T ranks **first among open-source models**, and the benchmark visualizations in this card were, in fact, *generated by Ling-1T itself*.
 
- ---
 
  ### Emergent Intelligence at Trillion-Scale
 
@@ -53,7 +50,6 @@ Ling-1T can:
 
  These capabilities form the foundation for **general, collaborative human–AI intelligence**, which we aim to advance together with the open-source community through Ling-1T’s release.
 
- ---
 
  ### Pre-Training at Trillion Scale
 
@@ -76,18 +72,16 @@ Pre-training used over **20 T high-quality tokens**, with **> 40 % reasoning-den
  Mid-training introduced **curated chain-of-thought corpora** for “**reasoning pre-activation**”, improving downstream reasoning stability.
  A custom **WSM (Warmup–Stable–Merge)** LR scheduler with mid-train checkpoint merging simulates LR decay and boosts generalization.
 
- ---
 
  ### Post-Training and Evo-CoT Optimization
 
  Built upon mid-training reasoning activation, post-training adopts **Evo-CoT (Evolutionary Chain-of-Thought)** for progressive reasoning enhancement under controllable cost.
  This approach continually expands the **Pareto frontier** of reasoning accuracy vs. efficiency—ideal for reflexive non-thinking models.
 
- For reinforcement learning, we introduce **LPO (Linguistics-Unit Policy Optimization)**—a novel sentence-level policy optimization method.
+ For reinforcement learning, we introduce **LPO (Linguistics-Unit Policy Optimization)** —a novel sentence-level policy optimization method.
  Unlike GRPO (token-level) or GSPO (sequence-level) algorithms, LPO treats *sentences* as the natural semantic action units, enabling precise alignment between rewards and reasoning behavior.
  Empirically, LPO offers superior **training stability** and **generalization** across reasoning tasks.
 
- ---
 
  ## Evaluation
 
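
The LPO paragraph touched by this diff describes sentence-level credit assignment only in prose. Below is a minimal, purely illustrative sketch of that idea, not Ling-1T's actual implementation: the `split_into_sentences` helper, the whitespace tokenization, and the per-sentence advantage values are hypothetical stand-ins chosen only to show how one reward weight can be broadcast over each sentence rather than varying per token.

```python
import re
from typing import List, Tuple


def split_into_sentences(text: str) -> List[str]:
    """Naive punctuation-based splitter; a stand-in for whatever linguistic
    segmentation a real LPO-style trainer would use."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def sentence_level_weights(response: str,
                           sentence_advantages: List[float]) -> List[Tuple[str, float]]:
    """Broadcast one advantage value to every token of its sentence, so the
    policy-gradient weight is constant inside a sentence (the "semantic action
    unit") instead of changing token by token as in token-level methods."""
    sentences = split_into_sentences(response)
    assert len(sentences) == len(sentence_advantages), "one advantage per sentence"
    weighted = []
    for sentence, advantage in zip(sentences, sentence_advantages):
        for token in sentence.split():  # whitespace tokens, illustration only
            weighted.append((token, advantage))
    return weighted


if __name__ == "__main__":
    response = "First we set up the equation. Then we solve for x. The answer is 4."
    advantages = [0.2, 0.9, 1.0]  # hypothetical per-sentence reward signals
    for token, advantage in sentence_level_weights(response, advantages):
        print(f"{token:12s} advantage={advantage}")
```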
 
 