zhanghanxiao committed (verified)
Commit 60f0774 · 1 Parent(s): 0a79af0

Update README.md

Files changed (1): README.md (+21 −0)
README.md CHANGED
@@ -22,11 +22,22 @@ This curriculum greatly enhances the model’s efficiency and reasoning depth, a
 
 ### Flagship-Level Efficient Reasoning
 
+<p align="center">
+  <img src="https://mdn.alipayobjects.com/huamei_bcz3yt/afts/img/X7mZSJQX_fsAAAAAT_AAAAgADkV7AQFr/original"/>
+</p>
+
+<p align="center">
+  <img src="https://mdn.alipayobjects.com/huamei_bcz3yt/afts/img/DZ1kSKT57J0AAAAAUOAAAAgADkV7AQFr/original"/>
+</p>
+
 We comprehensively evaluated Ling-1T against leading flagship models, including both **open-source giants** (e.g., *DeepSeek-V3.1-Terminus*, *Kimi-K2-Instruct-0905*) and **closed-source APIs** (*GPT-5-main*, *Gemini-2.5-Pro*).
 Across code generation, software development, competition-level mathematics, professional math, and logical reasoning, Ling-1T consistently demonstrates **superior complex reasoning ability** and overall advantage.
 
 In the **AIME 25** benchmark, Ling-1T extends the **Pareto frontier** of reasoning accuracy vs. reasoning length, showcasing its strength in **“efficient thinking and precise reasoning.”**
 
+<p align="center">
+  <img src="https://mdn.alipayobjects.com/huamei_bcz3yt/afts/img/CNhVT4sGM0kAAAAAciAAAAgADkV7AQFr/original"/>
+</p>
 
 ### Aesthetic Understanding and Front-End Generation
 
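To make the accuracy-versus-length trade-off named in this hunk concrete, here is a minimal sketch of how a Pareto frontier over (average output tokens, accuracy) points is computed. The model names and numbers are made-up placeholders, not actual AIME 25 results.

```python
# Minimal sketch: Pareto frontier of reasoning accuracy vs. reasoning length.
# The (avg_output_tokens, accuracy) pairs below are illustrative placeholders.
models = {
    "model_a": (12000, 0.70),
    "model_b": (18000, 0.74),
    "model_c": (9000, 0.61),
    "model_d": (15000, 0.68),
}

def pareto_frontier(points: dict) -> list:
    """A model is dominated if another is at least as accurate while emitting
    no more tokens (and strictly better on one axis); the frontier is
    everything not dominated."""
    frontier = []
    for name, (length, acc) in points.items():
        dominated = any(
            other != name
            and o_len <= length and o_acc >= acc
            and (o_len < length or o_acc > acc)
            for other, (o_len, o_acc) in points.items()
        )
        if not dominated:
            frontier.append(name)
    return frontier

print(pareto_frontier(models))  # ['model_a', 'model_b', 'model_c']; model_d is dominated
```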
 
@@ -66,6 +77,10 @@ FP8 mixed-precision training yields **15 %+ end-to-end speedup**, improved memor
 A fine-grained, **heterogeneous 1F1B interleaved pipeline** further boosts utilization by 40 %+.
 System-level optimizations—fused kernels, communication scheduling, recomputation, checkpointing, simulation, and telemetry—ensure stable trillion-scale training.
 
+<p align="center">
+  <img src="https://mdn.alipayobjects.com/huamei_bcz3yt/afts/img/StIxTrsy-_MAAAAAVTAAAAgADkV7AQFr/original"/>
+</p>
+
 Pre-training used over **20 T high-quality tokens**, with **> 40 % reasoning-dense data** in later stages.
 Mid-training introduced **curated chain-of-thought corpora** for “**reasoning pre-activation**”, improving downstream reasoning stability.
 A custom **WSM (Warmup–Stable–Merge)** LR scheduler with mid-train checkpoint merging simulates LR decay and boosts generalization.
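Ling-1T's actual FP8 recipe is not spelled out in this hunk; as a generic illustration of the building block behind FP8 mixed-precision training, here is a per-tensor-scaled e4m3 round-trip, assuming a PyTorch build that ships `torch.float8_e4m3fn`.

```python
import torch

# Illustrative sketch of per-tensor-scaled FP8 (e4m3) casting. This is NOT
# the Ling-1T training code; it only shows the quantize/dequantize round-trip.
E4M3_MAX = 448.0  # largest finite value representable in float8_e4m3fn

def to_fp8(x: torch.Tensor):
    """Scale a fp32/bf16 tensor into the e4m3 range, then cast to FP8.
    Returns the FP8 tensor plus the scale needed to invert it."""
    scale = E4M3_MAX / x.abs().max().clamp(min=1e-12)
    return (x * scale).to(torch.float8_e4m3fn), scale

def from_fp8(x_fp8: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Dequantize back to fp32 for high-precision accumulation."""
    return x_fp8.to(torch.float32) / scale

w = torch.randn(4096, 4096)
w_fp8, s = to_fp8(w)
print("max abs round-trip error:", (w - from_fp8(w_fp8, s)).abs().max().item())
```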
 
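The WSM schedule itself is only named here, not specified. Below is a hedged sketch of one plausible reading: linear warmup into a constant learning rate with no decay phase, with the effect of decay instead approximated by uniformly averaging ("merging") checkpoints saved along the stable phase. All function names and hyperparameters are assumptions.

```python
# Hedged sketch of a Warmup-Stable-Merge (WSM) LR schedule: warmup, then a
# constant LR that is never decayed; the usual decay is approximated by
# weight-averaging ("merging") checkpoints from the stable phase.
# Names and numbers are illustrative, not Ling-1T's actual configuration.

def wsm_lr(step: int, peak_lr: float = 3e-4, warmup_steps: int = 2000) -> float:
    if step < warmup_steps:                 # Warmup: linear ramp to peak
        return peak_lr * (step + 1) / warmup_steps
    return peak_lr                          # Stable: hold peak, no decay phase

def merge_checkpoints(checkpoints: list[dict]) -> dict:
    """Merge: uniform average of stable-phase checkpoints, playing the role
    that LR decay plays in a cosine or warmup-stable-decay schedule."""
    merged = {}
    for key in checkpoints[0]:
        merged[key] = sum(ckpt[key] for ckpt in checkpoints) / len(checkpoints)
    return merged

# Toy usage: average three scalar "weights" saved during the stable phase.
ckpts = [{"w": 1.0}, {"w": 1.2}, {"w": 0.9}]
print(wsm_lr(100), wsm_lr(5000), merge_checkpoints(ckpts))
```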
@@ -80,6 +95,12 @@ For reinforcement learning, we introduce **LPO (Linguistics-Unit Policy Optimiza
 Unlike GRPO (token-level) or GSPO (sequence-level) algorithms, LPO treats *sentences* as the natural semantic action units, enabling precise alignment between rewards and reasoning behavior.
 Empirically, LPO offers superior **training stability** and **generalization** across reasoning tasks.
 
+<p align="center">
+  <img src="https://mdn.alipayobjects.com/huamei_bcz3yt/afts/img/o10CRK8P8hwAAAAAWwAAAAgADkV7AQFr/original"/>
+</p>
+<p align="center">
+  <img src="https://mdn.alipayobjects.com/huamei_bcz3yt/afts/img/J7I6QZqI-6AAAAAAZHAAAAgADkV7AQFr/original"/>
+</p>
 
 ## Evaluation
 
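LPO's exact objective is not given in this hunk; as a hedged illustration of the one idea it does state (sentence-level rather than token- or sequence-level action units), the sketch below segments a sampled completion into sentences and applies one clipped importance ratio per sentence. The clipping form, names, and numbers are assumptions, not the published algorithm.

```python
import math
import re

# Hedged sketch of the stated LPO idea only: credit assignment at the
# *sentence* level, between GRPO's token level and GSPO's sequence level.

def split_sentences(text: str) -> list[str]:
    # Naive segmentation on end punctuation; a real system would use a
    # proper linguistic splitter.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def lpo_loss(logp_new: list[float], logp_old: list[float],
             advantage: float, clip_eps: float = 0.2) -> float:
    """One clipped importance ratio per sentence (token log-probs summed
    within each sentence), instead of per token or per whole sequence."""
    loss = 0.0
    for lp_new, lp_old in zip(logp_new, logp_old):
        ratio = math.exp(lp_new - lp_old)            # sentence-level ratio
        clipped = max(min(ratio, 1 + clip_eps), 1 - clip_eps)
        loss += -min(ratio * advantage, clipped * advantage)
    return loss / len(logp_new)

completion = "First we simplify the expression. Then we bound the error. Hence the answer is 42."
sents = split_sentences(completion)
# Per-sentence summed log-probs under the new and old policies (placeholders).
print(sents, lpo_loss([-3.1, -2.7, -1.9], [-3.0, -2.9, -2.0], advantage=1.0))
```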