Update README.md

README.md CHANGED
@@ -21,7 +21,6 @@ Built on the Ling 2.0 architecture, Ling-1T is designed to push the limits of *e

Pre-trained on **20 trillion+ high-quality, reasoning-dense tokens**, Ling-1T-base supports up to **128 K context length** and adopts an **evolutionary chain-of-thought (Evo-CoT)** process across mid-training and post-training.

This curriculum greatly enhances the model’s efficiency and reasoning depth, allowing Ling-1T to achieve **state-of-the-art performance** on multiple complex reasoning benchmarks—balancing **accuracy** and **efficiency**.

- ---

### Flagship-Level Efficient Reasoning

@@ -30,7 +29,6 @@ Across code generation, software development, competition-level mathematics, pro

In the **AIME 25** benchmark, Ling-1T extends the **Pareto frontier** of reasoning accuracy vs. reasoning length, showcasing its strength in **“efficient thinking and precise reasoning.”**

- ---

### Aesthetic Understanding and Front-End Generation

@@ -38,7 +36,6 @@ Ling-1T excels in visual reasoning and front-end code generation tasks, combinin

We introduce a hybrid *Syntax–Function–Aesthetics* reward mechanism, enabling the model to not only generate correct and functional code but also demonstrate a refined sense of **visual aesthetics**.
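
The reward composition is not spelled out in this card; as a rough illustration only, the snippet below shows one way a syntax, functional-correctness, and aesthetics signal could be folded into a single scalar reward. The weights and scorer inputs are assumptions, not the released mechanism.

```python
# Rough sketch of a hybrid Syntax–Function–Aesthetics reward.
# Weights and scorer inputs are illustrative assumptions, not the
# actual Ling-1T reward implementation.
from dataclasses import dataclass

@dataclass
class RewardWeights:
    syntax: float = 0.2      # does the generated code parse / compile?
    function: float = 0.5    # does it pass functional checks (e.g. unit tests)?
    aesthetics: float = 0.3  # visual-quality score, e.g. from a learned judge

def hybrid_reward(syntax_ok: bool, tests_passed: float, aesthetic_score: float,
                  w: RewardWeights = RewardWeights()) -> float:
    """Combine the three signals into one scalar reward in [0, 1].

    tests_passed and aesthetic_score are assumed normalized to [0, 1].
    Aesthetics only counts once the code is syntactically valid, so the
    model cannot trade correctness for style.
    """
    syntax_r = 1.0 if syntax_ok else 0.0
    aesthetics_r = aesthetic_score if syntax_ok else 0.0
    return w.syntax * syntax_r + w.function * tests_passed + w.aesthetics * aesthetics_r

# Example: valid code, 80 % of tests pass, the judge rates the layout 0.9.
print(hybrid_reward(True, 0.8, 0.9))  # 0.2*1.0 + 0.5*0.8 + 0.3*0.9 = 0.87
```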

On **ArtifactsBench**, Ling-1T ranks **first among open-source models**, and the benchmark visualizations in this card were, in fact, *generated by Ling-1T itself*.

- ---

### Emergent Intelligence at Trillion-Scale

@@ -53,7 +50,6 @@ Ling-1T can:

These capabilities form the foundation for **general, collaborative human–AI intelligence**, which we aim to advance together with the open-source community through Ling-1T’s release.

- ---

### Pre-Training at Trillion Scale

@@ -76,18 +72,16 @@ Pre-training used over **20 T high-quality tokens**, with **> 40 % reasoning-den

Mid-training introduced **curated chain-of-thought corpora** for “**reasoning pre-activation**”, improving downstream reasoning stability.

A custom **WSM (Warmup–Stable–Merge)** LR scheduler with mid-train checkpoint merging simulates LR decay and boosts generalization.
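
The WSM hyper-parameters and merging rule are not given here; the sketch below only illustrates the general shape of the idea under assumed settings: hold the learning rate constant after warmup instead of decaying it, and average checkpoints taken during the stable phase so the merged weights behave like a decayed model.

```python
# Illustrative sketch of a Warmup–Stable–Merge (WSM) style schedule.
# Hyper-parameters and the uniform-averaging merge rule are assumptions
# for illustration, not the Ling-1T training configuration.
import torch

def wsm_lr(step: int, warmup_steps: int = 2000, peak_lr: float = 3e-4) -> float:
    """Linear warmup, then a constant ("stable") learning rate.

    There is no decay phase: generalization is instead recovered by merging
    checkpoints taken during the stable phase.
    """
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr

def merge_checkpoints(state_dicts: list[dict]) -> dict:
    """Uniformly average the parameters of several stable-phase checkpoints."""
    merged = {}
    for name in state_dicts[0]:
        merged[name] = torch.stack([sd[name].float() for sd in state_dicts]).mean(dim=0)
    return merged
```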

- ---

### Post-Training and Evo-CoT Optimization

Built upon mid-training reasoning activation, post-training adopts **Evo-CoT (Evolutionary Chain-of-Thought)** for progressive reasoning enhancement under controllable cost.

This approach continually expands the **Pareto frontier** of reasoning accuracy vs. efficiency—ideal for reflexive non-thinking models.

- For reinforcement learning, we introduce **LPO (Linguistics-Unit Policy Optimization)
+ For reinforcement learning, we introduce **LPO (Linguistics-Unit Policy Optimization)**—a novel sentence-level policy optimization method.

Unlike GRPO (token-level) or GSPO (sequence-level) algorithms, LPO treats *sentences* as the natural semantic action units, enabling precise alignment between rewards and reasoning behavior.

Empirically, LPO offers superior **training stability** and **generalization** across reasoning tasks.
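
The LPO objective itself is not reproduced in this card; the following sketch only illustrates the core idea of making the *sentence* the action unit, using an assumed PPO/GRPO-style clipped loss in which per-token log-probabilities are first summed within each sentence. Sentence segmentation, advantage estimation, and the clipping range are all assumptions.

```python
# Illustrative sketch of sentence-level policy optimization in the spirit of LPO.
# Segmentation, advantage handling, and clipping values are assumptions,
# not the published LPO objective.
import torch

def sentence_level_loss(logp_new: torch.Tensor,      # [T] per-token log-probs, current policy
                        logp_old: torch.Tensor,      # [T] per-token log-probs, behavior policy
                        sentence_ids: torch.Tensor,  # [T] long tensor: sentence index of each token
                        advantage: torch.Tensor,     # [S] one advantage per sentence
                        clip_eps: float = 0.2) -> torch.Tensor:
    """Clipped policy-gradient loss where the action unit is a sentence.

    Token log-probs are summed within each sentence (GRPO would use tokens
    individually; GSPO would sum over the whole sequence), so the importance
    ratio and the clipping are applied once per sentence.
    """
    num_sentences = int(sentence_ids.max().item()) + 1
    # Aggregate per-token log-probs into per-sentence log-probs.
    sent_logp_new = torch.zeros(num_sentences).index_add_(0, sentence_ids, logp_new)
    sent_logp_old = torch.zeros(num_sentences).index_add_(0, sentence_ids, logp_old)
    ratio = torch.exp(sent_logp_new - sent_logp_old)          # [S] importance ratio per sentence
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps)
    # Standard pessimistic (clipped) objective, averaged over sentences.
    return -torch.min(ratio * advantage, clipped * advantage).mean()
```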

- ---

## Evaluation