Add paper link to model card (#5)
- Add paper link to model card (797ecea8a3e604c7070e169fc1354b49fce51349)
Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>
README.md CHANGED

@@ -1,15 +1,15 @@
 ---
-license: apache-2.0
-tags:
-- finetuned
-- chat
 language:
 - en
 - ko
 - ja
 - zh
-pipeline_tag: text-generation
 library_name: transformers
+license: apache-2.0
+pipeline_tag: text-generation
+tags:
+- finetuned
+- chat
 ---
 
 # Trillion-7B-preview
@@ -22,7 +22,7 @@ library_name: transformers
 
 ## Introduction
 
-We introduce Trillion-7B-preview, a preview of our latest large language model designed to push the boundaries of multilingual scalability and performance.
+We introduce Trillion-7B-preview, a preview of our latest large language model designed to push the boundaries of multilingual scalability and performance. This model is presented in the paper: [Trillion-7B-preview](https://huggingface.co/papers/2504.15431).
 
 
 When comparing performance to training FLOPs for Trillion-7B-preview with competitive models, our model pushes the Pareto frontier, achieving around 66.5% average performance while using significantly fewer compute (~9.3×10²² FLOPs). It outperforms models like Mistral-7B-Instruct-v0.3 and SOLAR-10.7B-Instruct-v1.0 while remaining competitive with models requiring 3-8× more compute such as Qwen2.5-7B-Instruct and EXAONE-3.5-7.8B-Instruct. For full benchmark results, see tables below.
@@ -240,4 +240,4 @@ This model repository is licensed under the Apache-2.0 License.
 }
 ```
 ## Contact
-For inquiries, please contact: info@trillionlabs.co
+For inquiries, please contact: info@trillionlabs.co
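For context, the reordered front matter above keeps `library_name: transformers` and `pipeline_tag: text-generation`, which is what drives how the Hub loads the model. Below is a minimal sketch of loading it that way; the repo id `trillionlabs/Trillion-7B-preview` is an assumption inferred from the model name, since the diff does not state it.

```python
# Minimal sketch, assuming the repo id "trillionlabs/Trillion-7B-preview";
# the card metadata only declares library_name and pipeline_tag, not the repo id.
from transformers import pipeline

generator = pipeline(
    "text-generation",                        # matches pipeline_tag in the front matter
    model="trillionlabs/Trillion-7B-preview", # assumed repo id
)

print(generator("Hello, who are you?", max_new_tokens=64)[0]["generated_text"])
```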