launch/ThinkPRM-14B

Add link to code and library name

#2

by nielsr HF Staff - opened about 22 hours ago

base: refs/heads/main ← from: refs/pr/2

Files changed (1)

README.md +2 -3

@@ -1,5 +1,7 @@
 ---
 library_name: transformers
+license: apache-2.0
+pipeline_tag: text-generation
 tags:
 - reward-model
 - prm
@@ -9,8 +11,6 @@ tags:
 - verification
 - math reasoning
 - code verification
-license: apache-2.0
-pipeline_tag: text-generation
 ---
 # Model Card for ThinkPRM-14B
@@ -19,7 +19,6 @@ ThinkPRM-14B is a generative Process Reward Model (PRM) based on the R1-Distill-
 Here's an example of the model output:
 ## Model Details
 ### Model Description
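
The metadata change in this diff only moves `license` and `pipeline_tag` above `tags`. YAML mappings are unordered, so the parsed front matter should be the same before and after. A minimal sketch to check this, assuming a hypothetical stdlib-only helper `parse_front_matter` (written for this flat block, not a general YAML parser and not part of any library) and including only the tags visible in the diff:

```python
def parse_front_matter(text):
    """Parse a flat front matter block: 'key: value' scalars and '- item' lists.

    Hypothetical helper for illustration; handles only the shapes seen in this diff.
    """
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("front matter must start with '---'")
    meta = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the block
            break
        if not stripped:
            continue
        if stripped.startswith("- "):
            # A list item belongs to the most recent key that had no scalar value.
            if current_key is None or not isinstance(meta[current_key], list):
                raise ValueError("list item without a list key: " + stripped)
            meta[current_key].append(stripped[2:].strip())
        else:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            value = value.strip()
            meta[current_key] = value if value else []
    return meta


# Front matter before the change (tags elided by the diff are omitted here).
OLD = """---
library_name: transformers
tags:
- reward-model
- prm
- verification
- math reasoning
- code verification
license: apache-2.0
pipeline_tag: text-generation
---
"""

# Front matter after the change: same keys, different order.
NEW = """---
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
tags:
- reward-model
- prm
- verification
- math reasoning
- code verification
---
"""

old_meta = parse_front_matter(OLD)
new_meta = parse_front_matter(NEW)

# Dict equality ignores key order, so this compares content only.
print("same metadata:", old_meta == new_meta)
print("same key order:", list(old_meta) == list(new_meta))
```

For real model cards, a full YAML parser would be the safer choice; the hand-rolled parser here only keeps the sketch dependency-free.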