Efficient Reasoning via Decoupled Reward Policy Optimization
Gang Li
ganglii
AI & ML interests: None yet
Recent Activity
published a dataset about 1 month ago: ganglii/OpenMathReasoning
updated a dataset about 1 month ago: ganglii/OpenMathReasoning
updated a model about 1 month ago: ganglii/DRPO-7B
Organizations: None yet