Rafael Coelho de Souza Krzonkalla

krzonkalla

AI & ML interests

None yet

Recent Activity

updated a model about 18 hours ago

krzonkalla/Rio_2_14B

updated a model 17 days ago

krzonkalla/rio_2.0_nothink_exp

updated a model 17 days ago

krzonkalla/rio_2.0_detective_exp

Organizations

None yet

updated a model about 18 hours ago

krzonkalla/Rio_2_14B

Text Generation • 15B • Updated about 18 hours ago • 683 • 1

updated 3 models 17 days ago

upvoted a paper 17 days ago

Reasoning with Sampling: Your Base Model is Smarter Than You Think

Paper • 2510.14901 • Published 29 days ago • 47

updated 2 models 21 days ago

krzonkalla/rio-2-video-vl

Video-Text-to-Text • 849k • Updated 21 days ago • 24

krzonkalla/rio-2-ocr

Image-to-Text • 8B • Updated 21 days ago • 33 • 1

upvoted a paper 21 days ago

Every Question Has Its Own Value: Reinforcement Learning with Explicit Human Values

Paper • 2510.20187 • Published 22 days ago • 18

upvoted 4 papers 22 days ago

Accelerating Vision Transformers with Adaptive Patch Sizes

Paper • 2510.18091 • Published 25 days ago • 4

Unified Reinforcement and Imitation Learning for Vision-Language Models

Paper • 2510.19307 • Published 23 days ago • 26

olmOCR 2: Unit Test Rewards for Document OCR

Paper • 2510.19817 • Published 23 days ago • 13

BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping

Paper • 2510.18927 • Published 24 days ago • 82

upvoted a paper 23 days ago

Extracting alignment data in open models

Paper • 2510.18554 • Published 24 days ago • 8

updated a model 24 days ago

krzonkalla/test-voice-nano

Updated 24 days ago • 11

published a model 24 days ago

krzonkalla/test-voice-nano

Updated 24 days ago • 11

liked a model 27 days ago

krzonkalla/Rio_2_14B

Text Generation • 15B • Updated about 18 hours ago • 683 • 1

upvoted a paper 27 days ago

Agentic Entropy-Balanced Policy Optimization

Paper • 2510.14545 • Published 29 days ago • 102

upvoted a paper 29 days ago

The Art of Scaling Reinforcement Learning Compute for LLMs

Paper • 2510.13786 • Published 30 days ago • 30

updated a model 29 days ago

krzonkalla/Rio_2_14B

Text Generation • 15B • Updated about 18 hours ago • 683 • 1

upvoted a paper 29 days ago

Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization

Paper • 2510.13554 • Published 30 days ago • 56
