Predict human preference for LLM responses.
Binfeng Xu
billxbf
AI & ML interests
evolving back to apes
Recent Activity
updated a model 5 days ago
billxbf/qwen3.5-4b-pi-polar
published a model 5 days ago
billxbf/qwen3.5-4b-pi-polar
updated a model 8 days ago
billxbf/qwen3.5-4b-opencode-polar
Organizations
models 21
billxbf/qwen3.5-4b-pi-polar
4B • Updated • 17
billxbf/qwen3.5-4b-opencode-polar
4B • Updated • 60
billxbf/qwen3.5-4b-qwencode-polar
4B • Updated • 109
billxbf/qwen3.5-4b-claudecode-polar
4B • Updated • 13
billxbf/qwen3.5-4b-codex-polar-step72
Reinforcement Learning • 5B • Updated • 23
billxbf/zephyr-7b-dpo-iter1
Text Generation • 274k • Updated • 1
billxbf/zephyr-7b-dpo-iter3
Text Generation • 266k • Updated • 3
billxbf/zephyr-7b-dpo-iter2
Text Generation • 266k • Updated
billxbf/Nano-Raccoon-Preview-1104
425k • Updated • 2
billxbf/zephyr-7b-sft-iter3
Text Generation • 266k • Updated
datasets 20
billxbf/math_pile_v3
Viewer • Updated • 1.52M • 91
billxbf/ultrafeedback-dpo-iter3
Viewer • Updated • 20.4k • 24
billxbf/ultrafeedback-dpo-iter1
Viewer • Updated • 20.4k • 5
billxbf/ultrafeedback-dpo-iter2
Viewer • Updated • 20.4k • 5
billxbf/ultrafeedback-sft-iter3
Viewer • Updated • 20.4k • 7
billxbf/ultrafeedback-sft-iter2
Viewer • Updated • 20.4k • 3
billxbf/ultrafeedback-sft-iter1
Viewer • Updated • 20.4k • 10
billxbf/verified100-chitchat
Viewer • Updated • 100 • 6
billxbf/verified100-lite
Viewer • Updated • 100 • 16
billxbf/verified100
Viewer • Updated • 100 • 6