Gaotang Li

gaotang

https://gaotangli.github.io/

GaotangLi

AI & ML interests

None yet

Recent Activity

new activity 20 days ago

gaotang/ParaConfilct:Add task category and link to code

updated a dataset 21 days ago

gaotang/ParaConfilct

upvoted a paper 26 days ago

MIRIX: Multi-Agent Memory System for LLM-Based Agents

Organizations

None yet

Collections 2

Papers 3

arxiv:2506.06444

arxiv:2505.02387

arxiv:2503.10996

Models 10

Datasets 28

gaotang/ParaConfilct

Viewer • Updated 20 days ago • 2.15k • 128

gaotang/RM-R1-Reasoning-RLVR

Viewer • Updated May 20 • 73k • 53

gaotang/RM-R1-Entire-RLVR-Train

Viewer • Updated May 20 • 73k • 78 • 2

gaotang/RM-R1-after-Distill-RLVR

Viewer • Updated May 20 • 64.2k • 127 • 1

gaotang/RM-R1-Distill-SFT

Viewer • Updated May 20 • 8.75k • 130 • 2

gaotang/filtered_sky_code_8k_math_10k_rubric_evidence_classify_weight_rest_0417

Viewer • Updated Apr 17 • 64.2k • 11

gaotang/filtered_sky_code_8k_math_10k_rubric_evidence_classify_weight

Viewer • Updated Apr 17 • 73k • 6

gaotang/filtered_sky_code_8k_math_10k_rubric_reasoning

Viewer • Updated Apr 14 • 73k • 28

gaotang/filtered_sky_code_8k_math_10k_rubric_sft

Viewer • Updated Apr 11 • 73k • 8

gaotang/filtered_sky_code_8k_math_10k_rubric_evidence_classify

Viewer • Updated Apr 11 • 73k • 7

Collections 2

RM-R1: Reward Modeling as Reasoning

gaotang/RM-R1-Entire-RLVR-Train

gaotang/RM-R1-Reasoning-RLVR

gaotang/RM-R1-Distill-SFT

gaotang/ParaConfilct

Taming Knowledge Conflicts in Language Models

Models 10

gaotang/RM-R1-DeepSeek-Distilled-Qwen-7B

gaotang/RM-R1-Qwen2.5-Instruct-7B

gaotang/RM-R1-DeepSeek-Distilled-Qwen-14B

gaotang/RM-R1-Qwen2.5-Instruct-14B

gaotang/RM-R1-Qwen2.5-Instruct-32B

gaotang/RM-R1-DeepSeek-Distilled-Qwen-32B

gaotang/qwen_7b_sky_filtered_code8k_math_10k_distilled_Claude_o3_0419

gaotang/qwen_7b_sky_filtered_code8k_math_10k_distilled_OpenAI

gaotang/qwen_14b_sky_filtered_code8k_math_10k_distilled_OpenAI

gaotang/qwen2.5_14B_LR1.0e-6_evidence_rubric_4k2k_separate_reward_function
