Gaetan Lopez

gaetanlop

AI & ML interests

None yet

Organizations

None yet

upvoted a paper 2 days ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published 13 days ago • 267

upvoted an article 26 days ago

SmolLM3: smol, multilingual, long-context reasoner

Article • and 22 others • 29 days ago • 611

upvoted an article about 1 month ago

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Article • Feb 7 • 196

upvoted an article about 2 months ago

KV Cache from scratch in nanoVLM

Article • and 4 others • Jun 4 • 89

upvoted 2 articles 3 months ago

Gotchas in Tokenizer Behavior Every Developer Should Know

Article • Apr 18 • 40

Improving Hugging Face Training Efficiency Through Packing with Flash Attention

Article • and 5 others • Aug 21, 2024 • 39

upvoted an article 4 months ago

🦸🏻#14: What Is MCP, and Why Is Everyone – Suddenly!– Talking About It?

Article • Mar 17 • 324

upvoted 2 articles 5 months ago

Open R1: Update #3

Article • and 9 others • Mar 11 • 295

Introducing EuroBERT: A High-Performance Multilingual Encoder Model

Article • and 3 others • Mar 10 • 146

upvoted 6 articles 6 months ago

Process Reinforcement through Implicit Rewards

Article • and 1 other • Jan 3 • 29

SmolLM - blazingly fast and remarkably powerful

Article • and 2 others • Jul 16, 2024 • 403

1 Billion Classifications

Article • Feb 13 • 43

Open-R1: Update #1

Article • and 7 others • Feb 2 • 305

Open-R1: a fully open reproduction of DeepSeek-R1

Article • and 2 others • Jan 28 • 876

The SOTA Text-to-speech and Zero Shot Voice cloning model that no one knows about...

Article • Jan 20 • 69

upvoted a paper 7 months ago

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published Jan 13 • 100

upvoted an article 10 months ago

A Complete Guide to Audio Datasets

Article • Dec 15, 2022 • 38

upvoted a paper 10 months ago

Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models

Paper • 2410.07985 • Published Oct 10, 2024 • 33

updated 2 datasets 10 months ago

gaetanlop/openai-prm800k-15k-stage2-conversational

Viewer • Updated Oct 13, 2024 • 17.7k • 13

gaetanlop/openai-prm800k-15k-stage2

Viewer • Updated Oct 13, 2024 • 17.7k • 12
