AlphaSue

AI & ML interests

None yet

Recent Activity

new activity 7 days ago

tokyotech-llm/swallow-math: Why the data only has answers without questions?

Organizations

None yet

upvoted an article 5 days ago

Open-source DeepResearch – Freeing our search agents

Article • and 4 others • Feb 4 • 1.28k

upvoted a collection about 2 months ago

Whisper

Collection

OpenAI Whisper speech recognition models in MLX format • 48 items • Updated Oct 1, 2024 • 51

upvoted an article 3 months ago

Vision Language Models (Better, Faster, Stronger)

Article • and 4 others • May 12 • 495

upvoted a collection 4 months ago

ProX Refining Models

Collection

Adapted small language models used to generate data refining programs • 5 items • Updated Oct 10, 2024 • 4

upvoted 3 papers 4 months ago

How Instruction and Reasoning Data shape Post-Training: Data Quality through the Lens of Layer-wise Gradients

Paper • 2504.10766 • Published Apr 14 • 40

xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

Paper • 2504.10481 • Published Apr 14 • 84

Understanding R1-Zero-Like Training: A Critical Perspective

Paper • 2503.20783 • Published Mar 26 • 57

upvoted an article 4 months ago

Open R1: Update #3

Article • and 9 others • Mar 11 • 295

upvoted a paper 4 months ago

Modifying Large Language Model Post-Training for Diverse Creative Writing

Paper • 2503.17126 • Published Mar 21 • 37

upvoted a paper 5 months ago

Organize the Web: Constructing Domains Enhances Pre-Training Data Curation

Paper • 2502.10341 • Published Feb 14 • 3

upvoted an article 6 months ago

Mixture of Experts Explained

Article • and 5 others • Dec 11, 2023 • 800

upvoted a collection 7 months ago

Papers I've read

Collection

16 items • Updated Jan 12 • 6

upvoted a paper 9 months ago

JudgeBench: A Benchmark for Evaluating LLM-based Judges

Paper • 2410.12784 • Published Oct 16, 2024 • 49

upvoted an article over 1 year ago

Large-scale Near-deduplication Behind BigCode

Article • May 16, 2023 • 31
