Anton Lozhkov

anton-l

AI & ML interests

Generative Models, Distributed Training, Photo and Video Enhancement

Recent Activity

new activity 5 days ago

anton-l/superb_demo: Convert dataset to Parquet

new activity 6 days ago

anton-l/superb_dummy: Convert dataset to Parquet

upvoted a collection 4 months ago

OpenR1-Math

Dataset and SFT model distilled from DeepSeek-R1. Check out our blog post for more details: https://huggingface.co/blog/open-r1/update-2 • 3 items • Updated May 13 • 9

upvoted an article 5 months ago

Open R1: Update #2

Article • By … and 6 others • Feb 10 • 214

upvoted a paper 5 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 235

upvoted a collection 6 months ago

📐 FineMath

FineMath datasets and ablation models • 14 items • Updated May 5 • 20

upvoted a paper 10 months ago

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22, 2024 • 132

upvoted an article 11 months ago

SmolLM - blazingly fast and remarkably powerful

Article • By … and 2 others • Jul 16, 2024 • 380

upvoted an article 12 months ago

Ethics and Society Newsletter #6: Building Better AI: The Importance of Data Quality

Article • By … and 9 others • Jun 24, 2024 • 34

upvoted a paper 12 months ago

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Paper • 2406.17557 • Published Jun 25, 2024 • 98

upvoted a collection about 1 year ago

📀 Dataset comparison models

1.8B models trained on 350BT to compare different pretraining datasets • 8 items • Updated Jun 12, 2024 • 39

upvoted 2 papers over 1 year ago

StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29, 2024 • 147

Zephyr: Direct Distillation of LM Alignment

Paper • 2310.16944 • Published Oct 25, 2023 • 122

upvoted a paper about 2 years ago

The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only

Paper • 2306.01116 • Published Jun 1, 2023 • 35