Thomas Wolf PRO

thomwolf

https://thomwolf.io

AI & ML interests

NLP and open-source :-)

Recent Activity

liked a Space 3 days ago

cfahlgren1/org-activity-heatmap

liked a dataset 6 days ago

zello/zello-public-channels-voice-sample

liked a Space 7 days ago

Agents-MCP-Hackathon/ShallowCodeResearch

authored a paper 23 days ago

SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics

Paper • 2506.01844 • Published 23 days ago • 100

authored 2 papers 3 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 191

YourBench: Easy Custom Evaluation Sets for Everyone

Paper • 2504.01833 • Published Apr 2 • 21

authored 2 papers 5 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 235

Towards Best Practices for Open Datasets for LLM Training

Paper • 2501.08365 • Published Jan 14 • 64

authored a paper 12 months ago

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Paper • 2406.17557 • Published Jun 25, 2024 • 98

authored 13 papers over 1 year ago

StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29, 2024 • 147

The Stack: 3 TB of permissively licensed source code

Paper • 2211.15533 • Published Nov 20, 2022 • 5

BigScience: A Case Study in the Social Construction of a Multilingual Large Language Model

Paper • 2212.04960 • Published Dec 9, 2022 • 1

Grounding Large Language Models in Interactive Environments with Online Reinforcement Learning

Paper • 2302.02662 • Published Feb 6, 2023 • 1

TLDR: Token Loss Dynamic Reweighting for Reducing Repetitive Utterance Generation

Paper • 2003.11963 • Published Mar 26, 2020

Training Transformers Together

Paper • 2207.03481 • Published Jul 7, 2022 • 5

GAIA: a benchmark for General AI Assistants

Paper • 2311.12983 • Published Nov 21, 2023 • 220

FinGPT: Large Generative Models for a Small Language

Paper • 2311.05640 • Published Nov 3, 2023 • 32

Distributed Deep Learning in Open Collaborations

Paper • 2106.10207 • Published Jun 18, 2021 • 2

Learning from others' mistakes: Avoiding dataset biases without modeling them

Paper • 2012.01300 • Published Dec 2, 2020

A Hierarchical Multi-task Approach for Learning Embeddings from Semantic Tasks

Paper • 1811.06031 • Published Nov 14, 2018

Movement Pruning: Adaptive Sparsity by Fine-Tuning

Paper • 2005.07683 • Published May 15, 2020

Zephyr: Direct Distillation of LM Alignment

Paper • 2310.16944 • Published Oct 25, 2023 • 122

authored a paper about 2 years ago

Scaling Data-Constrained Language Models

Paper • 2305.16264 • Published May 25, 2023 • 17