Arthur Zucker

ArthurZ

AI & ML interests

None yet

Recent Activity

liked a Space 2 days ago

The Ultra-Scale Playbook

The ultimate guide to training LLM on large GPU Clusters

liked a model 28 days ago

deepseek-ai/DeepSeek-R1-0528-Qwen3-8B

Text Generation • Updated 28 days ago • 455k • 798

liked a model about 1 month ago

rednote-hilab/dots.llm1.base

Text Generation • Updated about 14 hours ago • 1.22k • 53

upvoted a changelog about 1 month ago

Changelog

Static Spaces can now have a build step

May 23

• 105

liked a Space about 1 month ago

Beam Search Visualizer

View how beam search decoding works, in detail!

New activity in mistralai/Devstral-Small-2505 about 1 month ago

Adding transformers tag for better tracking of library

#2 opened about 1 month ago by

upvoted an article about 1 month ago

Article

Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance

By

and 5 others •

May 21

• 28

liked a model about 1 month ago

meta-llama/Llama-3.3-70B-Instruct

Text Generation • Updated Dec 21, 2024 • 952k • 2.4k

New activity in meta-llama/Llama-4-Scout-17B-16E-Instruct about 1 month ago

No attribute `sliding_window`?

#59 opened 3 months ago by

Does LLama4 have chunked attention in generation phase ?

#64 opened 2 months ago by

liked a Space about 1 month ago

Support

Display recent discussions and releases about Transformers

published an article about 1 month ago

Article

The Transformers Library: standardizing model definitions

By

and 3 others •

May 15

• 114

liked a model about 1 month ago

kernels-community/paged-attention

Updated May 6 • 1

liked a dataset about 1 month ago

openai/gsm8k

Viewer • Updated Jan 4, 2024 • 17.6k • 513k • 780

updated a model about 2 months ago

ArthurZ/Dia-1.6B

Updated May 8 • 25 • 1

liked a model about 2 months ago

ArthurZ/Dia-1.6B

Updated May 8 • 25 • 1

published a model about 2 months ago

ArthurZ/Dia-1.6B

Updated May 8 • 25 • 1

liked a Space about 2 months ago

Dia 1.6B

Generate realistic dialogue from a script, using Dia!

liked 2 models about 2 months ago

nvidia/parakeet-tdt-0.6b-v2

Automatic Speech Recognition • Updated May 22 • 965k • 1.16k

allenai/OLMo-2-0425-1B

Text Generation • Updated 28 days ago • 36.8k • 48