Article: Transformers backend integration in SGLang • By marcsun13 and 4 others • 3 days ago
Article: (LoRA) Fine-Tuning FLUX.1-dev on Consumer Hardware • By derekl35 and 4 others • 7 days ago
Article: Fine-tuning Llama 2 70B using PyTorch FSDP • By smangrul and 3 others • Sep 13, 2023
Collection: Flux quantized checkpoints • The quantized Flux checkpoints used in the blog post https://huggingface.co/blog/diffusers-quantization • 5 items • Updated May 21
Article: The Transformers Library: standardizing model definitions • By lysandre and 3 others • May 15
Article: Exploring Quantization Backends in Diffusers • By derekl35 and 2 others • May 21
Article: Introducing AutoRound: Intel’s Advanced Quantization for LLMs and VLMs • By wenhuach and 8 others • Apr 29
Article: 🔥 Announcing FLUX-Juiced: The Fastest Image Generation Endpoint (2.6 times faster)! • By PrunaAI and 3 others • Apr 23
Collection: Gemma 3 QAT • Quantization-Aware Trained (QAT) Gemma 3 checkpoints. These models preserve quality comparable to half precision while using 3x less memory • 15 items • Updated 27 days ago
Article: Memory-efficient Diffusion Transformers with Quanto and Diffusers • By sayakpaul and 1 other • Jul 30, 2024
Article: Welcome Llama 4 Maverick & Scout on Hugging Face! • By burtenshaw and 6 others • Apr 5
Article: NVIDIA's GTC 2025 Announcement for Physical AI Developers: New Open Models and Datasets • By mingyuliutw and 4 others • Mar 18
Article: Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM • By ariG23498 and 3 others • Mar 12
Article: LLM Inference on Edge: A Fun and Easy Guide to run LLMs via React Native on your Phone! • By medmekk and 1 other • Mar 7
Paper: LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models • arXiv:2310.08659 • Published Oct 12, 2023
Article: Fine-tuning LLMs to 1.58bit: extreme quantization made easy • By medmekk and 5 others • Sep 18, 2024