Article Open Responses: What you need to know +2 evalstate, burtenshaw, merve, pcuenq • Jan 15 • 111
Article Building Deep Research: How we Achieved State of the Art Tavily • Nov 24, 2025 • 36
Article ChatML vs Harmony: Understanding the new Format from OpenAI 🔍 kuotient • Aug 9, 2025 • 57
Article Remote VAEs for decoding with Inference Endpoints 🤗 hlky, sayakpaul • Feb 24, 2025 • 41
Article Welcome to Inference Providers on the Hub 🔥 +5 burkaygur, zeke, aton2006, hassanelmghari, sbrandeis, kramp, julien-c • Jan 28, 2025 • 495
End-to-end speaker segmentation for overlap-aware resegmentation Paper • 2104.04045 • Published Apr 8, 2021 • 2
Training Datasets Collection A collection of pseudo-labelled datasets used to train the Distil-Whisper model. • 9 items • Updated Mar 21, 2024 • 14
Article Llama 3.1 - 405B, 70B & 8B with multilinguality and long context +6 philschmid, osanseviero, alvarobartt, lvwerra, dvilasuero, reach-vb, marcsun13, pcuenq • Jul 23, 2024 • 241
Article Introducing TextImage Augmentation for Document Images +1 danaaubakirova, Molbap, Ternaus • Aug 6, 2024 • 33
Article Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA +3 ybelkada, timdettmers, artidoro, sgugger, smangrul • May 24, 2023 • 180
Article A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes ybelkada, timdettmers • Aug 17, 2022 • 132
Llama 3.1 GPTQ, AWQ, and BNB Quants Collection Optimised Quants for high-throughput deployments! Compatible with Transformers, TGI & VLLM 🤗 • 9 items • Updated Sep 26, 2024 • 57
Article TGI Multi-LoRA: Deploy Once, Serve 30 Models +1 derek-thomas, dmaniloff, drbh • Jul 18, 2024 • 63
SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions Paper • 2403.16627 • Published Mar 25, 2024 • 22
DBRX Collection DBRX is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. • 3 items • Updated Mar 27, 2024 • 96
Orca 2: Teaching Small Language Models How to Reason Paper • 2311.11045 • Published Nov 18, 2023 • 77