Community Blog & Articles

Community Articles

Towards Speed-of-Light Text Generation with Nemotron-Labs Diffusion Language Models

Specialization Beats Scale: A Strategic Variable Most AI Procurement Decisions Overlook

OlmoEarth v1.1: A more efficient family of Earth observation models

Introducing the Ettin Reranker Family

PaddleOCR 3.5: Running OCR and Document Parsing Tasks with a Transformers Backend

The Open Agent Leaderboard

Granite Embedding Multilingual R2: Open Apache 2.0 Multilingual Embeddings with 32K Context — Best Sub-100M Retrieval Quality

Unlocking asynchronicity in continuous batching

Building Blocks for Foundation Model Training and Inference on AWS

vLLM V0 to V1: Correctness Before Corrections in RL

Adding Benchmaxxer Repellant to the Open ASR Leaderboard

Granite 4.1 LLMs: How They’re Built

DeepInfra on Hugging Face Inference Providers 🔥

Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents

New: Articles from Team or Enterprise organizations will get promoted to the main section.

Community Blog & Articles

LeRobot Humanoid: An Open, Low-Cost, 3D-Printed Humanoid for Robot Learning

Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation

Software Forgets: Agent Traces Are the Memory

Training-Free Reasoning at 88.89% on GPQA Diamond: How Darwin Family Hit Frontier Scores Without a Single Gradient Step

Why Open Models Are the Only Sustainable Way to Teach AI

Uncensor any LLM with abliteration

KV Caching Explained: Optimizing Transformer Inference Efficiency

EMO: Pretraining mixture of experts for emergent modularity

Small Language Models (SLM): A Comprehensive Overview

A Guide to Reinforcement Learning Post-Training for LLMs: PPO, DPO, GRPO, and Beyond

Two Years of Local AI on a Laptop: When Open Models Outpaced Moore's Law

An experiment with attention.

Introduction to State Space Models (SSM)

How to run Gemini Nano locally in your browser

Efficient Deep Learning: A Comprehensive Overview of Optimization Techniques 👐 📚

Code a simple RAG from scratch

Mastering Tensor Dimensions in Transformers

Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment

Norm-Preserving Biprojected Abliteration

Deriving the PPO Loss from First Principles
