Hugging Face · AI Engineer

Real Time Video Diffusion on a Single GPU — Ziv Ilan, Nvidia

Jun 16, 2026 · 18:46

Nvidia's Ziv Ilan explains how combining quantization, caching, and step distillation enables near real-time video diffusion on a single Blackwell B200 GPU. In work with Black Forest Labs on Flux 2, dynamic quantization reduces memory and compute, caching skips redundant denoising steps, and distillation cuts the number of denoising steps from fifty to as few as one. The open-source FastGen repo packages these post-training and sharding techniques, which together achieve 10–200x speedups for real-time generation.
