Google DeepMind releases DiffusionGemma, a 26B-parameter text diffusion model that is 4x faster than autoregressive generation
DiffusionGemma generates text through parallel diffusion rather than sequential token-by-token decoding, reaching more than 1,000 tokens/sec on H100 GPUs at some cost to output quality.
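
To make the parallel-versus-sequential distinction concrete, here is a minimal toy sketch in Python. It is not DiffusionGemma's algorithm or API: the function names, the `[MASK]` placeholder, and the `tokens_per_step` parameter are all hypothetical, and the "denoiser" simply copies from a known target instead of running a model. It only illustrates why filling in several positions per forward pass takes fewer passes than emitting one token per pass; the pass counts are arbitrary and say nothing about the reported speedup.

```python
import math

MASK = "[MASK]"  # hypothetical placeholder for a not-yet-generated position


def autoregressive_passes(num_tokens: int) -> int:
    """Sequential decoding: one forward pass per generated token."""
    return num_tokens


def parallel_passes(num_tokens: int, tokens_per_step: int) -> int:
    """Parallel decoding: each pass finalizes up to `tokens_per_step` positions."""
    return math.ceil(num_tokens / tokens_per_step)


def toy_parallel_decode(target: list[str], tokens_per_step: int) -> tuple[list[str], int]:
    """Start fully masked and reveal a batch of positions on every pass.

    A real diffusion model would predict the tokens; this toy copies them
    from `target` so that only the pass count is being illustrated.
    """
    seq = [MASK] * len(target)
    passes = 0
    while MASK in seq:
        masked = [i for i, tok in enumerate(seq) if tok == MASK]
        for i in masked[:tokens_per_step]:
            seq[i] = target[i]
        passes += 1
    return seq, passes


if __name__ == "__main__":
    target = "the quick brown fox jumps over the dog".split()
    decoded, passes = toy_parallel_decode(target, tokens_per_step=3)
    print("sequential passes:", autoregressive_passes(len(target)))
    print("parallel passes:  ", passes)
    print("decoded:", " ".join(decoded))
```

In practice the trade-off noted in the summary shows up here as well: finalizing more positions per pass reduces the number of passes, but a real model has less context for each position it commits to in the same pass, which is one plausible source of the quality cost.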