Xun Huang

Senior Research Scientist, Adobe Research
Visiting Professor, CMU
Pittsburgh, PA

Email: xuhuang at adobe dot com

Google Scholar | Twitter/X | GitHub | Selected Publications | Teaching

My name is Xun Huang (pronounced /shuun hwang/). I am a Senior Research Scientist at Adobe and also an adjunct professor at CMU. Prior to joining Adobe, I was a researcher at NVIDIA working on large-scale foundation models for visual generative AI. I obtained my PhD from the Department of Computer Science at Cornell University, advised by Professor Serge Belongie. During my PhD, my research was supported by the Adobe Research Fellowship (2019), the Snap Research Fellowship (2019), and the NVIDIA Graduate Fellowship (2018).

My research interests include:

Selected Publications

CausVid

From Slow Bidirectional to Fast Autoregressive Video Diffusion Models

arXiv 2024

Tianwei Yin*, Qiang Zhang*, Richard Zhang, William T. Freeman, Fredo Durand, Eli Shechtman, Xun Huang

[arXiv] [Project]

A video generation model that is very fast (~1s initial latency and ~10 FPS real-time streaming generation on a single GPU) and very high-quality (ranked 1st on VBench).

Magic3D

Magic3D: High-Resolution Text-to-3D Content Creation

CVPR 2023 (Highlight)

Chen-Hsuan Lin*, Jun Gao*, Luming Tang*, Towaki Takikawa*, Xiaohui Zeng*, Xun Huang, Karsten Kreis, Sanja Fidler, Ming-Yu Liu, Tsung-Yi Lin

[arXiv] [Project] [Video]

The foundation of NVIDIA's text-to-3D generative models.

eDiff-I

eDiff-I: Text-to-Image Diffusion Models with Ensemble of Expert Denoisers

arXiv 2022

Yogesh Balaji, Seungjun Nah, Xun Huang, Arash Vahdat, Jiaming Song, Qinsheng Zhang, Karsten Kreis, Miika Aittala, Timo Aila, Samuli Laine, Bryan Catanzaro, Tero Karras, Ming-Yu Liu

[arXiv] [Project] [Video]

The foundation of NVIDIA's large-scale text-to-image generative models.

PoE-GANs

Multimodal Conditional Image Synthesis with Product-of-Experts GANs

ECCV 2022

Xun Huang, Arun Mallya, Ting-Chun Wang, Ming-Yu Liu

[arXiv] [Project] [Video] [Two Minute Papers]

Also known as "GauGAN2", one of the earliest AI demos that can create high-resolution images from text and other conditions.

PointFlow

PointFlow: 3D Point Cloud Generation with Continuous Normalizing Flows

ICCV 2019 (Oral)

Guandao Yang*, Xun Huang*, Zekun Hao, Ming-Yu Liu, Serge Belongie, Bharath Hariharan (*equal contribution)

[arXiv] [Code] [Video]

Formulating point cloud generation as modeling a "distribution of distributions".

MUNIT

Multimodal Unsupervised Image-to-Image Translation

ECCV 2018

Xun Huang, Ming-Yu Liu, Serge Belongie, Jan Kautz

[arXiv] [Code] [Video]

Cited 3,000+ times. One of the most influential papers in image-to-image translation.

AdaIN

Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization

ICCV 2017 (Oral)

Xun Huang, Serge Belongie

[arXiv] [Code]

Cited 5,000+ times. The canonical method for injecting global information into image/video/audio generative models.

SGAN

Stacked Generative Adversarial Networks

CVPR 2017

Xun Huang, Yixuan Li, Omid Poursaeed, John Hopcroft, Serge Belongie

[arXiv] [Code]

A pioneering work that trains generative models in the latent space (instead of the data space), a paradigm widely adopted in modern generative models.

* indicates equal contribution.
See Google Scholar for the full list of publications.

Teaching