Yuyang Zhao

I am a Ph.D. candidate in the Computer Vision and Robotic Perception (CVRP) Laboratory at the National University of Singapore, under the supervision of A/P Gim Hee Lee, and I am currently interning at GenAI, Microsoft. I received my B.E. degree from Tianjin University in 2020.

My research interests lie in AIGC and generalizable computer vision systems. At present, I am working on controllable 3D and video generation.



I’m on the job market and looking for a Research Scientist/Engineer position starting in the summer of 2024. Feel free to reach out if you have any openings!

News
  • [April 2024] Check out our new 3D representation and generation framework: X-Ray!
  • [November 2023] Animate124 is released!
  • [September 2023] Two papers about visual domain generalization and parameter-efficient fine-tuning are accepted to IJCV!
  • [May 2023] Make-A-Protagonist is released! This is my first step into AIGC.
  • [January 2023] I received the Research Achievement Award from NUS!
  • [September 2022] Our AdvStyle is accepted to NeurIPS 2022!
  • [July 2022] I received the Outstanding Reviewer Award at ICML 2022 (Top 10%)!
  • [July 2022] One paper about domain generalized semantic segmentation is accepted to ECCV 2022!
  • [May 2022] One paper about open compound domain adaptation is accepted to IEEE TCSVT!
  • [March 2022] One paper about novel class discovery is accepted to CVPR 2022!
  • [November 2021] One paper about optical flow estimation is accepted to Neurocomputing.
  • [September 2021] One paper about optical flow estimation is accepted to Signal Processing: Image Communication.
  • [March 2021] One paper about domain generalized person re-identification is accepted to CVPR 2021!

Featured Works
Animate124: Animating One Image to 4D Dynamic Scene
Yuyang Zhao, Zhiwen Yan, Enze Xie, Lanqing Hong, Zhenguo Li, Gim Hee Lee

The first work to animate a single in-the-wild image into 3D video through textual motion descriptions.

Segment Any 3D Object with Language
Seungjun Lee*, Yuyang Zhao*, Gim Hee Lee
(* Equal contribution)

SOLE is a highly generalizable open-vocabulary instance segmentor that can segment the corresponding instances given various language instructions.

X-Ray: A Sequential 3D Representation for Generation
Tao Hu, Wenhang Ge, Yuyang Zhao, Gim Hee Lee

Make-A-Protagonist: Generic Video Editing with An Ensemble of Experts
Yuyang Zhao, Enze Xie, Lanqing Hong, Zhenguo Li, Gim Hee Lee

The first framework for generic video editing with both visual and textual clues. Make-A-Protagonist can achieve background editing, protagonist editing, and text-to-video editing with protagonist.

Style-Hallucinated Dual Consistency Learning: A Unified Framework for Visual Domain Generalization
Yuyang Zhao, Zhun Zhong, Na Zhao, Nicu Sebe, Gim Hee Lee
IJCV, 2023

Extension of our ECCV 2022 paper (SHADE). This paper applies SHADE to visual domain generalization tasks, including semantic segmentation with a Transformer backbone, image classification, and object detection.

Adversarial Style Augmentation for Domain Generalized Urban-Scene Segmentation
Zhun Zhong*, Yuyang Zhao*, Gim Hee Lee, Nicu Sebe
(* Equal contribution)
NeurIPS, 2022
PDF / Code

AdvStyle adversarially changes the channel-wise mean and standard deviation to diversify source samples.

Style-Hallucinated Dual Consistency Learning for Domain Generalized Semantic Segmentation
Yuyang Zhao, Zhun Zhong, Na Zhao, Nicu Sebe, Gim Hee Lee
ECCV, 2022

We introduce a dual consistency learning framework for domain generalized semantic segmentation, and propose a style hallucination module to generate pair-wise stylized samples.

Novel Class Discovery in Semantic Segmentation
Yuyang Zhao, Zhun Zhong, Nicu Sebe, Gim Hee Lee
CVPR, 2022

The first work to focus on novel class discovery in semantic segmentation. It addresses the co-occurrence of base, novel, and background classes.

Learning to Generalize Unseen Domains via Memory-based Multi-Source Meta-Learning for Person Re-Identification
Yuyang Zhao*, Zhun Zhong*, Fengxiang Yang, Zhiming Luo, Shaozi Li, Nicu Sebe
(* Equal contribution)
CVPR, 2021
PDF / Code

Other Publications / Preprints
SCT: A Simple Baseline for Parameter-Efficient Fine-Tuning via Salient Channels
Henry Hengyuan Zhao, Pichao Wang, Yuyang Zhao, Hao Luo, Fan Wang, Mike Zheng Shou
IJCV, 2023
PDF / Code

Two Heads Are Better Than One: Improving Fake News Video Detection by Correlating with Neighbors
Peng Qi, Yuyang Zhao, Yufeng Shen, Wei Ji, Juan Cao, Tat-Seng Chua
ACL Findings, 2023
PDF / Code

Source-Free Open Compound Domain Adaptation in Semantic Segmentation
Yuyang Zhao*, Zhun Zhong*, Zhiming Luo, Gim Hee Lee, Nicu Sebe
(* Equal contribution)
IEEE Transactions on Circuits and Systems for Video Technology, 2022
PDF / Code

Zero-shot Point Cloud Completion Via 2D Priors
Tianxin Huang, Zhiwen Yan, Yuyang Zhao, Gim Hee Lee
PDF / Code

Synthetic-to-Real Domain Generalized Semantic Segmentation for 3D Indoor Point Clouds
Yuyang Zhao, Na Zhao, Gim Hee Lee

The first work on domain generalized semantic segmentation in 3D indoor scenes.

Revisit Parameter-Efficient Transfer Learning: A Two-Stage Paradigm
Henry Hengyuan Zhao, Hao Luo, Yuyang Zhao, Pichao Wang, Fan Wang, Mike Zheng Shou
arXiv, 2023
PDF / Code

Professional Service
  • Program Committee / Conference Reviewer: CVPR, ICCV, ECCV, ICML, NeurIPS, ICLR

Awards
  • Research Achievement Award, National University of Singapore, 2023
  • Outstanding Reviewer Award, ICML, 2022
  • Research Scholarship, National University of Singapore, 2021
