I am a third-year PhD student in the IMPL Lab at the Singapore University of Technology and Design (SUTD), where I am fortunate to be supervised by Prof. Na Zhao. I am also supported by the Agency for Science, Technology and Research (A*STAR) SINGA scholarship and am grateful to be supervised by Prof. Xulei Yang. Prior to this, I obtained my Master’s degree from Zhejiang University and worked as a research intern at Alibaba DAMO Academy.

My research interests include multi-modal scene understanding for robotics and autonomous vehicles. Currently, I focus on building a comprehensive understanding of complex scenes by leveraging multi-modal inputs and transferring learned knowledge to address real-world challenges such as domain shift.

Open to work: I am actively looking for research internship and collaboration opportunities in embodied AI, autonomous driving, and multi-modal understanding.

🔥 News

2026.06: I will be attending CVPR 2026 and look forward to talking with you there!
2026.02: 🎉🎉🎉 PanDA and CCF have been accepted to CVPR 2026!
2026.01: Honored to receive the Best Oral Presentation Runner-up at Singapore ACM SIGKDD Symposium 2026!
2025.05: 🎉🎉🎉 IAL has been accepted to ICML 2025!
2025.05: My new homepage is now live!

📝 Publications

A full publication list is available on my Google Scholar page.

CVPR 2026

[CVPR 2026] PanDA: Unsupervised Domain Adaptation for Multimodal 3D Panoptic Segmentation in Autonomous Driving
Yining Pan, Shijie Li, Yuchen Wu, Xulei Yang, Na Zhao

PanDA studies unsupervised domain adaptation for multimodal 3D panoptic segmentation in autonomous driving.
It combines asymmetric multimodal drop and dual-expert pseudo-label refinement to improve robustness under domain shifts.

CVPR 2026

[CVPR 2026] CCF: Complementary Collaborative Fusion for Domain Generalized Multi-Modal 3D Object Detection
Yuchen Wu, Kun Wang, Yining Pan, Na Zhao
[Project page]

CCF targets domain-generalized multi-modal 3D object detection for autonomous driving.
It rebalances camera and LiDAR queries with query-decoupled loss, LiDAR-guided depth priors, and complementary cross-modal masking.

ICML 2025

[ICML 2025] How Do Images Align and Complement LiDAR? Towards a Harmonized Multi-modal 3D Panoptic Segmentation
Yining Pan, Qiongjie Cui, Xulei Yang, Na Zhao
[Project page]

This paper proposes the Image-Assists-LiDAR (IAL) model, which harmonizes LiDAR and images through synchronized augmentation, token fusion, and prior query generation.
IAL achieves state-of-the-art performance on 3D panoptic segmentation benchmarks, outperforming baseline methods by over 4%.

CVPR 2024

[CVPR 2024] InstructVideo: Instructing Video Diffusion Models with Human Feedback
H. Yuan, S. Zhang, X. Wang, Y. Wei, T. Feng, Yining Pan, Y. Zhang, Z. Liu, S. Albanie, D. Ni
[Project page]

InstructVideo is the first research attempt to instruct video diffusion models with human feedback.
InstructVideo significantly enhances the visual quality of generated videos without compromising generalization capabilities, with merely 0.1% of the parameters being fine-tuned.

ICCV 2023

[ICCV 2023] RLIPv2: Fast Scaling of Relational Language-Image Pre-training
H. Yuan, S. Zhang, X. Wang, S. Albanie, Yining Pan, T. Feng, J. Jiang, D. Ni, Y. Zhang, D. Zhao

RLIPv2 elevates RLIP by leveraging a new language-image fusion mechanism, designed for expansive data scales.

🎖 Honors and Awards

Best Oral Presentation Runner-up, Singapore ACM SIGKDD Symposium 2026
Singapore International Graduate Award (SINGA) from A*STAR
ZJU Graduate of Merit, Triple-A Graduate
National Undergraduate Electronics Design Contest, First Prize
China College Students’ ‘Internet+’ Innovation and Entrepreneurship Competition, First Prize
People's Scholarship, awarded for three consecutive years (Top 1%)