I am a third-year PhD student in the IMPL Lab at the Singapore University of Technology and Design (SUTD), where I am fortunate to be supervised by Prof. Na Zhao. I am also supported by the Agency for Science, Technology and Research (A*STAR) SINGA scholarship and am grateful to be supervised by Prof. Xulei Yang. Prior to this, I obtained my Master’s degree from Zhejiang University and worked as a research intern at Alibaba DAMO Academy.
My research interests include multi-modal scene understanding for robotics and autonomous vehicles. Currently, I focus on building a comprehensive understanding of complex scenes by leveraging multi-modal inputs and transferring learned knowledge to address real-world challenges such as domain shift.
Open to work: I am actively looking for research internship and collaboration opportunities in embodied AI, autonomous driving, and multi-modal understanding.
🔥 News
- 2026.06: I will be attending CVPR 2026 and look forward to talking with you!
- 2026.02: 🎉🎉🎉 PanDA and CCF have been accepted to CVPR 2026!
- 2026.01: Honored to receive the Best Oral Presentation Runner-up at Singapore ACM SIGKDD Symposium 2026!
- 2025.05: 🎉🎉🎉 IAL has been accepted to ICML 2025!
- 2025.05: My new homepage is now live!
📝 Publications
A full publication list is available on my Google Scholar page.

[CVPR 2026] PanDA: Unsupervised Domain Adaptation for Multimodal 3D Panoptic Segmentation in Autonomous Driving
Yining Pan, Shijie Li, Yuchen Wu, Xulei Yang, Na Zhao
- PanDA studies unsupervised domain adaptation for multimodal 3D panoptic segmentation in autonomous driving.
- It combines asymmetric multimodal drop and dual-expert pseudo-label refinement to improve robustness under domain shifts.

[CVPR 2026] CCF: Complementary Collaborative Fusion for Domain Generalized Multi-Modal 3D Object Detection
Yuchen Wu, Kun Wang, Yining Pan, Na Zhao
[Project page]
- CCF targets domain-generalized multi-modal 3D object detection for autonomous driving.
- It rebalances camera and LiDAR queries with query-decoupled loss, LiDAR-guided depth priors, and complementary cross-modal masking.

[ICML 2025] How Do Images Align and Complement LiDAR? Towards a Harmonized Multi-modal 3D Panoptic Segmentation
Yining Pan, Qiongjie Cui, Xulei Yang, Na Zhao
[Project page]
- This paper proposes the Image-Assists-LiDAR (IAL) model, which harmonizes LiDAR and images through synchronized augmentation, token fusion, and prior query generation.
- IAL achieves SOTA performance on 3D panoptic benchmarks, outperforming baseline methods by over 4%.

[CVPR 2024] InstructVideo: Instructing Video Diffusion Models with Human Feedback
H. Yuan, S. Zhang, X. Wang, Y. Wei, T. Feng, Yining Pan, Y. Zhang, Z. Liu, S. Albanie, D. Ni
[Project page]
- InstructVideo is the first research attempt to instruct video diffusion models with human feedback.
- InstructVideo significantly enhances the visual quality of generated videos without compromising generalization capabilities, with merely 0.1% of the parameters being fine-tuned.

[ICCV 2023] RLIPv2: Fast Scaling of Relational Language-Image Pre-training
H. Yuan, S. Zhang, X. Wang, S. Albanie, Yining Pan, T. Feng, J. Jiang, D. Ni, Y. Zhang, D. Zhao
- RLIPv2 elevates RLIP by leveraging a new language-image fusion mechanism, designed for expansive data scales.
🎖 Honors and Awards
- Best Oral Presentation Runner-up, Singapore ACM SIGKDD Symposium 2026
- Singapore International Graduate Award (SINGA) from A*STAR
- ZJU Graduate of Merit, Triple-A Graduate
- National Undergraduate Electronics Design Contest, First Prize
- China College Students’ ‘Internet+’ Innovation and Entrepreneurship Competition, First Prize
- Recipient of the People Scholarship for three consecutive years (Top 1%)