I am a second-year PhD student in the IMPL Lab at the Singapore University of Technology and Design (SUTD), where I am fortunate to be supervised by Prof. Na Zhao. I am supported by the Agency for Science, Technology and Research (A*STAR) SINGA scholarship and am grateful to also be supervised by Prof. Xulei Yang. Before this, I obtained my Master’s degree from Zhejiang University in 2023 and worked as a research intern at Alibaba DAMO Academy.

My research interests include multi-modal scene understanding and generation. Currently, I focus on building a comprehensive understanding of complex scenes by leveraging multi-modal features (e.g., LiDAR and RGB images). I am also interested in transferring learned knowledge to address real-world challenges such as domain shift.

🔥 News

  • 2025.05:  🎉🎉🎉 One paper was accepted to ICML 2025!
  • 2025.05:  My new homepage is now live!

📝 Publications

A full publication list is available on my Google Scholar page.

ICML 2025

[ICML 2025] How Do Images Align and Complement LiDAR? Towards a Harmonized Multi-modal 3D Panoptic Segmentation
Yining Pan, Qiongjie Cui, Xulei Yang, Na Zhao

  • This paper proposes the Image-Assists-LiDAR (IAL) model, which harmonizes LiDAR and images through synchronized augmentation, token fusion, and prior query generation.
  • IAL achieves state-of-the-art performance on 3D panoptic segmentation benchmarks, outperforming baseline methods by over 4%.
CVPR 2024

[CVPR 2024] InstructVideo: Instructing Video Diffusion Models with Human Feedback
H. Yuan, S. Zhang, X. Wang, Y. Wei, T. Feng, Yining Pan, Y. Zhang, Z. Liu, S. Albanie, D. Ni
[Project page]

  • InstructVideo is the first research attempt to instruct video diffusion models with human feedback.
  • InstructVideo significantly enhances the visual quality of generated videos without compromising generalization capabilities, with merely 0.1% of the parameters being fine-tuned.
ICCV 2023

[ICCV 2023] RLIPv2: Fast Scaling of Relational Language-Image Pre-training
H. Yuan, S. Zhang, X. Wang, S. Albanie, Yining Pan, T. Feng, J. Jiang, D. Ni, Y. Zhang, D. Zhao

  • RLIPv2 elevates RLIP by leveraging a new language-image fusion mechanism, designed for expansive data scales.

🎖 Honors and Awards

  • Singapore International Graduate Award (SINGA) from A*STAR.
  • ZJU Graduate of Merit, Triple-A Graduate.
  • National Undergraduate Electronics Design Contest, First Prize.
  • China College Students’ ‘Internet+’ Innovation and Entrepreneurship Competition, First Prize.
  • People Scholarship for three consecutive years (Top 1%).