📝 Publications
A full publication list is available on my Google Scholar page.
ICML 2025

[ICML 2025] How Do Images Align and Complement LiDAR? Towards a Harmonized Multi-modal 3D Panoptic Segmentation
Yining Pan, Qiongjie Cui, Xulei Yang, Na Zhao
[Project page]
- This paper proposes the Image-Assists-LiDAR (IAL) model, which harmonizes LiDAR and images through synchronized augmentation, token fusion, and prior query generation.
- IAL achieves state-of-the-art performance on 3D panoptic segmentation benchmarks, outperforming baseline methods by over 4%.
CVPR 2024

[CVPR 2024] InstructVideo: Instructing Video Diffusion Models with Human Feedback
H. Yuan, S. Zhang, X. Wang, Y. Wei, T. Feng, Yining Pan, Y. Zhang, Z. Liu, S. Albanie, D. Ni
[Project page]
- InstructVideo is the first research attempt to instruct video diffusion models with human feedback.
- InstructVideo significantly enhances the visual quality of generated videos without compromising generalization capabilities, while fine-tuning only 0.1% of the parameters.
ICCV 2023

[ICCV 2023] RLIPv2: Fast Scaling of Relational Language-Image Pre-training
H. Yuan, S. Zhang, X. Wang, S. Albanie, Yining Pan, T. Feng, J. Jiang, D. Ni, Y. Zhang, D. Zhao
- RLIPv2 elevates RLIP by leveraging a new language-image fusion mechanism, designed for expansive data scales.