Fanhu Zeng

I'm currently a Master's student at the Institute of Automation, Chinese Academy of Sciences (CASIA), supervised by Prof. Xu-Yao Zhang and Prof. Cheng-Lin Liu. My major is Pattern Recognition and Intelligent Systems.

Before that, I obtained my Bachelor's degree in Automation from Nanjing University of Aeronautics and Astronautics (NUAA).

My research interests lie in:
1. Trustworthy AI
  • Uncertainty Estimation
  • Out-of-distribution Detection
  • Continual Learning
2. Multimodal Large Language Models
  • Prompt Learning/LoRA
  • Downstream Application
3. Efficient AI
  • Model Merging
  • Token Compression/Model Acceleration
  • Image Compression

I am particularly interested in recent work that combines Multimodal Large Language Models with Trustworthy AI, such as hallucination and continual instruction tuning.

Email  /  Google Scholar  /  GitHub  /  LinkedIn

📣 I am looking for PhD opportunities in these directions starting in Fall 2025, Spring 2026, or Fall 2026. I would greatly appreciate hearing about any available positions or suggestions.


Education

News

  • [2025.08] One paper (ModalPrompt) is accepted to EMNLP 2025!!
  • [2025.06] One paper (FCIT) is accepted to ICCV 2025. Congratulations to all my collaborators!
  • [2025.05] Two papers (HiDe-LLaVA, ChartEdit) are accepted to ACL 2025. Congratulations to all my collaborators!
  • [2025.02] One paper (MambaIC) is accepted to CVPR 2025!!
  • [2025.01] One paper (Local-Prompt) is accepted to ICLR 2025.

Publications

* indicates equal contribution

ModalPrompt: Towards Efficient Multimodal Continual Instruction Tuning with Dual-Modality Guided Prompt
Fanhu Zeng, Fei Zhu, Haiyang Guo, Xu-Yao Zhang, Cheng-Lin Liu
The 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
arXiv / Code
Federated Continual Instruction Tuning
Haiyang Guo, Fanhu Zeng, Fei Zhu, Wenzhuo Liu, Da-Han Wang, Jian Xu, Xu-Yao Zhang, Cheng-Lin Liu
International Conference on Computer Vision (ICCV), 2025
arXiv / Code
HiDe-LLaVA: Hierarchical Decoupling for Continual Instruction Tuning of Multimodal Large Language Model
Haiyang Guo*, Fanhu Zeng*, Ziwei Xiang, Fei Zhu, Da-Han Wang, Xu-Yao Zhang, Cheng-Lin Liu
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Paper / arXiv / Code
MambaIC: State Space Models for High-Performance Learned Image Compression
Fanhu Zeng, Hao Tang, Yihua Shao, Siyu Chen, Ling Shao, Yan Wang
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
Paper / arXiv / Code
Local-Prompt: Extensible Local Prompts for Few-Shot Out-of-Distribution Detection
Fanhu Zeng, Zhen Cheng, Fei Zhu, Hongxin Wei, Xu-Yao Zhang
The Thirteenth International Conference on Learning Representations (ICLR), 2025
Paper / arXiv / Code
M2M-TAG: Training-Free Many-to-Many Token Aggregation for Vision Transformer Acceleration
Fanhu Zeng, Deli Yu
Workshop on Machine Learning and Compression, Neural Information Processing Systems (NeurIPS), 2024
Paper
MCITlib: Multimodal Continual Instruction Tuning Library and Benchmark
Haiyang Guo, Fei Zhu, Hongbo Zhao, Fanhu Zeng, Wenzhuo Liu, Shijie Ma, Da-Han Wang, Xu-Yao Zhang
Workshop on Multimodal Continual Learning, International Conference on Computer Vision (ICCV), 2025
arXiv / Code
EventVAD: Training-Free Event-Aware Video Anomaly Detection
Yihua Shao, Haojin He, Sijie Li, Siyu Chen, Xinwei Long, Fanhu Zeng, Yuxuan Fan, Muyang Zhang, Ziyang Yan, Ao Ma, Xiaochen Wang, Hao Tang, Yan Wang, Shuyan Li
The 33rd ACM International Conference on Multimedia (ACM MM), 2025
arXiv / Code
ChartEdit: How Far Are MLLMs From Automating Chart Analysis? Evaluating MLLMs’ Capability via Chart Editing
Xuanle Zhao*, Xuexin Liu*, Haoyue Yang*, Xianzhen Luo, Fanhu Zeng, Jianling Li, Qi Shi, Chi Chen
Findings of the Association for Computational Linguistics (ACL), 2025
Paper / arXiv / Code

Preprints

Continual Learning for Generative AI: From LLMs to MLLMs and Beyond
Haiyang Guo, Fanhu Zeng, Fei Zhu, Jiayi Wang, Xukai Wang, Jingang Zhou, Hongbo Zhao, Wenzhuo Liu, Shijie Ma, Da-Han Wang, Xu-Yao Zhang, Cheng-Lin Liu
arXiv / Code
Token Reduction Should Go Beyond Efficiency in Generative Models – From Vision, Language to Multimodality
Zhenglun Kong*, Yize Li*, Fanhu Zeng, Lei Xin, Shvat Messica, Xue Lin, Pu Zhao, Manolis Kellis, Hao Tang, Marinka Zitnik
arXiv / Code
TR-DQ: Time-Rotation Diffusion Quantization
Yihua Shao, Deyang Lin, Fanhu Zeng, Minxi Yan, Muyang Zhang, Siyu Chen, Yuxuan Fan, Ziyang Yan, Haozhe Wang, Jingcai Guo, Yan Wang, Haotong Qin, Hao Tang
arXiv
Towards Efficient and General-Purpose Few-Shot Misclassification Detection for Vision-Language Models
Fanhu Zeng, Zhen Cheng, Fei Zhu, Xu-Yao Zhang
arXiv
Parameter Efficient Merging for Multimodal Large Language Models with Complementary Parameter Adaptation
Fanhu Zeng, Haiyang Guo, Fei Zhu, Li Shen, Hao Tang
arXiv
Dynamic Knowledge Consolidation for Rehearsal-Free Continual Learning
Haiyang Guo, Fei Zhu, Fanhu Zeng, Bing Liu, Xu-Yao Zhang
arXiv
PPT: Token Pruning and Pooling for Efficient Vision Transformers
Xinjian Wu, Fanhu Zeng, Xiudong Wang, Xinghao Chen
arXiv / Code

Academic Services

  • Conference Reviewer: EMNLP, NeurIPS, ICLR, CVPR, ICCV



    © Fanhu Zeng | Last updated: August, 2025