Fanhu Zeng
I'm currently a Master's student at the Institute of Automation, Chinese Academy of Sciences (CASIA), supervised by Prof. Xu-Yao Zhang and Prof. Cheng-Lin Liu. My major is Pattern Recognition and Intelligent Systems.
Before that, I obtained my Bachelor's degree from Nanjing University of Aeronautics and Astronautics (NUAA), majoring in Automation.
My research interests lie in:
1. Trustworthy AI
   - Uncertainty Estimation
   - Out-of-Distribution Detection
   - Continual Learning
2. Multimodal Large Language Models
   - Prompt Learning/LoRA
   - Downstream Application
3. Efficient AI
   - Model Merging
   - Token Compression/Model Acceleration
   - Image Compression
I am particularly interested in recent work that combines Multimodal Large Language Models with Trustworthy AI, on topics such as hallucination and continual instruction tuning.
Email / Google Scholar / GitHub / LinkedIn
📣 I am looking for PhD opportunities in these directions starting in Fall 2025, Spring 2026, or Fall 2026. I would greatly appreciate hearing about any available positions or suggestions.
News
- [2025.05] One paper (FCIT) has been accepted to ICCV 2025. Congratulations to all my collaborators!
- [2025.05] Two papers (HiDe-LLaVA, ChartEdit) have been accepted to ACL 2025. Congratulations to all my collaborators!
- [2025.01] One paper (Local-Prompt) has been accepted to ICLR 2025.
Publications
* indicates equal contribution
Federated Continual Instruction Tuning
Haiyang Guo, Fanhu Zeng, Fei Zhu, Wenzhuo Liu, Da-Han Wang, Jian Xu, Xu-Yao Zhang, Cheng-Lin Liu
International Conference on Computer Vision (ICCV), 2025
arXiv / Code
HiDe-LLaVA: Hierarchical Decoupling for Continual Instruction Tuning of Multimodal Large Language Model
Haiyang Guo*, Fanhu Zeng*, Ziwei Xiang, Fei Zhu, Da-Han Wang, Xu-Yao Zhang, Cheng-Lin Liu
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL), 2025
arXiv / Code
Local-Prompt: Extensible Local Prompts for Few-Shot Out-of-Distribution Detection
Fanhu Zeng, Zhen Cheng, Fei Zhu, Hongxin Wei, Xu-Yao Zhang
The Thirteenth International Conference on Learning Representations (ICLR), 2025
Paper / arXiv / Code
EventVAD: Training-Free Event-Aware Video Anomaly Detection
Yihua Shao, Haojin He, Sijie Li, Siyu Chen, Xinwei Long, Fanhu Zeng, Yuxuan Fan, Muyang Zhang, Ziyang Yan, Ao Ma, Xiaochen Wang, Hao Tang, Yan Wang, Shuyan Li
The 33rd ACM International Conference on Multimedia (ACM MM), 2025
arXiv / Code
ChartEdit: How Far Are MLLMs From Automating Chart Analysis? Evaluating MLLMs’ Capability via Chart Editing
Xuanle Zhao*, Xuexin Liu*, Haoyue Yang*, Xianzhen Luo, Fanhu Zeng, Jianling Li, Qi Shi, Chi Chen
Findings of the Association for Computational Linguistics (ACL), 2025
arXiv / Code
M2M-TAG: Training-Free Many-to-Many Token Aggregation for Vision Transformer Acceleration
Fanhu Zeng, Deli Yu
Workshop on Machine Learning and Compression, Neural Information Processing Systems (NeurIPS), 2024
Paper
A Comprehensive Survey on Continual Learning in Generative Models
Haiyang Guo, Fanhu Zeng, Fei Zhu, Jiayi Wang, Xukai Wang, Jingang Zhou, Hongbo Zhao, Wenzhuo Liu, Shijie Ma, Da-Han Wang, Xu-Yao Zhang, Cheng-Lin Liu
arXiv / Code
Token Reduction Should Go Beyond Efficiency in Generative Models – From Vision, Language to Multimodality
Zhenglun Kong*, Yize Li*, Fanhu Zeng, Lei Xin, Shvat Messica, Xue Lin, Pu Zhao, Manolis Kellis, Hao Tang, Marinka Zitnik
arXiv / Code
TR-DQ: Time-Rotation Diffusion Quantization
Yihua Shao, Deyang Lin, Fanhu Zeng, Minxi Yan, Muyang Zhang, Siyu Chen, Yuxuan Fan, Ziyang Yan, Haozhe Wang, Jingcai Guo, Yan Wang, Haotong Qin, Hao Tang
arXiv
Towards Efficient and General-Purpose Few-Shot Misclassification Detection for Vision-Language Models
Fanhu Zeng, Zhen Cheng, Fei Zhu, Xu-Yao Zhang
arXiv
Parameter Efficient Merging for Multimodal Large Language Models with Complementary Parameter Adaptation
Fanhu Zeng, Haiyang Guo, Fei Zhu, Li Shen, Hao Tang
arXiv
Dynamic Knowledge Consolidation for Rehearsal-Free Continual Learning
Haiyang Guo, Fei Zhu, Fanhu Zeng, Bing Liu, Xu-Yao Zhang
arXiv
ModalPrompt: Dual-Modality Guided Prompt for Continual Learning of Large Multimodal Models
Fanhu Zeng, Fei Zhu, Haiyang Guo, Xu-Yao Zhang, Cheng-Lin Liu
arXiv
PPT: Token Pruning and Pooling for Efficient Vision Transformers
Xinjian Wu, Fanhu Zeng, Xiudong Wang, Xinghao Chen
arXiv / Code
Academic Services
Conference Reviewer: EMNLP, NeurIPS, ICLR, CVPR, ICCV
© Fanhu Zeng | Last updated: July 2025