I am a second-year master's student at the School of AI, Beihang University (BUAA), supervised by Prof. Lei Sha.
My previous research focused on the safety alignment of LLMs and VLMs, and I am now seeking a PhD position for Fall 2027.
News
- 2025.08: Two papers (LARF and DiffusionAttacker) were accepted to EMNLP 2025, and DiffusionAttacker was selected for oral presentation.
- 2025.03: Two papers (ActorBreaker and VLSBench) were accepted to ACL 2025, and ActorBreaker was selected as an Outstanding Paper.
- 2024.09: ASETF was accepted to EMNLP 2024 and selected for oral presentation.
Publications
Layer-Aware Representation Filtering: Purifying Finetuning Data to Preserve LLM Safety Alignment
Hao Li*, Lijun Li*, Zhenghao Lu, Xianyi Wei, Rui Li, Jing Shao, Lei Sha
DiffusionAttacker: Diffusion-Driven Prompt Manipulation for LLM Jailbreak
Hao Wang, Hao Li, Junda Zhu, Xinyuan Wang, Chengwei Pan, Minlie Huang, Lei Sha
LLMs know their vulnerabilities: Uncover Safety Gaps through Natural Distribution Shifts
Qibing Ren*, Hao Li*, Dongrui Liu, Zhanxu Xie, Xiaoya Lu, Yu Qiao, Lei Sha, Junchi Yan, Lizhuang Ma, Jing Shao
VLSBench: Unveiling Visual Leakage in Multimodal Safety
Xuhao Hu, Dongrui Liu, Hao Li, Xuanjing Huang, Jing Shao
ASETF: A Novel Method for Jailbreak Attack on LLMs through Translate Suffix Embeddings
Hao Wang*, Hao Li*, Minlie Huang, Lei Sha
Preprints
- Beyond Multiple Choice: Verifiable OpenQA for Robust Vision-Language RFT, Yesheng Liu, Hao Li, Haiyu Xu, Baoqi Pei, Jiahao Wang, Mingxuan Zhao, Jingshu Zheng, Zheqi He, JG Yao, Bowen Qin, Xi Yang, Jiajun Zhang
- AISafetyLab: A Comprehensive Framework for AI Safety Evaluation and Improvement, CoAI & Lesca Group
- SafeWork-R1: Coevolving Safety and Intelligence under the AI-45 Law, Shanghai AI Lab
Education
- 2024.09 - present, Master's, Beihang University, Beijing.
- 2020.09 - 2024.06, Bachelor's, Beihang University, Beijing.
Selected Honors and Awards
- 2025, National Scholarship in China.
- 2023, Special Prize (Top 1) in the "Challenge Cup" Competition of Science Achievement in China.
Internships
- 2025.08 - present, VLM post-training & evaluation, BAAI, Beijing
- 2024.07 - 2025.07, LLM and VLM safety, Shanghai AI Lab, Beijing and Shanghai