I am a second-year master's student at the School of AI, Beihang University (BUAA), supervised by Prof. Lei Sha. I have also had the honor of being advised by Jing Shao, JG Yao, and Junxian He.

My previous research focused on AI safety alignment and long-horizon reasoning, and I am now seeking a PhD position for Fall 2027.

🔥 News

2026.04: 🎉🎉 SSP is accepted by ACL 2026 Findings.
2026.02: 🎉🎉 ReVeL is accepted by CVPR 2026.
2025.08: 🎉🎉 Two papers (LARF and DiffusionAttacker) are accepted by EMNLP 2025, and DiffusionAttacker is selected as Oral Presentation.
2025.03: 🎉🎉 Two papers (ActorBreaker and VLSBench) are accepted by ACL 2025, and ActorBreaker is selected as Outstanding Paper.
2024.09: 🎉🎉 ASETF is accepted by EMNLP 2024 and selected as Oral Presentation.

📝 Publications

ACL 2026 Findings

Be Your Own Red Teamer: Safety Alignment via Self-Play and Reflective Experience Replay

Hao Wang, Yanting Wang, Hao Li, Rui Li, Lei Sha

CVPR 2026

Beyond Multiple Choice: Verifiable OpenQA for Robust Vision-Language RFT

Yesheng Liu, Hao Li, Haiyu Xu, Baoqi Pei, Jiahao Wang, Mingxuan Zhao, Jingshu Zheng, Zheqi He, JG Yao, Bowen Qin, Xi Yang, Jiajun Zhang

EMNLP 2025

Layer-Aware Representation Filtering: Purifying Finetuning Data to Preserve LLM Safety Alignment

Hao Li*, Lijun Li*, Zhenghao Lu, Xianyi Wei, Rui Li, Jing Shao, Lei Sha

EMNLP 2025 Oral

DiffusionAttacker: Diffusion-Driven Prompt Manipulation for LLM Jailbreak

Hao Wang, Hao Li, Junda Zhu, Xinyuan Wang, Chengwei Pan, Minlie Huang, Lei Sha

ACL 2025 Outstanding Paper

LLMs know their vulnerabilities: Uncover Safety Gaps through Natural Distribution Shifts

Qibing Ren*, Hao Li*, Dongrui Liu, Zhanxu Xie, Xiaoya Lu, Yu Qiao, Lei Sha, Junchi Yan, Lizhuang Ma, Jing Shao

ACL 2025

VLSBench: Unveiling Visual Leakage in Multimodal Safety

Xuhao Hu, Dongrui Liu, Hao Li, Xuanjing Huang, Jing Shao

EMNLP 2024 Oral

ASETF: A Novel Method for Jailbreak Attack on LLMs through Translate Suffix Embeddings

Hao Wang*, Hao Li*, Minlie Huang, Lei Sha

📝 Preprints

SafeSteer: Localized On-Policy Distillation for Efficient Safety Alignment, Hao Li*, Jingkun An*, Zijun Song*, Pengyu Zhu, Rui Li, Hao Wang, Wendi Feng, Yesheng Liu, Lijun Li, Jin-Ge Yao, Lei Sha
AISafetyLab: A Comprehensive Framework for AI Safety Evaluation and Improvement, CoAI & Lesca Group
SafeWork-R1: Coevolving Safety and Intelligence under the AI-45 Law, Shanghai AI Lab

📖 Education

2024.09 - present, Master, Beihang University, Beijing.
2020.09 - 2024.06, Bachelor, Beihang University, Beijing.

🎖 Selected Honors and Awards

2025, National Scholarship in China.
2023, Special Prize (Top 1) in the “Challenge Cup” Competition of Science Achievement in China.

🧩 Academic Services

Conference Reviewer: ACL, EMNLP, NAACL
Workshop Challenge Organizer: Trustworthy Multi-modal Foundation Models and AI Agents (TiFA) at ICML 2024.

💻 Internships

2026.03 - 2026.06, Agentic RL, TikTok AI Innovation Center
2025.08 - 2026.01, VLM post-training & evaluation, BAAI
2024.07 - 2025.07, LLM and VLM safety, Shanghai AI Lab