About Me

I'm currently a third-year PhD student at CMU ECE, where I have the privilege of working with Prof. Matt Fredrikson.

Previously, I was a master's student at the Institute of Automation, Chinese Academy of Sciences, working with Prof. Liang Wang and Prof. Yan Huang.
Before that, I received my bachelor's degree from the University of Chinese Academy of Sciences (UCAS). I also had a great time as a visiting student in EECS at the University of California, Berkeley.

I'm fascinated by robustness, attacks, and defenses for LLMs, multimodal LLMs, and generative models.

Research

See the full publication list on [Google Scholar].

LLM Generated Code Security

A Mixture of Linear Corrections Generates Secure Code.
Weichen Yu, Ravi Mangal, Terry Zhuo, Matt Fredrikson, Corina S Pasareanu.

[paper]

TL;DR: Using a bank of steering vectors to generate secure code.

PrivCode: When Code Generation Meets Differential Privacy
Zheng Liu, Chen Gong, Terry Yue Zhuo, Kecen Li, Weichen Yu, Matt Fredrikson, Tianhao Wang.

NDSS 2026 [paper]

TL;DR: Differentially private code generation with theoretical guarantees for protecting sensitive code.

LLM / Agent Jailbreak

Infecting LLM Agents via Generalizable Adversarial Attack
Weichen Yu, Kai Hu, Tianyu Pang, Chao Du, Min Lin, Matt Fredrikson.

NeurIPS 2024 RedTeaming Workshop (Oral) | COLM 2025. [paper]

TL;DR: With one intervention, a single malicious LLM agent can compromise a group of LLM agents.

Efficient LLM Jailbreak via Adaptive Dense-to-sparse Constrained Optimization
Kai Hu*, Weichen Yu*, Tianjun Yao, Xiang Li, Wenhe Liu, Lijun Yu, Yining Li, Kai Chen, Zhiqiang Shen, Matt Fredrikson.
NeurIPS 2024 (Poster).

[paper] [project]

TL;DR: A white-box optimization-based LLM jailbreak that converges faster using a dense-to-sparse projection.

Other LLM / VLM Attacks

The Trojan in the Vocabulary: Stealthy Sabotage of LLM Composition.
Xiaoze Liu, Weichen Yu, Matt Fredrikson, Xiaoqian Wang, Jing Gao.

[paper]

TL;DR: In model merging, during tokenizer transplantation, "breaker tokens" can be injected into the merged model while the original model's behavior is preserved.

Transferable Adversarial Attacks on Black-Box Vision-Language Models
Kai Hu, Weichen Yu, Li Zhang, Alexander Robey, Andy Zou, Chengming Xu, Haoqi Hu, Matt Fredrikson

[paper]

TL;DR: Black-box adversarial attacks on vision-language models that successfully break GPT, Claude, and others.

Bag of Tricks for Training Data Extraction from Language Models.
Weichen Yu, Tianyu Pang, Qian Liu, Chao Du, Bingyi Kang, Yan Huang, Min Lin, Shuicheng Yan.

International Conference on Machine Learning (ICML), 2023. | [paper] [project]

TL;DR: Extract training data from language models.

Computer Vision: Robustness

Is Your Text-to-Image Model Robust to Caption Noise?
Weichen Yu, Ziyan Yang, Shanchuan Lin, Qi Zhao, Jianyi Wang, Liangke Gui, Matt Fredrikson, Lu Jiang.

CVPR AI4C Workshop (Oral) [paper]

TL;DR: Addresses noise (hallucinations) in VLM-generated captions when training text-to-image models.

Generalized Inter-class Loss for Gait Recognition.
Weichen Yu, Hongyuan Yu, Yan Huang, Liang Wang.
ACM Multimedia (MM), 2022 (Poster).

[paper] [project]

TL;DR: Loss design for shaping a more robust embedding space for gait recognition.

CNTN: Cyclic Noise-Tolerant Network.
Weichen Yu, Hongyuan Yu, Yan Huang, Chunshui Cao, Liang Wang.

[paper]

TL;DR: Algorithm design for video recognition in the presence of noise.

Regularized Graph Structure Learning with Semantic Knowledge for Multi-variates Time-Series Forecasting.
Hongyuan Yu*, Ting Li*, Weichen Yu*, Jianguo Li, Yan Huang, Liang Wang, Alex Liu.
International Joint Conference on Artificial Intelligence (IJCAI), 2022. (Oral)

[paper] [project]

TL;DR: Regularized graph structure learning with semantic knowledge for multivariate time-series forecasting.

Deconfounded Noisy Labels Learning.
Weichen Yu, Hongyuan Yu, Yan Huang, Jianghao Zhang, Qiang Liu, Liang Wang.

[paper]

TL;DR: Causal learning for noisy label learning.

Computer Vision: Performance

GOOD: Training-Free Guided Diffusion Sampling for Out-of-Distribution Detection
Xin Gao, Jiyao Liu, Guanghao Li, Yueming Lyu, Jianxiong Gao, Weichen Yu, Ningsheng Xu, Liang Wang, Caifeng Shan, Ziwei Liu, Chenyang Si.

NeurIPS, 2025 | [paper]

TL;DR: Uses an off-the-shelf in-distribution (ID) classifier to guide diffusion sampling trajectories toward OOD regions.

Sample-Aware RandAugment
Anqi Xiao, Weichen Yu, Hongyuan Yu.

International Journal of Computer Vision (IJCV), 2025 | [paper]

TL;DR: An effective data augmentation method that automatically finds the best augmentation policy for each sample.

Competition

1st Learning and Mining with Noisy Labels Challenge, IJCAI-ECAI 2022.
Runner-up of task 1-1 and second runner-up of task 1-2. [details]
TL;DR:
Weichen Yu, Hongyuan Yu, Yan Huang, Dong An, Keji He, Zhipeng Zhang, Xiuchuan Li, Liang Wang

REVERIE Challenge 2022.
Runner-up of channel 2. [details]
TL;DR:
Dong An, Yifeng Su, Shuanglin Sima, Hongyuan Yu, Weichen Yu, Yan Huang

Training Data Extraction Challenge
Runner-up. Invited to give a talk at SaTML. [details]
TL;DR:
Weichen Yu, Tianyu Pang, Qian Liu, Chao Du, Bingyi Kang, Yan Huang, Min Lin, Shuicheng Yan.

Undergraduate Research

  • SiC MOSFETs for electric vehicle control
  • Advisor: Puqi Ning
    Genetic Algorithm Based SiC MOSFET On-state Resistance Modeling Method.
    Han C, Weichen Y, Xiaoguang C, Puqi N, Xuhui W.
    Journal of Power Supply, 2020, 18(04): 38-44. DOI: 10.13234/j.issn.2095-2805.2020.4.38
    [paper]
  • Power supply noise analysis using autocorrelation
  • Advisor: Seth R. Sanders
    - Developed improved algorithms combining autocorrelation with averaging in Simulink to analyze power supply noise, achieving high resolution (1 mV) at high frequency (30 GHz).
    - Significantly reduced cost from $500 to $20 and improved performance and resolution through incremental optimizations, such as converting from floating-point to fixed-point arithmetic.

Education

2020/09 - 2023/07 : Master's student in CS, Institute of Automation, Chinese Academy of Sciences.

2016/09 - 2020/07 : Bachelor's in EE, University of Chinese Academy of Sciences.

2019/01 - 2019/08 : Visiting student in EECS, University of California, Berkeley.