About Me
I am Yuyang Ding (丁誉洋), a third-year Ph.D. student at the Institute of Artificial Intelligence, Soochow University, advised by Assoc. Prof. Juntao Li and Prof. Min Zhang.
My research lies in LLM Reasoning, with particular interests in reinforcement learning, test-time scaling, and robust learning.
I am currently a research intern at Seed-Infrastructures, ByteDance, contributing to verl, a reinforcement learning framework for LLMs. My research focuses on the joint optimization of algorithms and infrastructure to enable scalable and efficient reinforcement learning.
Publications
* denotes equal contribution.

FAPO: Flawed-Aware Policy Optimization for Efficient and Reliable Reasoning
Yuyang Ding, Chi Zhang, Juntao Li, Haibin Lin, Xin Liu, Min Zhang
TL;DR: We propose Flawed-Aware Policy Optimization (FAPO), which penalizes flawed patterns to achieve more efficient and reliable reinforcement learning.

SCAN: Self-Denoising Monte Carlo Annotation for Robust Process Reward Learning
Yuyang Ding, Xinyu Shi, Juntao Li, Xiaobo Liang, Zhaopeng Tu, Min Zhang
TL;DR: We propose Self-Denoising Monte Carlo Annotation (SCAN), an efficient Process Reward Model (PRM) data synthesis and noise-tolerant learning framework.

ScaleQuest: Unleashing LLM Reasoning Capability via Scalable Question Synthesis from Scratch
Yuyang Ding, Xinyu Shi, Xiaobo Liang, Juntao Li, Zhaopeng Tu, Qiaoming Zhu, Min Zhang
TL;DR: We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs.

DS-NER: Unveiling and Addressing Latent Noise in Distant Annotations
Yuyang Ding, Dan Qiao, Juntao Li, Jiajie Xu, Pingfu Chao, Xiaofang Zhou, Min Zhang
TL;DR: We investigate the noise distribution in distantly supervised annotations and propose targeted denoising and robust training strategies.

GNER: Rethinking Negative Instances for Generative Named Entity Recognition
Yuyang Ding, Juntao Li, Pinzheng Wang, Zecheng Tang, Bowen Yan, Min Zhang
TL;DR: We introduce GNER, a Generative Named Entity Recognition framework, which demonstrates enhanced zero-shot capabilities across unseen entity domains.
- COLING 2022, SelfMix: Robust Learning Against Textual Label Noise with Self-Mixup Training, Dan Qiao*, Chenchen Dai*, Yuyang Ding*, Juntao Li, Qiang Chen, Wenliang Chen, Min Zhang
- SCIS (CCF-A), OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch, Juntao Li*, Zecheng Tang*, Yuyang Ding*, Pinzheng Wang*, Pei Guo, Wangjie You, Dan Qiao, Wenliang Chen, Guohong Fu, Qiaoming Zhu, Guodong Zhou, Min Zhang
- EMNLP 2023, CMD: a framework for Context-aware Model self-Detoxification, Zecheng Tang, Keyan Zhou, Juntao Li, Yuyang Ding, Pinzheng Wang, Bowen Yan, Renjie Hua, Min Zhang
Honors and Awards
- CCF Elite Collegiate Award
- ICPC National Invitational Programming Contest, Gold Medal
- ICPC Asia-East Continent Final Contest (EC-Final), Silver Medal
Education
- 2023.09 - current, Ph.D. Student, Institute of Artificial Intelligence, Soochow University
- 2019.09 - 2023.06, B.Eng., School of Computer Science and Technology, Soochow University
Internships
- 2025.06 - current, Research Intern, ByteDance Seed, Shanghai, China