I’m a Ph.D. student at the State Key Lab of CAD&CG, Zhejiang University, under the supervision of Prof. Wei Chen.

I’m currently interested in agents and (M)LLMs.

πŸ”₯ News

  • 2025.11: πŸŽ‰πŸŽ‰ One paper accepted by AAAI 2026 as an Oral presentation
  • 2024.05: πŸŽ‰πŸŽ‰ One paper accepted by ACL 2024

πŸ“ Publications

ACL 2024
SDFT image

Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning

PDF | Code | Poster | Slides

Zhaorui Yang, Tianyu Pang, Haozhe Feng, Han Wang, Wei Chen, Minfeng Zhu, Qian Liu

  • Fine-tuning LLMs for specific tasks often struggles to balance task performance with preserving general instruction-following abilities. In this work, we posit that the distribution gap between task datasets and the LLM is the primary underlying cause. To address this, we introduce Self-Distillation Fine-Tuning (SDFT), which bridges the distribution gap by guiding fine-tuning with a distilled dataset generated by the model itself.
AAAI 2026 (Oral)
Multimodal DeepResearcher framework

Multimodal DeepResearcher: Generating Text-Chart Interleaved Reports From Scratch with Agentic Framework

PDF | Code | Poster | Slides | Project Page

Zhaorui Yang*, Bo Pan*, Han Wang*, Yiyao Wang, Xingyu Liu, Luoxuan Weng, Yingchaojie Feng, Haozhe Feng, Minfeng Zhu, Bo Zhang, Wei Chen

  • Existing deep research works primarily focus on generating text-only content, leaving the automated generation of interleaved texts and visualizations underexplored. In this work, we propose a structured textual representation for visualizations, and introduce an agentic framework that automatically generates comprehensive multimodal reports from scratch with interleaved texts and visualizations.
arXiv:2304.06627
CoSDA setting image

CoSDA: Continual Source-Free Domain Adaptation

PDF | Code

Haozhe Feng*, Zhaorui Yang*, Hesun Chen*, Tianyu Pang, Chao Du, Minfeng Zhu, Wei Chen, Shuicheng Yan

  • In this work, we investigate the mechanism of catastrophic forgetting in previous Source-Free Domain Adaptation (SFDA) approaches. We observe a trade-off between adaptation gain and forgetting loss. Motivated by these findings, we propose CoSDA, which outperforms state-of-the-art approaches in continual adaptation.

πŸŽ– Honors and Awards

  • 2022.12 China National Scholarship (Undergraduate).
  • 2021.12 China National Scholarship (Undergraduate).

πŸ“– Education

  • 2023.09 - Present
    Ph.D. student in Software Engineering at State Key Lab of CAD&CG, Zhejiang University
  • 2019.09 - 2023.06
    B.E. in Software Engineering, Xi’an Jiaotong University

πŸ’» Internships

  • 2025.07 - Present: TEG, Tencent
    Developing an LLM-powered Data Agent that automates Text-to-Insight analysis over enterprise data warehouses containing 1,000+ tables per user, enabling natural-language queries for complex data exploration and business intelligence.