I am Pingzhi Tang, an undergraduate student in the General Artificial Intelligence Experimental Program (Tong Class) at Peking University.
With a major GPA of 3.92/4.00, my research centers on making foundation models scalable and accessible by improving their computational efficiency. I’m particularly interested in next-generation model architectures, efficient inference, and reasoning & reinforcement learning.
Beyond academia, I am a photography enthusiast📸 and a film lover🎞️.
Tong Class.
Major GPA: 3.92/4.00.
* See CV for full list of awards.
Advisor: Prof. Muhan Zhang.
Working on efficient inference, model architectures, PEFT, and LLM reasoning.
Addressed the KV cache overhead in tensor-parallel MLA, achieving 2x inference acceleration.
Fanxu Meng*, Pingzhi Tang*, Zengwei Yao, Xing Sun, Muhan Zhang
Proposed a method to transform standard multi-head attention into multi-head latent attention, achieving up to 11x inference acceleration with negligible performance loss.
Fanxu Meng, Pingzhi Tang, Fan Jiang, Muhan Zhang
Developed a novel pruning and fine-tuning method for large models based on cross-layer singular value decomposition.
Yiding Wang, Fanxu Meng, Xuefeng Zhang, Fan Jiang, Pingzhi Tang, Muhan Zhang
Developed a distributed PEFT method for LLMs that increases update flexibility by assigning different weight components to each GPU.
For a complete list of publications, please visit my Google Scholar or CV.
Coming soon.