I am Pingzhi Tang, an undergraduate student in the General Artificial Intelligence Experimental Program (Tong Class) at Peking University.
With a major GPA of 3.92/4.00, my research centers on making foundation models scalable and accessible by improving their computational efficiency. I’m particularly interested in next-generation model architectures, efficient inference, and reasoning & reinforcement learning.
Beyond academia, I am a photography enthusiast📸 and a film lover🎞️.
Tong Class.
Major GPA: 3.92/4.00.
* See CV for full list of awards.
Advisor: Prof. Muhan Zhang.
Working on efficient inference, model architectures, PEFT, and LLM reasoning.
Addressed the KV cache overhead in tensor-parallel MLA, achieving 2x inference acceleration.
Fanxu Meng*, Pingzhi Tang*, Zengwei Yao, Xing Sun, Muhan Zhang
Proposed a method to transform standard multi-head attention into multi-head latent attention, achieving up to 11x inference acceleration with negligible performance loss.
Fanxu Meng, Pingzhi Tang, Fan Jiang, Muhan Zhang
Developed a novel pruning and fine-tuning method for large models based on cross-layer singular value decomposition.
Yiding Wang, Fanxu Meng, Xuefeng Zhang, Fan Jiang, Pingzhi Tang, Muhan Zhang
Developed a distributed PEFT method for LLMs that increases update flexibility by assigning different weight components to each GPU.
For a complete list of publications, please visit my Google Scholar or CV.
Coming soon.