Undergraduate Student at Peking University (Tong Class).
Advancing Efficient LLM Architectures.
Occasional observer of the world's quiet moments.
I am Pingzhi Tang, an undergraduate student in the General Artificial Intelligence Experimental Program (Tong Class) at Peking University.
I hold a major GPA of 3.92/4.00, and my research focuses on improving the performance and efficiency of large-scale models, with an emphasis on model architecture design and efficient inference.
Beyond the code, I am a photography enthusiast and film lover. I find that the intuition used to optimize a neural network often resonates with the process of composing a frame: both are searches for structure within complexity.
Tong Class.
Major GPA: 3.92/4.00.
* See my CV for a full list of awards.
Advisor: Prof. Muhan Zhang.
Working on efficient inference, model architecture, parameter-efficient fine-tuning (PEFT), and LLM reasoning.
Addressed the KV cache overhead of multi-head latent attention (MLA) under tensor parallelism, achieving 2x inference acceleration.
Fanxu Meng*, Pingzhi Tang*, Zengwei Yao, Xing Sun, Muhan Zhang
Proposed a method that transforms standard multi-head attention into multi-head latent attention, achieving up to 11x inference acceleration with negligible performance loss.
Yiding Wang, Fanxu Meng, Xuefeng Zhang, Fan Jiang, Pingzhi Tang, Muhan Zhang
Proposed a distributed PEFT method for LLMs that increases update flexibility by assigning different weight components to each GPU.
Fanxu Meng, Pingzhi Tang, Fan Jiang, Muhan Zhang
Developed a novel pruning and fine-tuning method for large models based on cross-layer singular value decomposition.
For a complete list of publications, please see my Google Scholar profile or CV.
Coming soon.