Publications
Mixture of Hidden-Dimensions Transformer
Yilong Chen, Junyuan Shang, Zhenyu Zhang, Jiawei Sheng, Tingwen Liu, Shuohuan Wang, Yu Sun, Hua Wu, Haifeng Wang
ICML'25 | International Conference on Machine Learning
pdf | abstract | cite
                    
Inner Thinking Transformer: Leveraging Dynamic Depth Scaling to Foster Adaptive Internal Thinking
Yilong Chen, Junyuan Shang, Zhenyu Zhang, Yanxi Xie, Jiawei Sheng, Tingwen Liu, Shuohuan Wang, Yu Sun, Hua Wu, Haifeng Wang
ACL'25 | Annual Meeting of the Association for Computational Linguistics
pdf | abstract | cite
                    
DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
Yilong Chen∗, Linhao Zhang∗, Junyuan Shang, Zhenyu Zhang, Tingwen Liu, Shuohuan Wang, Yu Sun
(* = Equal Contribution)
NeurIPS'24 | Annual Conference on Neural Information Processing Systems
pdf | abstract | video | cite
                     
                     
                 
LEMON: Reviving Stronger and Smaller LMs from Larger LMs with Linear Parameter Fusion
Yilong Chen, Junyuan Shang, Zhenyu Zhang, Shiyao Cui, Tingwen Liu, Shuohuan Wang, Yu Sun, Hua Wu
ACL'24 Oral | Annual Meeting of the Association for Computational Linguistics
pdf | abstract | video | cite
                 
NACL: A General and Effective KV Cache Eviction Framework for LLMs at Inference Time
Yilong Chen∗, Guoxia Wang∗, Junyuan Shang, Shiyao Cui, Zhenyu Zhang, Tingwen Liu, Shuohuan Wang, Yu Sun, Dianhai Yu, Hua Wu
(* = Equal Contribution)
ACL'24 | Annual Meeting of the Association for Computational Linguistics
pdf | abstract | video | cite
                    
MoR: Mixture of Ranks for Low-Rank Adaptation Tuning
Chuanyu Tang∗, Yilong Chen∗, Zhenyu Zhang, Junyuan Shang, Wenyuan Zhang, Yong Huang, Tingwen Liu
(* = Equal Contribution)
arXiv | Preprint
pdf | abstract | cite
                    
Improving Adaptive Knowledge Graph Construction via Large Language Models with Multiple Views
Yilong Chen, Shiyao Cui, Kun Huang, Shicheng Wang, Chuanyu Tang, Tingwen Liu, Binxing Fang
CCKS'23 | China Conference on Knowledge Graph and Semantic Computing
pdf | abstract | cite
                    
FFT: Towards Harmlessness Evaluation and Analysis for LLMs with Factuality, Fairness, Toxicity
Shiyao Cui, Zhenyu Zhang, Yilong Chen, Wenyuan Zhang, Tianyun Liu, Siqi Wang, Tingwen Liu
KDD'24 Workshop | ACM SIGKDD Conference on Knowledge Discovery and Data Mining
pdf | abstract | cite
                    
Talks
LEMON: Reviving Stronger and Smaller LMs from Larger LMs with Linear Parameter Fusion
As smaller language models draw increasing attention for cost-effective and flexible deployment, LEMON strengthens them by fusing parameters from larger models to create strong initialization points for training. The approach uses layer and dimension operators with controllable receptive fields to transform parameters to smaller scales; a sketch of the dimension-operator idea follows below.
CentralWorld, Bangkok, Thailand
video
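The dimension operator can be pictured as a pair of learnable projections that compress a larger weight matrix from both sides. Below is a minimal PyTorch sketch of that idea under my own naming; it illustrates linear parameter fusion in spirit and is not LEMON's actual code.

    import torch
    import torch.nn as nn

    class DimensionFusion(nn.Module):
        """Compress a (d_large x d_large) weight to (d_small x d_small)
        via learnable side projections (hypothetical illustration)."""
        def __init__(self, d_large: int, d_small: int):
            super().__init__()
            self.P_out = nn.Parameter(torch.randn(d_small, d_large) / d_large ** 0.5)
            self.P_in = nn.Parameter(torch.randn(d_large, d_small) / d_large ** 0.5)

        def forward(self, W_large: torch.Tensor) -> torch.Tensor:
            # (d_small, d_large) @ (d_large, d_large) @ (d_large, d_small)
            return self.P_out @ W_large @ self.P_in

    # Usage: turn a weight from a larger checkpoint into an initialization
    # for the corresponding layer of a smaller model.
    fusion = DimensionFusion(d_large=4096, d_small=1024)
    W_large = torch.randn(4096, 4096)   # weight taken from the larger LM
    W_small = fusion(W_large)           # strong starting point for training
    print(W_small.shape)                # torch.Size([1024, 1024])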
General Information Extraction based on Instruction Fine-tuning LLMs (in Chinese)
An introduction to my work on information extraction (IE) via instruction fine-tuning of LLMs. The resulting IE models show strong performance and generalization, providing a foundation for constructing knowledge graphs; a sketch of the data format follows below.
University of Chinese Academy of Sciences, Beijing, China
video
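As a rough picture of the training setup, here is what a single instruction-format IE sample might look like; the field names and the triple format are my own illustrative assumptions, not the talk's actual data schema.

    # Hypothetical instruction-tuning sample for relational triple extraction.
    ie_sample = {
        "instruction": (
            "Extract all (head entity, relation, tail entity) triples "
            "from the input sentence."
        ),
        "input": "Marie Curie won the Nobel Prize in Physics in 1903.",
        "output": "(Marie Curie, won, Nobel Prize in Physics)",
    }

    # Swapping only the instruction reuses the same schema for other IE
    # tasks (NER, event extraction), which is what lets one instruction-tuned
    # model generalize and serve as a basis for KG construction.
    prompt = f"{ie_sample['instruction']}\n\nInput: {ie_sample['input']}\nOutput:"
    print(prompt)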
Competitions
                        Champion in Instruction-driven Adaptive Knowledge Graph Construction (1/664)
                        CCKS'23 | Sponsored by Chinese Information Processing Society of China (CIPS)
A competition on the Tianchi platform for adaptive knowledge graph construction and inductive relation reasoning using large language models such as ChatGPT.
                           Champion in Evaluation of Large Language Models (1/481)
                           LIC'23 | Sponsored by China Computer Federation (CCF) and Chinese Information Processing Society of China (CIPS)
Proposed a harmlessness evaluation system covering factuality, fairness, and toxicity, used it to evaluate the multi-dimensional safety of LLMs, and provided improvement suggestions.
Honours and Awards
                        Graduate National Scholarship (Top 1%)
Awarded annually to 10,000 PhD students in China
                        Recognizes graduate students for outstanding academic and research achievements.
                    
                        Zhu Li Yuehua Scholarship (Top 1%)
                        Established to support and reward outstanding graduate students in China
                        Recognizes academic excellence, research potential, and contributions to society.
                    
                        Excellent Graduation Project of Beijing University of Posts and Telecommunications (Top 0.5%)
Data Mining for Controversy Detection, 2022. Advised by Dr. Kleomenis Katevas.
Combined RoBERTa with a CNN to build three classification models and proposed a semi-supervised controversy-detection application to streamline data annotation; a sketch of the model follows below.
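A minimal sketch of such a RoBERTa-plus-CNN classifier, assuming the Hugging Face transformers library and hyperparameters of my own choosing rather than the project's actual configuration:

    import torch
    import torch.nn as nn
    from transformers import AutoModel, AutoTokenizer

    class RobertaCNNClassifier(nn.Module):
        """RoBERTa encoder + 1-D CNN over token states (illustrative)."""
        def __init__(self, num_classes=2, channels=128, kernel_size=3):
            super().__init__()
            self.encoder = AutoModel.from_pretrained("roberta-base")
            hidden = self.encoder.config.hidden_size
            self.conv = nn.Conv1d(hidden, channels, kernel_size,
                                  padding=kernel_size // 2)
            self.classifier = nn.Linear(channels, num_classes)

        def forward(self, input_ids, attention_mask):
            h = self.encoder(input_ids=input_ids,
                             attention_mask=attention_mask).last_hidden_state
            h = torch.relu(self.conv(h.transpose(1, 2)))  # (B, C, T)
            h = h.max(dim=2).values                       # max-pool over tokens
            return self.classifier(h)                     # (B, num_classes)

    tokenizer = AutoTokenizer.from_pretrained("roberta-base")
    batch = tokenizer(["Is this topic controversial?"], return_tensors="pt")
    logits = RobertaCNNClassifier()(batch["input_ids"], batch["attention_mask"])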
                    
                        Beijing Outstanding Graduate (Top 5%)
Awarded in 2022 by the Beijing Education Commission
                        Ranked among the top 5% of all graduates in Beijing based on comprehensive evaluation.
                    