Hello, welcome to my page! 🎉🎉
My name is Huihong Shi, and I am currently a 5th-year Ph.D. student in the School of Electronic Science and Engineering at Nanjing University, under the supervision of Prof. Zhongfeng Wang (IEEE Fellow). From April 2021 to December 2022, I was a visiting student at the EIC Lab at Rice University, supervised remotely by Prof. Yingyan Lin. I also visited the EIC Lab at the Georgia Institute of Technology in person from December 2023 to May 2024.
My research interest lies in efficient machine learning systems via holistic algorithm and hardware co-design.
🔥🔥 I am actively seeking Postdoc opportunities starting in Fall 2025! My CV can be viewed and downloaded here!
📖 Education
- 2020.09 - now, Nanjing University, School of Electronic Science and Engineering
  Ph.D. student supervised by Prof. Zhongfeng Wang (IEEE Fellow)
- 2023.12 - 2024.05, Georgia Institute of Technology, School of Computer Science
  Visiting student supervised by Prof. Yingyan (Celine) Lin
- 2021.03 - 2022.12, Rice University (Remote), School of Electrical and Computer Engineering
  Visiting student supervised by Prof. Yingyan (Celine) Lin
- 2016.09 - 2020.06, Jilin University, School of Communication Engineering, Bachelor (Top 2%)
📝 Publications
Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer.
Huihong Shi, Haikuo Shao, Wendong Mao, Zhongfeng Wang
[Journal] IEEE Transactions on Circuits and Systems I (TCAS-I 2024)
[Abstract] We propose Trio-ViT to marry the hardware efficiency of both quantization and efficient Vision Transformer (ViT) architectures, which (1) eliminate the troublesome Softmax and (2) integrate linear attention with low computational complexity.
An FPGA-Based Reconfigurable Accelerator for Convolution-Transformer Hybrid EfficientViT.
Haikuo Shao, Huihong Shi, Wendong Mao, Zhongfeng Wang
[Conference] IEEE International Symposium on Circuits and Systems (ISCAS 2024)
[Abstract] We propose an FPGA-based accelerator for EfficientViT to advance the hardware efficiency frontier of ViTs. Specifically, we design a reconfigurable architecture to efficiently support various operation types, including lightweight convolutions and attention, boosting hardware utilization. Additionally, we present a time-multiplexed and pipelined dataflow to facilitate both intra- and inter-layer fusions, reducing off-chip data access costs.
NASH: Neural Architecture and Accelerator Search for Multiplication-Reduced Hybrid Models.
Yang Xu*, Huihong Shi*, Zhongfeng Wang (*Co-first Authors)
[Journal] IEEE Transactions on Circuits and Systems I (TCAS-I 2024)
[Abstract] We propose a Neural Architecture and Accelerator Co-Search framework (NASH) for multiplication-reduced hybrid models. We introduce a tailored zero-shot metric for architecture search to pre-identify promising models, enhancing efficiency and reducing gradient conflicts. For accelerator search, we use a coarse-to-fine approach to streamline the process. By integrating both searches, NASH achieves optimal model and accelerator pairing.
P$^2$-ViT: Power-of-Two Post-Training Quantization and Acceleration for Fully Quantized Vision Transformer.
Huihong Shi, Xin Cheng, Wendong Mao, Zhongfeng Wang
[Journal] IEEE Transactions on Very Large Scale Integration Systems (TVLSI 2024)
[Abstract] We propose a Power-of-Two (PoT) post-training quantization and acceleration framework (P$^2$-ViT) for Vision Transformers (ViTs). We first propose a quantization scheme with PoT scaling factors to minimize re-quantization overhead, and then develop a dedicated accelerator to enhance throughput.
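To give an intuition for why PoT scaling factors reduce re-quantization overhead, here is a minimal NumPy sketch of the general idea (the function name, bit widths, and rounding choice are my own assumptions, not the paper's exact scheme): when the combined scale is a power of two, re-quantization collapses into a cheap integer shift instead of a floating-point multiply.

```python
import numpy as np

def requantize_pot(acc: np.ndarray, shift: int) -> np.ndarray:
    """Re-quantize int32 accumulators to int8 when the combined scale
    is a power of two (2**-shift): a rounding right-shift replaces the
    floating-point multiply required for arbitrary scales."""
    rounding = 1 << (shift - 1)       # add 0.5 ulp for round-to-nearest
    out = (acc + rounding) >> shift   # pure integer arithmetic
    return np.clip(out, -128, 127).astype(np.int8)

print(requantize_pot(np.array([1000, -517, 42], dtype=np.int32), shift=4))
# -> [ 63 -32   3]
```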
NASA-F: FPGA-Oriented Search and Acceleration for Multiplication-Reduced Hybrid Networks.
Huihong Shi, Yang Xu, Wendong Mao, Zhongfeng Wang
[Journal] IEEE Transactions on Circuits and Systems I (TCAS-I 2024)
[Abstract] We propose an FPGA-oriented search and acceleration framework called NASA-F for multiplication-reduced hybrid models. Specifically, we fully leverage the diverse hardware resources (DSPs and LUTs) available on FPGAs to accelerate heterogeneous layers in hybrid models, aiming to enhance both hardware utilization and throughput.
ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer.
Haoran You*, Huihong Shi*, Yipin Guo*, Zhongfeng Wang (*Co-first Authors)
[Conference] Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023)
[Abstract] We reparameterize pre-trained Vision Transformers (ViTs) with a mixture of multiplication primitives, such as bitwise shifts and additions, to obtain ShiftAddViT. This approach aims to achieve end-to-end inference speedups on GPUs without requiring training from scratch.
ViTALiTy: Unifying Low-rank and Sparse Approximation for Vision Transformer Acceleration with a Linear Taylor Attention.
Jyotikrishna Dass*, Shang Wu*, Huihong Shi*, Chaojian Li, Zhifan Ye, Zhongfeng Wang, Yingyan Lin (*Co-first Authors)
[Conference] International Symposium on High-Performance Computer Architecture (HPCA 2023)
[Abstract] We propose ViTALiTy to boost the inference efficiency of Vision Transformers (ViTs) via algorithm-hardware co-design. We first approximate the vanilla Softmax with first-order Taylor attention for linear complexity and unify low-rank and sparse components to enhance accuracy. We further develop a dedicated accelerator that leverages the linearized workload to improve hardware efficiency.
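For intuition, below is a minimal NumPy sketch of the low-rank half of this idea (the function name and normalization details are my own simplification, and the paper's attention additionally unifies a sparse component): expanding exp(q·k) ≈ 1 + q·k lets KᵀV be factored out once, so the N×N attention map is never materialized.

```python
import numpy as np

def taylor_linear_attention(Q, K, V):
    """First-order Taylor attention: exp(q.k) ~= 1 + q.k, so attention
    becomes (V.sum + Q @ (K^T V)) / normalizer in O(N) rather than O(N^2)."""
    Q = Q / np.linalg.norm(Q, axis=-1, keepdims=True)  # keep q.k bounded
    K = K / np.linalg.norm(K, axis=-1, keepdims=True)
    kv = K.T @ V                                  # (d, d): independent of N
    num = V.sum(axis=0, keepdims=True) + Q @ kv   # sum_j (1 + q.k_j) * v_j
    den = K.shape[0] + Q @ K.sum(axis=0)          # sum_j (1 + q.k_j)
    return num / den[:, None]
```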
NASA$+$: Neural Architecture Search and Acceleration for Multiplication-Reduced Hybrid Networks.
Huihong Shi, Haoran You, Zhongfeng Wang, Yingyan Lin
[Journal] IEEE Transactions on Circuits and Systems I (TCAS-I 2023)
[Abstract] Inspired by the fact that multiplications can be mathematically decomposed into bit-wise shifts and additions, we design reconfigurable PEs to simultaneously support multiplication-based convolutions, shift, and adder layers, thus enhancing the flexibility of our accelerator.
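The decomposition mentioned above is just the binary expansion of the weight; a toy software illustration of it (not the paper's reconfigurable PE design) looks like this:

```python
def mult_by_shift_add(x: int, w: int) -> int:
    """Compute x * w using only shifts and adds: for every set bit i
    of w, accumulate the shifted partial product x << i."""
    acc, i, w_abs = 0, 0, abs(w)
    while w_abs:
        if w_abs & 1:
            acc += x << i   # shifted partial product
        w_abs >>= 1
        i += 1
    return -acc if w < 0 else acc

assert mult_by_shift_add(7, -12) == 7 * -12   # -84
```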
NASA: Neural Architecture Search and Acceleration for Hardware Inspired Hybrid Networks.
Huihong Shi, Haoran You, Yang Zhao, Zhongfeng Wang, Yingyan Lin
[Conference] International Conference on Computer-Aided Design (ICCAD 2022)
[Abstract] We propose a Neural Architecture Search and Acceleration framework (NASA) to enable automated search and acceleration of multiplication-reduced models, aiming to marry the powerful performance of multiplication-based models and the hardware efficiency of multiplication-free models.
LITNet: A Light-weight Image Transform Net for Image Style Transfer.
Huihong Shi, Wendong Mao, Zhongfeng Wang
[Conference] International Joint Conference on Neural Networks (IJCNN 2021)
[Abstract] We propose a compression algorithm for one of the influential CNN-based style transfer networks, named Image Transform Net (ITNet), resulting in a lightweight model called Light-weight Image Transform Net (LITNet). Additionally, we introduce a novel distillation loss to convert unsupervised learning into supervised learning, aiming to enhance generation quality.
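As a rough illustration of that distillation idea, here is a generic knowledge-distillation sketch in PyTorch (the function name, the MSE form, and `alpha` are assumptions, not the paper's exact loss): the pretrained teacher's stylized output serves as a supervised regression target for the lightweight student.

```python
import torch.nn.functional as F

def distill_style_loss(student_out, teacher_out, perceptual_loss, alpha=0.5):
    """The frozen teacher (original ITNet) provides pixel-level targets,
    turning the unsupervised style objective into supervised regression."""
    distill = F.mse_loss(student_out, teacher_out.detach())
    return alpha * distill + (1.0 - alpha) * perceptual_loss
```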
- A Computationally Efficient Neural Video Compression Accelerator Based on a Sparse CNN-Transformer Hybrid
  S. Zhang, W. Mao, H. Shi, Z. Wang
  Design, Automation and Test in Europe Conference (DATE 2024)
- Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration
  Z. Yu, Z. Wang, Y. Fu, H. Shi, K. Shaikh, Y. Lin
  International Conference on Machine Learning (ICML 2024)
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
  H. You, Y. Guo, Y. Fu, W. Zhou, H. Shi, X. Zhang, S. Kundu, A. Yazdanbakhsh, Y. Lin
  Thirty-eighth Conference on Neural Information Processing Systems (NeurIPS 2024)
- ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design
  H. You, Z. Sun, H. Shi, Z. Yu, Y. Zhao, Y. Zhang, C. Li, B. Li, Y. Lin
  International Symposium on High-Performance Computer Architecture (HPCA 2023)
- Instant-3D: Instant Neural Radiance Field Training Towards On-Device AR/VR 3D Reconstruction
  S. Li, C. Li, W. Zhu, C. Wan, H. You, H. Shi, Y. Lin
  International Symposium on Computer Architecture (ISCA 2023)
- S$^2$R: Exploring a Double-Win Transformer-Based Framework for Ideal and Blind Super-Resolution
  M. She, W. Mao, H. Shi, Z. Wang
  International Conference on Artificial Neural Networks (ICANN 2023)
- ShiftAddNAS: Hardware-Inspired Search for More Accurate and Efficient Neural Networks
  H. You, B. Li, H. Shi, Y. Lin
  International Conference on Machine Learning (ICML 2022)
- Intelligent Typography: Artistic Text Style Transfer for Complex Texture and Structure
  W. Mao, S. Yang, H. Shi, J. Liu, Z. Wang
  IEEE Transactions on Multimedia (TMM 2022)
- Max-Affine Spline Insights Into Deep Network Pruning
  H. You, R. Balestriero, Z. Lu, Y. Kou, H. Shi, S. Zhang, S. Wu, Y. Lin
  Transactions on Machine Learning Research (TMLR 2022)
🎖️ Honors and Awards
- 2024.09 The First-Class Academic Scholarship for Postgraduate Students at Nanjing University.
- 2023.09 The First-Class Academic Scholarship for Postgraduate Students at Nanjing University.
- 2022.09 The First-Class Academic Scholarship for Postgraduate Students at Nanjing University.
- 2021.09 The First-Class Academic Scholarship for Postgraduate Students at Nanjing University.
- 2020.09 President’s Special Scholarship for Doctoral Candidate of Nanjing University.
- 2019.10 Post and Telecommunications Alumni Scholarship for Undergraduates of Jilin University.
- 2019.09 The First-Class Academic Scholarship for Undergraduates of Jilin University.
- 2018.09 The First-Class Academic Scholarship for Undergraduates of Jilin University.
- 2017.09 National Scholarship Award for Undergraduates, issued by the Ministry of Education of China.
📝 Review Experience
- [Conference] I served as a reviewer for ICLR 2025 and NeurIPS 2025.
- [Journal] I served as a reviewer for IEEE Transactions on Neural Networks and Learning Systems (TNNLS) and IEEE Transactions on Circuits and Systems (TCAS).