Hello, welcome to my page! 🎉🎉

My name is Huihong Shi, and I am currently a 5th-year Ph.D. student in the School of Electronic Science and Engineering at Nanjing University, under the supervision of Prof. Zhongfeng Wang (IEEE Fellow). From April 2021 to December 2022, I was a visiting student at the EIC Lab at Rice University, supervised remotely by Prof. Yingyan Lin. I also visited the EIC Lab at the Georgia Institute of Technology in person from December 2023 to May 2024.

My research focuses on holistic and efficient machine learning systems, realized through algorithm and hardware co-design.

🔥🔥 I am actively seeking Postdoc opportunities starting in Fall 2025! My CV can be viewed and downloaded here!

📖 Education

  • 2020.09 - now, Nanjing University, School of Electronic Science and Engineering
    Ph.D. student supervised by Prof. Zhongfeng Wang (IEEE Fellow)

  • 2023.12 - 2024.05, Georgia Institute of Technology, School of Computer Science
    Visiting student supervised by Prof. Yingyan (Celine) Lin

  • 2021.03 - 2022.12, Rice University (Remote), School of Electrical and Computer Engineering
    Visiting student supervised by Prof. Yingyan (Celine) Lin

  • 2016.09 - 2020.06, Jilin University, School of Communication Engineering, Bachelor's degree (Top 2%)

📝 Publications

TCAS-I 2024

Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer.

Huihong Shi, Haikuo Shao, Wendong Mao, Zhongfeng Wang

[Journal] IEEE Transactions on Circuits and Systems I (TCAS-I 2024)

[Abstract] We propose Trio-ViT to marry the hardware efficiency of quantization with that of efficient Vision Transformer (ViT) architectures, which (1) eliminate the troublesome Softmax and (2) integrate linear attention with low computational complexity.

ISCAS 2024

An FPGA-Based Reconfigurable Accelerator for Convolution-Transformer Hybrid EfficientViT.

Haikuo Shao, Huihong Shi, Wendong Mao, Zhongfeng Wang

[Conference] IEEE International Symposium on Circuits and Systems (ISCAS 2024)

[Abstract] We propose an FPGA-based accelerator for EfficientViT to advance the hardware efficiency frontier of ViTs. Specifically, we design a reconfigurable architecture to efficiently support various operation types, including lightweight convolutions and attention, boosting hardware utilization. Additionally, we present a time-multiplexed and pipelined dataflow to facilitate both intra- and inter-layer fusions, reducing off-chip data access costs.

TCAS-I 2024

NASH: Neural Architecture and Accelerator Search for Multiplication-Reduced Hybrid Models.

Yang Xu*, Huihong Shi*, Zhongfeng Wang (*Co-first Authors)

[Journal] IEEE Transactions on Circuits and Systems I (TCAS-I 2024)

[Abstract] We propose a Neural Architecture and Accelerator Co-Search framework (NASH) for multiplication-reduced hybrid models. We introduce a tailored zero-shot metric for architecture search to pre-identify promising models, enhancing efficiency and reducing gradient conflicts. For accelerator search, we use a coarse-to-fine approach to streamline the process. By integrating both searches, NASH achieves optimal model and accelerator pairing.

TVLSI 2024

P$^2$-ViT: Power-of-Two Post-Training Quantization and Acceleration for Fully Quantized Vision Transformer.

Huihong Shi, Xin Cheng, Wendong Mao, Zhongfeng Wang

[Journal] IEEE Transactions on Very Large Scale Integration (VLSI) Systems (TVLSI 2024)

[Abstract] We propose a Power-of-Two (PoT) post-training quantization and acceleration framework (P$^2$-ViT) for Vision Transformers (ViTs). We first introduce a tailored quantization scheme with PoT scaling factors to minimize re-quantization overhead, and then develop a dedicated accelerator to enhance throughput.
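
A rough illustration of why PoT scaling factors cut re-quantization cost (a minimal sketch; the function name and toy values are my own, not from the paper): when both scales are powers of two, rescaling a quantized integer is a single bit shift rather than a fixed-point multiply-and-round.

```python
# Minimal sketch: re-quantization with power-of-two (PoT) scales.
# All names and values here are illustrative, not from P^2-ViT.

def requantize_pot(x_int: int, exp_in: int, exp_out: int) -> int:
    """Rescale a quantized integer from scale 2**exp_in to 2**exp_out.

    Because both scales are powers of two, their ratio is itself a
    power of two, so re-quantization reduces to a pure bit shift.
    """
    shift = exp_in - exp_out
    return x_int << shift if shift >= 0 else x_int >> (-shift)

# Toy example: 3.0 quantized at scale 2**-4 (x_int = 48), rescaled to
# scale 2**-2 -> x_int becomes 12, still representing 3.0.
assert requantize_pot(48, -4, -2) == 12
```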

TCAS-I 2024

NASA-F: FPGA-Oriented Search and Acceleration for Multiplication-Reduced Hybrid Networks.

Huihong Shi, Yang Xu, Wendong Mao, Zhongfeng Wang

[Journal] IEEE Transactions on Circuits and Systems I (TCAS-I 2024)

[Abstract] We propose an FPGA-oriented search and acceleration framework called NASA-F for multiplication-reduced hybrid models. Specifically, we fully leverage the diverse hardware resources (DSPs and LUTs) available on FPGAs to accelerate heterogeneous layers in hybrid models, aiming to enhance both hardware utilization and throughput.

NeurIPS 2023

ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer.

Haoran You*, Huihong Shi*, Yipin Guo*, Zhongfeng Wang (*Co-first Authors)

[Conference] Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023)

[Abstract] We reparameterize pre-trained Vision Transformers (ViTs) with a mixture of multiplication primitives, such as bitwise shifts and additions, to obtain ShiftAddViT. This approach aims to achieve end-to-end inference speedups on GPUs without requiring training from scratch.
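
A toy illustration of the shift-and-add idea behind this reparameterization (a minimal sketch of my own, not code from ShiftAddViT): any non-negative integer multiply can be rewritten as a sum of bit shifts selected by the multiplier's binary digits.

```python
# Toy sketch (mine, not ShiftAddViT code): an integer multiply expressed
# purely as bit shifts and additions, the primitives that shift/add
# reparameterization builds on. Assumes w >= 0.

def shift_add_mul(x: int, w: int) -> int:
    """Compute x * w using only left shifts and additions."""
    acc = 0
    bit = 0
    while w:
        if w & 1:            # if this binary digit of w is set...
            acc += x << bit  # ...add x shifted to the digit's position
        w >>= 1
        bit += 1
    return acc

assert shift_add_mul(23, 6) == 23 * 6  # 6 = 0b110 -> (23 << 1) + (23 << 2)
```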

HPCA 2023

ViTALiTy: Unifying Low-rank and Sparse Approximation for Vision Transformer Acceleration with a Linear Taylor Attention.

Jyotikrishna Dass*, Shang Wu*, Huihong Shi*, Chaojian Li, Zhifan Ye, Zhongfeng Wang, Yingyan Lin (*Co-first Authors)

[Conference] International Symposium on High-Performance Computer Architecture (HPCA 2023)

[Abstract] We propose ViTALiTy to boost the inference efficiency of Vision Transformers (ViTs) via algorithm-hardware co-design. We first approximate the vanilla softmax attention with first-order Taylor attention to achieve linear complexity, and unify low-rank and sparse components to enhance accuracy. We further develop a dedicated accelerator that leverages the linearized workload to improve hardware efficiency.
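
A worked sketch of the linear-attention idea (my notation, not the paper's exact formulation, which also covers how the low-rank and sparse parts combine): expanding $e^{x}\approx 1+x$ lets the key-value product be computed first, so the cost becomes linear in the token count $n$.

```latex
% Sketch in my notation, not ViTALiTy's exact formulation.
% Vanilla softmax attention is quadratic in the token count $n$:
\mathrm{Attn}(Q,K,V) = \mathrm{softmax}\!\Big(\frac{QK^{\top}}{\sqrt{d}}\Big)\,V
\qquad \text{cost } O(n^{2}d).
% First-order Taylor expansion $e^{x}\approx 1+x$ linearizes the numerator:
\exp(QK^{\top})\,V \;\approx\; (\mathbf{1}\mathbf{1}^{\top} + QK^{\top})\,V
\;=\; \mathbf{1}(\mathbf{1}^{\top}V) + Q\,(K^{\top}V),
% where $K^{\top}V$ is only $d \times d$, so computing it first costs
% $O(nd^{2})$, i.e., linear in $n$ (the denominator is handled analogously).
```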

TCAS-I 2023

NASA$+$: Neural Architecture Search and Acceleration for Multiplication-Reduced Hybrid Networks.

Huihong Shi, Haoran You, Zhongfeng Wang, Yingyan Lin

[Journal] IEEE Transactions on Circuits and Systems I (TCAS-I 2023)

[Abstract] Inspired by the fact that multiplications can be mathematically decomposed into bit-wise shifts and additions, we design reconfigurable processing elements (PEs) to simultaneously support multiplication-based convolutions as well as shift and adder layers, thus enhancing the flexibility of our accelerator.

ICCAD 2022

NASA: Neural Architecture Search and Acceleration for Hardware Inspired Hybrid Networks.

Huihong Shi, Haoran You, Yang Zhao, Zhongfeng Wang, Yingyan Lin

[Conference] International Conference on Computer-Aided Design (ICCAD 2022)

[Abstract] We propose a Neural Architecture Search and Acceleration framework (NASA) to enable automated search and acceleration of multiplication-reduced models, aiming to marry the powerful performance of multiplication-based models and the hardware efficiency of multiplication-free models.

IJCNN 2021

LITNet: A Light-weight Image Transform Net for Image Style Transfer.

Huihong Shi, Wendong Mao, Zhongfeng Wang

[Conference] International Joint Conference on Neural Networks (IJCNN 2021)

[Abstract] We propose a compression algorithm for an influential CNN-based style transfer network, Image Transform Net (ITNet), resulting in a lightweight model called Light-weight Image Transform Net (LITNet). Additionally, we introduce a novel distillation loss that converts unsupervised learning into supervised learning, aiming to enhance generation quality.
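
One way to read the "unsupervised to supervised" remark (a generic distillation sketch under my own assumptions, not LITNet's actual loss): the uncompressed teacher's stylized output serves as a pseudo label that the lightweight student regresses onto.

```python
# Generic distillation sketch (my assumptions, not LITNet's actual loss):
# the pre-trained teacher's stylized output acts as pseudo ground truth,
# turning the originally unsupervised objective into a supervised one.
import torch
import torch.nn.functional as F

def distillation_loss(student_out: torch.Tensor,
                      teacher_out: torch.Tensor) -> torch.Tensor:
    """Pixel-wise regression of the student output onto the teacher's."""
    return F.mse_loss(student_out, teacher_out.detach())
```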

🎖️ Honors and Awards

  • 2024.09 The First-Class Academic Scholarship for Postgraduate Students at Nanjing University.
  • 2023.09 The First-Class Academic Scholarship for Postgraduate Students at Nanjing University.
  • 2022.09 The First-Class Academic Scholarship for Postgraduate Students at Nanjing University.
  • 2021.09 The First-Class Academic Scholarship for Postgraduate Students at Nanjing University.
  • 2020.09 President’s Special Scholarship for Doctoral Candidates of Nanjing University.
  • 2019.10 Post and Telecommunications Alumni Scholarship for Undergraduates of Jilin University.
  • 2019.09 The First-Class Academic Scholarship for Undergraduates of Jilin University.
  • 2018.09 The First-Class Academic Scholarship for Undergraduates of Jilin University.
  • 2017.09 National Scholarship for Undergraduates, issued by the Ministry of Education of China.

📝 Review Experience

  • [Conference] I served as a reviewer for ICLR 2025 and NeurIPS 2025.
  • [Journal] I served as a reviewer for IEEE Transactions on Neural Networks and Learning Systems (TNNLS) and IEEE Transactions on Circuits and Systems (TCAS).