2024
TBA: Faster Large Language Model Training Using SSD-Based Activation Offloading [PDF]
Kun Wu*, Jeongmin Brian Park*, Xiaofan Zhang*, Mert Hidayetoğlu, Vikram Sharma Mailthody, Sitao Huang, Steven Sam Lumetta, Wen-mei Hwu (*equal contributors)
arXiv:2408.10013, 2024
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization [PDF]
Haoran You, Yipin Guo, Yichao Fu, Wei Zhou, Huihong Shi, Xiaofan Zhang, Souvik Kundu, Amir Yazdanbakhsh, Yingyan Lin
38th Annual Conference on Neural Information Processing Systems (NeurIPS), Vancouver, Canada, Dec. 2024
New Solutions on LLM Acceleration, Optimization, and Application [PDF]
[Invited] Yingbing Huang, Jiaxin Wan, Hanchen Ye, Manvi Jha, Jinghua Wang, Yuhong Li, Xiaofan Zhang, Deming Chen
61st Design Automation Conference (DAC), San Francisco, CA, June 2024
AutoAI2C: An Automated Hardware Generator for DNN Acceleration on both FPGA and ASIC [PDF]
Yongan Zhang, Xiaofan Zhang, Pengfei Xu, Yang Zhao, Cong Hao, Yue Wang, Chaojian Li, Deming Chen, Yingyan Lin
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2024
Software/Hardware Co-design for LLM and Its Application for Design Verification [PDF]
Jiaxin Wan, Yingbing Huang, Yuhong Li, Hanchen Ye, Jinghua Wang, Xiaofan Zhang, Deming Chen
29th Asia and South Pacific Design Automation Conference (ASP-DAC), Jan. 2024
HomeSGN: A Smarter Home with Novel Rule Mining Enabled by a Scorer-Generator GAN
Zehua Yuan, Junhao Pan, Xiaofan Zhang, Deming Chen
29th Asia and South Pacific Design Automation Conference (ASP-DAC), Jan. 2024
2023
Compilation and Optimizations for Efficient Machine Learning on Embedded Systems [PDF]
Xiaofan Zhang, Yao Chen, Cong Hao, Sitao Huang, Yuhong Li, Deming Chen
Book chapter in Embedded Machine Learning for Cyber-Physical, IoT, and Edge Computing: Software Optimizations and Hardware/Software Co-design, Springer Nature
Augmenting Hessians with Inter-Layer Dependencies for Mixed-Precision Post-Training Quantization [PDF]
Clemens Schaefer, Navid Lambert-Shirzad, Xiaofan Zhang, Chiachen Chou, Tom Jablin, Jian Li, Elfie Guo, Caitlin Stanton, Siddharth Joshi, Yu Emma Wang
arXiv preprint arXiv:2306.04879, 2023
EH-DNAS: End-to-End Hardware-aware Differentiable Neural Architecture Search [PDF]
Qian Jiang*, Xiaofan Zhang*, Deming Chen, Minh N. Do, Raymond A. Yeh (*equal contributors)
40th International Conference on Machine Learning (ICML) Workshop on Differentiable Almost Everything, July 2023
2022
YouHome System and Dataset: Making Your Home Know You Better
[Invited] Junhao Pan, Zehua Yuan, Xiaofan Zhang, Deming Chen
IEEE International Symposium on Smart Electronic Systems (iSES), Dec. 2022
Algorithm/Accelerator Co-Design and Co-Search for Edge AI [PDF]
Xiaofan Zhang, Yuhong Li, Junhao Pan, Deming Chen
IEEE Transactions on Circuits and Systems II, 2022
Exploring HW/SW Co-Design for Video Analysis on CPU-FPGA Heterogeneous Systems [PDF]
Xiaofan Zhang, Yuan Ma, Jinjun Xiong, Wen-mei Hwu, Volodymyr Kindratenko, Deming Chen
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 41, pp. 1606-1619, 2022
AutoDistill: an End-to-End Framework to Explore and Distill Hardware-Efficient Language Models [PDF]
Xiaofan Zhang, Zongwei Zhou, Deming Chen, Yu Emma Wang
arXiv preprint: 2201.08539, 2022
2021
F-CAD: A Framework to Explore Hardware Accelerators for Codec Avatar Decoding [PDF]
Xiaofan Zhang, Dawei Wang, Pierce Chuang, Shugao Ma, Deming Chen, Yuecheng Li
58th Design Automation Conference (DAC), San Francisco, CA, Dec. 2021
Scaling Up Hardware Accelerator Verification using A-QED with Functional Decomposition [PDF]
Saranyu Chattopadhyay, Florian Lonsing, Luca Piccolboni, Deepraj Soni, Peng Wei, Xiaofan Zhang, Yuan Zhou, Luca Carloni, Deming Chen, Jason Cong, Ramesh Karri, Zhiru Zhang, Caroline Trippel, Clark Barrett, Subhasish Mitra
21st Formal Methods in Computer-Aided Design (FMCAD), New Haven, CT, Oct. 2021
Exploring HW/SW Co-Optimizations for Accelerating Large-scale Texture Identification on Distributed GPUs [PDF]
Junsong Wang, Xiaofan Zhang, Yubo Li, Yonghua Lin
50th International Conference on Parallel Processing (ICPP), Lemont, IL, Aug. 2021 (Virtual)
Efficient Methods for Mapping Neural Machine Translator on FPGAs [PDF]
Qin Li*, Xiaofan Zhang*, Jinjun Xiong, Wen-mei Hwu, Deming Chen (*equal contributors)
IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol. 32, pp. 1866-1877, 2021
Being-ahead: Benchmarking and Exploring Accelerators for Hardware-Efficient AI Deployment [PDF]
Xiaofan Zhang, Hanchen Ye, Deming Chen
Conference on Machine Learning and Systems (MLSys) workshop on MLBench, Apr. 2021 (Virtual)
2020
DNNExplorer: A Framework for Modeling and Exploring a Novel Paradigm of FPGA-based DNN Accelerator [PDF]
Xiaofan Zhang*, Hanchen Ye*, Junsong Wang, Yonghua Lin, Jinjun Xiong, Wen-mei Hwu, Deming Chen (*equal contributors)
39th International Conference on Computer Aided Design (ICCAD), San Diego, CA, Nov. 2020 (Virtual)
Effective Algorithm-Accelerator Co-design for AI Solutions on Edge Devices [PDF]
[Invited] Cong Hao, Yao Chen, Xiaofan Zhang, Yuhong Li, Jinjun Xiong, Wen-mei Hwu, Deming Chen
30th ACM Great Lakes Symposium on VLSI (GLSVLSI), Sep. 2020 (Virtual)
A-QED Verification of Hardware Accelerators [PDF]
Eshan Singh, Florian Lonsing, Saranyu Chattopadhyay, Max Strange, Peng Wei, Xiaofan Zhang, Yuan Zhou, Jason Cong, Deming Chen, Zhiru Zhang, Priyanka Raina, Clark Barrett, Subhasish Mitra
57th Design Automation Conference (DAC), San Francisco, CA, July 2020 (Virtual)
EDD: Efficient Differentiable DNN Architecture and Implementation Co-search for Embedded AI Solutions [PDF]
Yuhong Li, Cong Hao, Xiaofan Zhang, Xinheng Liu, Yao Chen, Jinjun Xiong, Wen-mei Hwu, Deming Chen
57th Design Automation Conference (DAC), San Francisco, CA, July 2020 (Virtual)
HybridDNN: A Framework for High-Performance Hybrid DNN Accelerator Design and Implementation [PDF]
Hanchen Ye, Xiaofan Zhang, Zhize Huang, Gengsheng Chen, Deming Chen
57th Design Automation Conference (DAC), San Francisco, CA, July 2020 (Virtual)
SkyNet: a Hardware-Efficient Method for Object Detection and Tracking on Embedded Systems [PDF]
🏆 DAC'19 System Design Contest Champion Design
Xiaofan Zhang, Haoming Lu, Cong Hao, Jiachen Li, Bowen Cheng, Yuhong Li, Kyle Rupnow, Jinjun Xiong, Thomas Huang, Honghui Shi, Wen-mei Hwu, Deming Chen
Conference on Machine Learning and Systems (MLSys), Mar. 2020
AutoDNNchip: An Automated DNN Chip Predictor and Builder for Both FPGAs and ASICs [PDF]
Pengfei Xu, Xiaofan Zhang, Cong Hao, Yang Zhao, Yongan Zhang, Yue Wang, Chaojian Li, Zetong Guan, Deming Chen, Yingyan Lin
28th International Symposium on Field-Programmable Gate Arrays (FPGA), Seaside, CA, Feb. 2020
2019
T-DLA: An Open-source Deep Learning Accelerator for Ternarized DNN Models on Embedded FPGA [PDF]
Yao Chen, Kai Zhang, Cheng Gong, Cong Hao, Xiaofan Zhang, Tao Li, Deming Chen
IEEE Computer Society Annual Symposium on VLSI (ISVLSI), Miami, FL, July 2019
uL2Q: An Ultra-Low Loss Quantization Method for DNN Compression
Cheng Gong, Tao Li, Ye Lu, Cong Hao, Xiaofan Zhang, Deming Chen, Yao Chen
The International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, July 2019
SkyNet: A Champion Model for DAC-SDC on Low Power Object Detection [PDF]
Xiaofan Zhang, Cong Hao, Haoming Lu, Jiachen Li, Yuhong Li, Yuchen Fan, Kyle Rupnow, Jinjun Xiong, Thomas Huang, Honghui Shi, Wen-mei Hwu, Deming Chen
(Technical report) arXiv preprint: 1906.10327, June 2019
A Bi-Directional Co-Design Approach to Enable Deep Learning on IoT Devices [PDF]
🏆 Best Poster Award
Xiaofan Zhang, Cong Hao, Yuhong Li, Yao Chen, Jinjun Xiong, Wen-mei Hwu, Deming Chen
36th International Conference on Machine Learning (ICML) Workshop on ODML-CDNNR, Long Beach, CA, June 2019
FPGA/DNN Co-Design: An Efficient Design Methodology for IoT Intelligence on the Edge [PDF]
Cong Hao*, Xiaofan Zhang*, Yuhong Li, Sitao Huang, Jinjun Xiong, Kyle Rupnow, Wen-mei Hwu, Deming Chen (*equal contributors)
56th Design Automation Conference (DAC), Las Vegas, NV, June 2019
Cloud-DNN: An Open Framework for Mapping DNN Models to Cloud FPGAs [PDF]
Yao Chen, Jiong He, Xiaofan Zhang, Cong Hao, Deming Chen
27th International Symposium on Field-Programmable Gate Arrays (FPGA), Seaside, CA, Feb. 2019
SiamVGG: Visual Tracking using Deeper Siamese Networks [PDF]
Yuhong Li, Xiaofan Zhang
arXiv preprint: 1902.02804, Feb. 2019
Implementing Neural Machine Translation with Bi-Directional GRU and Attention Mechanism on FPGAs Using HLS [PDF]
Qin Li*, Xiaofan Zhang*, Jinjun Xiong, Wen-mei Hwu, Deming Chen (*equal contributors)
24th Asia and South Pacific Design Automation Conference (ASP-DAC), Tokyo, Japan, Jan. 2019
2018
DNNBuilder: an Automated Tool for Building High-Performance DNN Hardware Accelerators for FPGAs [PDF]
🏆 Best Paper Award
Xiaofan Zhang, Junsong Wang, Chao Zhu, Yonghua Lin, Jinjun Xiong, Wen-mei Hwu, Deming Chen
37th International Conference on Computer Aided Design (ICCAD), San Diego, CA, Nov. 2018
Design Flow of Accelerating Hybrid Extremely Low Bit-width Neural Network in Embedded FPGA [PDF]
Junsong Wang, Qiuwen Lou, Xiaofan Zhang, Chao Zhu, Yonghua Lin, Deming Chen
28th International Conference on Field-Programmable Logic and Applications (FPL), Dublin, Ireland, Aug. 2018
CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes [PDF]
Yuhong Li, Xiaofan Zhang, Deming Chen
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, June 2018
Face Recognition with Hybrid Efficient Convolution Algorithms on FPGAs [PDF]
Chuanhao Zhuge, Xinheng Liu, Xiaofan Zhang, Sudeep Gummadi, Jinjun Xiong, Deming Chen
28th ACM Great Lakes Symposium on VLSI (GLSVLSI), Chicago, IL, May 2018
AccDNN: an IP-based DNN Generator for FPGAs
Xiaofan Zhang, Junsong Wang, Chao Zhu, Yonghua Lin, Jinjun Xiong, Wen-mei Hwu, Deming Chen
26th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), Boulder, CO, April 2018
2017
An Energy Efficient Approach for C4.5 Algorithm using OpenCL Design Flow [PDF]
Hai Peng, Xiaofan Zhang, Letian Huang
16th International Conference on Field-Programmable Technology (FPT), Melbourne, Australia, December 2017
Machine Learning on FPGAs to Face the IoT Revolution [PDF]
[Invited] Xiaofan Zhang*, Anand Ramachandran*, Chuanhao Zhuge*, Di He, Wei Zuo, Zuofu Cheng, Kyle Rupnow, Deming Chen (*equal contributors)
36th International Conference on Computer Aided Design (ICCAD), Irvine, CA, November 2017
High-Performance Video Content Recognition with Long-term Recurrent Convolutional Network for FPGA [PDF]
Xiaofan Zhang, Xinheng Liu, Anand Ramachandran, Chuanhao Zhuge, Shibin Tang, Peng Ouyang, Zuofu Cheng, Kyle Rupnow, Deming Chen
27th International Conference on Field-Programmable Logic and Applications (FPL), Ghent, Belgium, September 2017
2016
Tolerating transient illegal turn faults in NoCs [PDF]
Letian Huang, Xiaofan Zhang, Masoumeh Ebrahimi, Guangjun Li
Microprocessors and Microsystems, Vol. 43, pp. 104-115, 2016
Non-Blocking Testing for Network-on-Chip [PDF]
Letian Huang, Junshi Wang, Masoumeh Ebrahimi, Masoud Daneshtalab, Xiaofan Zhang, Guangjun Li, Axel Jantsch
IEEE Transactions on Computers, Vol. 65, pp. 679-692, 2016
2015
Fault-Resilient Routing Unit in NoCs [PDF]
Xiaofan Zhang, Masoumeh Ebrahimi, Letian Huang, Guangjun Li
28th IEEE International System-on-Chip Conference (SOCC), Beijing, China, September 2015
A Network-Level Solution for Fault Detection, Masking, and Tolerance in NoCs [PDF]
Xiaofan Zhang, Masoumeh Ebrahimi, Letian Huang, Guangjun Li, Axel Jantsch
23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP), Turku, Finland, March 2015