Efficient Methods and Hardware for Deep Learning
2017-09-14

  Time: 10:00–12:00, Thursday, September 14, 2017

  Venue: Room 446, Institute of Computing Technology

  Speaker: Song Han, Stanford University

  Abstract:

  Deep learning has spawned a wide range of AI applications that are changing our lives. However, deep neural networks are both computation- and memory-intensive, so they are power-hungry when deployed on embedded systems or in data centers with a limited power budget. To address this problem, I will present an algorithm and hardware co-design methodology for improving the efficiency of deep learning.

  I will first introduce "Deep Compression", which can compress deep neural network models by 18–49× without loss of prediction accuracy across a broad range of CNNs, RNNs, and LSTMs. The compression reduces both computation and storage. Next, I will introduce EIE, the Efficient Inference Engine, a hardware architecture that implements Deep Compression efficiently and can perform decompression and inference simultaneously, saving a significant amount of memory bandwidth. By taking advantage of the compressed model and handling its irregular computation pattern efficiently, EIE achieves a 13× speedup and 3,000× better energy efficiency than a GPU. Finally, I will revisit the inefficiencies in current learning algorithms, present DSD (dense-sparse-dense) training, and discuss the challenges and future work in efficient deep learning.
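  For context: the first stage of Deep Compression is magnitude-based weight pruning, in which the smallest-magnitude weights are zeroed out and the network is then retrained. Below is a minimal NumPy sketch of that pruning step only, assuming an illustrative 90% sparsity target; the function name and the sparsity value are hypothetical, not figures or code from the talk.

    # Minimal sketch of magnitude-based weight pruning, the first stage of
    # Deep Compression. The 90% sparsity level is an arbitrary example value,
    # not a number from the talk.
    import numpy as np

    def prune_by_magnitude(weights: np.ndarray, sparsity: float = 0.9):
        """Zero out the smallest-magnitude weights, keeping ~(1 - sparsity) of them."""
        threshold = np.quantile(np.abs(weights), sparsity)
        mask = np.abs(weights) > threshold
        return weights * mask, mask

    # Example: prune a random fully connected layer and check what survives.
    w = np.random.randn(256, 256).astype(np.float32)
    w_pruned, mask = prune_by_magnitude(w, sparsity=0.9)
    print(f"fraction of weights kept: {mask.mean():.1%}")  # roughly 10%

  In the published method, pruning is iterated with retraining and then followed by trained quantization with weight sharing and Huffman coding, which together yield the compression ratios quoted above.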

  At the end of the seminar, I will introduce Ph.D. recruitment opportunities in artificial intelligence and computer architecture at MIT.

  Speaker Bio:

  Song Han graduated from Stanford University, where he was advised by Prof. Bill Dally. He will join MIT EECS as an assistant professor in July 2018. His research focuses on energy-efficient deep learning, at the intersection of machine learning and computer architecture. He proposed the Deep Compression algorithm, which can compress neural networks by 18–49× while fully preserving prediction accuracy, and he designed the first hardware accelerator that can perform inference directly on a compressed sparse model, yielding significant speedups and energy savings. His work has been featured by O’Reilly, TechEmergence, TheNextPlatform, and Embedded Vision, and has influenced industry practice. He led research efforts in model compression and hardware acceleration that won the Best Paper Award at ICLR’16 and the Best Paper Award at FPGA’17. Before joining Stanford, Song graduated from Tsinghua University.

 