Hang Lu (路航)  Associate Professor

Research Interests: Computer Systems Architecture

Department: Key Laboratory of Processor Chips

Supervisor Category: Master's supervisor

Contact: luhang@ict.ac.cn

Homepage: https://luhang-hpu.github.io/

Biography:

Work Experience

2019/09 – present: Associate Professor, State Key Laboratory of Computer Architecture, Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS); Master's supervisor; member of the CAS Youth Innovation Promotion Association; ICT "New Hundred Stars" program.

2015/07 – 2019/09: Assistant Professor, State Key Laboratory of Computer Architecture, ICT, CAS.

Education

2011/09 – 2015/06: Ph.D., State Key Laboratory of Computer Architecture, ICT, CAS; advisor: Xiaowei Li.

2008/09 – 2011/03: M.S., Electronic and Information Engineering, Beihang University.

2004/09 – 2008/06: B.S., Electronic and Information Engineering, Beihang University.

Publications:

Selected Papers (* corresponding author)

"General Purpose Deep Learning Accelerator Based on Bit Interleaving", Liang Chang, Hang Lu (路航), Chenglong Li, Xin Zhao, Zhicheng Hu, Jun Zhou, Xiaowei Li, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD, CCF A), Early Access.

"Mortar-FP8: Morphing the Existing FP32 Infrastructure for High Performance Deep Learning Acceleration", Hongyan Li, Hang Lu* (路航), Xiaowei Li, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD, CCF A), Early Access.

"Poseidon-NDP: Practical Fully Homomorphic Encryption Accelerator Based on Near Data Processing Architecture", Yinghao Yang, Hang Lu* (路航), Xiaowei Li, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD, CCF A), 2023.

"Poseidon: Practical Homomorphic Encryption Accelerator", Yinghao Yang, Huaizhi Zhang, Shengyu Fan, Hang Lu* (路航), Mingzhe Zhang, Xiaowei Li, IEEE 29th International Symposium on High-Performance Computer Architecture (HPCA, CCF A), 2023.

"BitXpro: Regularity-aware Hardware Runtime Pruning for Deep Neural Networks", Hongyan Li, Hang Lu* (路航), Haoxuan Wang, Shengji Deng, Xiaowei Li, IEEE Transactions on Very Large Scale Integration Systems (TVLSI, CCF B), 2023.

"Mortar: Morphing the Bit Level Sparsity for General Purpose Deep Learning Acceleration", Yunhung Gao, Hongyan Li, Kevin Zhang, Xueru Yu, Hang Lu* (路航), ACM 28th Asia and South Pacific Design Automation Conference (ASPDAC, CCF C), 2023.

"Distilling Bit-level Sparsity Parallelism for General Purpose Deep Learning Acceleration", Hang Lu (路航), Liang Chang, Chenglong Li, Zixuan Zhu, Shengjian Lu, Yanhuan Liu, Mingzhe Zhang, IEEE/ACM 54th International Symposium on Microarchitecture (MICRO, CCF A), 2021.

"Streamline Ring ORAM Accesses through Spatial and Temporal Optimization", Dingyuan Cao, Mingzhe Zhang, Hang Lu (路航), Xiaochun Ye, Dongrui Fan, Yuezhi Che, Rujia Wang, IEEE 27th International Symposium on High-Performance Computer Architecture (HPCA, CCF A), 2021.

"BitX: Empower Versatile Inference for Hardware Runtime Pruning", Hongyan Li, Hang Lu* (路航), Jiawen Huang, Wenxu Wang, Mingzhe Zhang, Wei Chen, Liang Chang, Xiaowei Li, IEEE/ACM 50th International Conference on Parallel Processing (ICPP, CCF B), 2021.

"Chaotic Weights: A Novel Approach to Protect Intellectual Property of Deep Neural Networks", Ning Lin, Xiaoming Chen, Hang Lu (路航), Xiaowei Li, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD, CCF A), 2021.

"Architecting Effectual Computation for Machine Learning Accelerators", Hang Lu* (路航), Mingzhe Zhang, Yinhe Han, Qi Wang, Huawei Li, Xiaowei Li, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD, CCF A), 2020.

"ShuttleNoC: Power-Adaptable Communication Infrastructure for Many-core Processors", Hang Lu* (路航), Yisong Chang, Ning Lin, Xin Wei, Guihai Yan, Xiaowei Li, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD, CCF A), 2019.

"HeadStart: Enforce Optimal Inceptions in Pruning Deep Convolutional Neural Networks for Efficient Inference on GPGPUs", Ning Lin, Hang Lu* (路航), Xin Wei, Xiaowei Li, IEEE/ACM 56th International Design Automation Conference (DAC, CCF A), 2019.

"When Deep Learning Meets the Edge: Auto-Masking Deep Neural Networks for Efficient Machine Learning on Edge Devices", Ning Lin, Hang Lu* (路航), Xing Hu, Jingliang Gao, Xiaowei Li, IEEE 37th International Conference on Computer Design (ICCD, CCF B), 2019.

"VNet: A Versatile Deep Neural Network Model for Efficient Semantic Segmentation", Ning Lin, Hang Lu* (路航), Xiaowei Li, IEEE 37th International Conference on Computer Design (ICCD, CCF B), 2019.

"Tetris: Re-architecting Convolutional Neural Network Computation for Machine Learning Accelerators", Hang Lu* (路航), Xin Wei, Ning Lin, Guihai Yan, Xiaowei Li, IEEE/ACM 37th International Conference on Computer-Aided Design (ICCAD, CCF B), 2018.

"Redeeming Chip-level Power Efficiency by Collaborative Management of the Computation and Communication", Ning Lin, Hang Lu* (路航), Xin Wei, Xiaowei Li, ACM 24th Asia and South Pacific Design Automation Conference (ASPDAC, CCF C), 2018.

"PowerTrader: Enforcing Autonomous Power Management for Large-scale Many-core Processors", Hang Lu* (路航), Guihai Yan, Yinhe Han, Xiaowei Li, IEEE Transactions on Multi-scale Computing Systems (TMSCS), 2017.

"RISO: Enforce Non-interfered Performance with Relaxed Network-on-Chip Isolation in Manycore Cloud Processors", Hang Lu* (路航), Binzhang Fu, Ying Wang, Yinhe Han, Guihai Yan, Xiaowei Li, IEEE Transactions on Very Large Scale Integration Systems (TVLSI, CCF B), 2015.

"ShuttleNoC: Boosting On-chip Communication Efficiency Through Localized Power Adaptation", Hang Lu* (路航), Guihai Yan, Yinhe Han, Ying Wang, Xiaowei Li, IEEE 20th Asia and South Pacific Design Automation Conference (ASPDAC, CCF C, Best Paper Award nomination), 2015.

"RISO: Relaxed Networks-on-Chip Isolation for Cloud Processors", Hang Lu* (路航), Guihai Yan, Yinhe Han, Binzhang Fu, Xiaowei Li, IEEE/ACM 50th International Design Automation Conference (DAC, CCF A), 2013.

Patents


"Allocating threads on a non-rectangular area on a NoC based on predicted traffic of a smallest rectangular area", international invention patent, granted in the United States (US9965335B2).

Inventors: Hang Lu, Binzhang Fu, Yinhe Han, Xiaowei Li

"Task allocation method, task allocation apparatus, and network-on-chip", international invention patent, granted in Japan (JP6094005B2), South Korea (KR101729596B1), and China (CN104156267B).

Inventors: Hang Lu, Binzhang Fu, Yinhe Han, Xiaowei Li

"A Weight-Kneading Computation Method and System for Convolutional Neural Networks", China, 201811214323.5.

Inventors: Xiaowei Li, Xin Wei, Hang Lu

"A Split Accumulator for Convolutional Neural Network Accelerators", China, 201811214639.4.

Inventors: Xiaowei Li, Xin Wei, Hang Lu

"Architecture Design of a Weight-Kneading Neural Network Accelerator", China, 201811214310.8.

Inventors: Xiaowei Li, Xin Wei, Hang Lu

Books and Translations


Multi-core Processor Design Optimization: Low Power, High Reliability, Easy Testability (in Chinese, 《多核处理器设计优化——低功耗、高可靠、易测试》), Science Press, 2021.

Xiaowei Li, Hang Lu, Huawei Li, Ying Wang, Guihai Yan

Customizable Computing (Chinese translation, 《可定制计算》), China Machine Press, 2018.

Translated by Guihai Yan, Jing Ye, Ying Wang, Hang Lu, Wenyan Lu, Jiajun Li, Jingya Wu; original authors: Yu-Ting Chen, Jason Cong, Michael Gill, Glenn Reinman, Bingjun Xiao


Research Projects:

(1) Fully Homomorphic Encryption (FHE) Processor (jointly funded by the CCF-Huawei, CCF-Ant Group, and CCF-Phytium funds).

Homomorphic encryption (HE), an emerging technique for secure computation, has attracted growing attention. First, HE lowers the data-security costs borne by computing providers. Because applications can operate directly on ciphertexts, the provider never needs to touch plaintext data, so even a data breach is unlikely to cause serious loss to users. This eases the provider's security burden and can significantly reduce a cloud vendor's operating costs for data protection. Second, HE can attract more users to migrate their data and computation to the cloud, lowering their computing costs: since ciphertexts are processed directly and never decrypted by the provider, the data-security risk created by separating the data owner from the computing provider is eliminated at the root. Despite its importance to privacy-preserving computation, HE's enormous computational overhead has blocked practical adoption; improving performance and energy efficiency while guaranteeing correct ciphertext results is a widely recognized open problem. This project builds FHE processors through hardware/software co-design and optimization to raise their computing performance (the first generation is named "Zhangjiang No. 1" (张江壹号), the second "Yangtze" (扬子江)). This project is in progress, and we have published 2 papers regarding it [HPCA'23][TCAD'23]. The commercial FHE accelerator card "Zhangjiang No. 1" has been deployed in 原语科技's privacy-computing appliance, and we have developed an FHE computing library for the "Zhangjiang No. 1" HPU; see the Poseidon online documentation.
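The key property the paragraph above relies on — computing on ciphertexts without ever decrypting them — can be seen in a minimal sketch. The toy below implements the classic Paillier scheme, which is only *additively* homomorphic and uses insecure toy parameters; real FHE schemes (such as those accelerated by Poseidon) are far more complex, but the homomorphic principle is the same:

```python
# Toy Paillier cryptosystem: additively homomorphic encryption.
# Illustrative only -- tiny parameters, NOT secure, and not the
# scheme used by the Poseidon accelerator.
import math
import random

p, q = 61, 53                 # toy primes (a real key uses ~2048-bit primes)
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)  # Carmichael's function for n = p*q
g = n + 1                     # standard choice of generator
mu = pow(lam, -1, n)          # since L(g^lam mod n^2) = lam mod n

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    L = (pow(c, lam, n2) - 1) // n  # L(x) = (x - 1) / n
    return (L * mu) % n

# Homomorphic property: multiplying ciphertexts adds the plaintexts.
c1, c2 = encrypt(17), encrypt(25)
assert decrypt((c1 * c2) % n2) == 42  # 17 + 25, computed on ciphertexts
```

A cloud provider holding only `c1` and `c2` can produce the encrypted sum without ever seeing 17 or 25 — this is exactly the property that removes the provider from the trust boundary, and also the source of the heavy modular-arithmetic workload that FHE accelerators target.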

(2) Trusted Execution Environment (TEE) for RISC-V (funded by the Beijing Institute of Open Source Chip).

A Trusted Execution Environment (TEE) is an independent processing environment, with its own compute and storage capability, that provides confidentiality and integrity protection. The basic idea is to set aside an isolated region of memory in hardware for sensitive data: all computation on that data takes place inside this region, and applications cannot access it except through authorized interfaces, enabling privacy-preserving computation on sensitive data. A TEE protects the code and data running inside it from external attacks, including attacks from the operating system, the hardware, and other applications. TEEs are widely used: they provide secure payment environments that protect the security and privacy of transactions (mobile payment services such as Apple Pay use TEE technology) and secure authentication environments that prevent identity theft or forgery (banks, for example, use TEEs to protect customer accounts). This project builds a fully home-grown TEE on the "XiangShan" (香山) family of RISC-V processors, including secure firmware, secure microarchitecture, a hardware root of trust, programming APIs, and an SDK.
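The isolation model described above can be sketched in a few lines. This is a purely conceptual, hypothetical illustration — the class and token names are invented, and a real TEE enforces isolation in hardware (memory protection, attestation, sealed storage), not in application code:

```python
# Conceptual sketch of the TEE idea: sensitive data lives in an
# isolated region and is reachable only through authorized entry
# points. Hypothetical illustration only -- real TEEs enforce this
# in hardware, not with Python access control.

class Enclave:
    def __init__(self):
        self.__secrets = {}  # stands in for isolated enclave memory

    def provision(self, key, value, token):
        # The only authorized way to place data inside; the token
        # stands in for hardware attestation/authorization.
        if token != "trusted-loader":
            raise PermissionError("caller is not authorized")
        self.__secrets[key] = value

    def compute(self, key, fn):
        # Computation happens inside; only the derived result leaves.
        return fn(self.__secrets[key])

enclave = Enclave()
enclave.provision("pin", 1234, token="trusted-loader")
assert enclave.compute("pin", lambda v: v % 10) == 4  # result only
```

The design point the sketch mirrors is that untrusted code (including, in a real TEE, the OS itself) never obtains a reference to the raw secret — it can only invoke authorized interfaces and receive derived results.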

Awards and Honors:

[1] 2023, Chinese Institute of Electronics Science and Technology Award, Science and Technology Progress Award, Second Prize

[2] 2023, Beijing Municipal Science and Technology Award, Science and Technology Progress Award, Second Prize

[3] 2022, Beijing Big Data Skills Competition, third place in the privacy-preserving computing track

[4] 2022, Top 10 Pioneers of the Year in Privacy-Preserving Computing; Top 10 Outstanding Solutions of the Year in Privacy-Preserving Computing

[5] 2023, CCF-Ant Group Research Fund recipient

[6] 2023, CCF-Phytium Fund recipient

[7] 2022, CCF-Huawei Populus Grove Fund recipient

[8] 2022, Outstanding Employee, ICT, CAS

[9] 2021, Member, CAS Youth Innovation Promotion Association

[10] 2019, ICT "New Hundred Stars" Program

[11] 2018, Outstanding Employee of ICT, CAS, and of the State Key Laboratory of Computer Architecture

[12] 2015, Best Paper Award nomination (first author), Asia and South Pacific Design Automation Conference (ASP-DAC)

[13] 2015, CAS "Zhu Li Yuehua Outstanding Doctoral Student Award" (one of seven outstanding doctoral students received at a banquet hosted by Ms. Zhu Li Yuehua, CPPCC member and Hong Kong's wealthiest woman)

[14] 2015, National Scholarship for doctoral students, nomination

[15] 2015, Director's Special Award, ICT, CAS