PyTorch - IT文库_程序员IT互联网编程电子书和文档免费下载，助您码力十足！

首页文库资料文章资讯上传文档发布文章登录账户

Facebook -- TVM AWS Meetup Talk

GRU units and FC layers - 24kHz sampling frequency requires 40us sampling net runtime - First PyTorch model used a 3,400us sampling net runtime Image from LPCNetExit, Pursued By A Bear - 3400us (baseline) (baseline), 40us (target) - 85x speedup - Uh ohEnter, TVM and model co-design - PyTorch operator overhead makes interpreter infeasible - Reduce FLOPs with block-sparsified weight matrices - not trades off icache/ dcache - also available today in FBGEMMPyTorch and TVM - Lots of opportunity in PyTorch - Graph optimization - Existing fusion infrastructure fairly limited (CUDA-only, injective-only)

0 码力 | 11 页 | 3.08 MB | 5 月前
3

共 1 条前往

页

Facebook TVM AWS Meetup Talk