TVM Meetup: Quantization
Dialect • Design • Operators • Results on Intel Cascade Lake. © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Quantized Operators in Framework • New operators like TF quantized_conv2d • Underlying calculations are different from FP32 conv2d • Sometimes operators are aggressively fused • TFLite fuses quantized_conv2d, bias, relu and requantize • real_value = scale ... How to Support Framework Quantized Operators? Option 1 – Completely add new ops from scratch • New Relay passes and TVM schedules required
0 points | 19 pages | 489.50 KB | 5 months ago
Bring Your Own Codegen to TVM
Design and manufacture a deep learning chip which achieves amazing performance on widely-used operators (e.g. conv2d, dense, ReLU, etc.). Now your customer wants to run a YOLO model, but... ResNet-50 ... Runtime, Interpreter) Your Dispatcher, Target Device, General Devices (CPU/GPU/FPGA). Mark supported operators or subgraphs: 1. Implement an operator-level annotator, OR 2. Implement a graph-level annotator ... Runtime, Interpreter) Your Dispatcher, Target Device, General Devices (CPU/GPU/FPGA). Mark supported operators or subgraphs: 1. Implement extern operator functions, OR 2. Implement a graph annotator. © 2019, Amazon ...
0 points | 19 pages | 504.69 KB | 5 months ago
TVM@Alibaba AI Labs
Tensor Operators & Runtime ... Property Registr ... Compiler Toolchain ... TVM ... TOPI ... Schedule Primitives & Optimizations ... Symbols ... NNVM & Param Frontends ... Operators ... Algorithm
0 points | 12 pages | 1.94 MB | 5 months ago
Gluon Deployment
... Services, Inc. or its Affiliates. All rights reserved. Amazon Trademark. [Chart: Effects of Convolution operators using TVM; devices: AWS DeepLens, Acer aiSage, NVIDIA Jetson Nano; axis: Speedup] SSD_MobileNet1 ...
0 points | 8 pages | 16.18 MB | 5 months ago
OctoML OSS 2019 11 8
OctoML ... AutoTVM on HTVM ... Runtime ... send program ... Interface ... Optimize TVM operators on microcontrollers by making use of AutoTVM
0 points | 16 pages | 1.77 MB | 5 months ago
TVM: Where Are We Going
Fleet ... Existing Deep Learning Frameworks: High-level data flow graph; Hardware; Primitive Tensor operators such as Conv2D, e.g. cuDNN; Offload to heavily optimized DNN operator library; Frameworks ... Limitations
0 points | 31 pages | 22.64 MB | 5 months ago
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
... al., 2020). Given that DeepSeek-V2 has relatively few activated parameters, and a portion of the operators are recomputed to save activation memory, it can be trained without the necessity of tensor parallelism
0 points | 52 pages | 1.23 MB | 1 year ago
Trends Artificial Intelligence
... on it fastest, personalize it deepest, and deploy it widest. *Hyperscalers (large data center operators) are Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP), Alibaba Cloud, Oracle ...
0 points | 340 pages | 12.14 MB | 4 months ago
8 results in total