TVM@AliOS
AliOS: driving intelligence into everything. Presentation agenda: TVM @ AliOS Overview; TVM @ AliOS ARM CPU; TVM @ AliOS Hexagon DSP; TVM @ AliOS Intel GPU; Misc. Part One, TVM @ AliOS Overview: multimodal interaction served by an inference engine with an accelerated op library and other components, running on CPU (ARM, Intel) and DSP (Qualcomm). Part Two, TVM @ ARM CPU: supports TFLite (open source, upstreamed to master); optimized for both INT8 and FP32; the INT8 path uses QNNPACK-style convolution with NHWC layout and cache-friendly data access. (Slide diagram: cache-blocked data flow.)
0 credits | 27 pages | 4.86 MB | 5 months ago
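The TFLite support mentioned above maps to Relay's TFLite frontend. A minimal sketch, assuming a model file named mobilenet_v2.tflite and the `tflite` flatbuffers package; the input name and NHWC shape are illustrative, not taken from the deck:

```python
import tvm
from tvm import relay
import tflite  # flatbuffers-generated TFLite schema package

# Load a TFLite flatbuffer ("mobilenet_v2.tflite" is a placeholder name).
with open("mobilenet_v2.tflite", "rb") as f:
    tflite_model = tflite.Model.GetRootAsModel(f.read(), 0)

# Import into Relay; NHWC matches the convolution layout the slide
# mentions for the INT8 path.
mod, params = relay.frontend.from_tflite(
    tflite_model,
    shape_dict={"input": (1, 224, 224, 3)},
    dtype_dict={"input": "float32"},
)
```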
TVM Meetup Nov. 16th - Linaro2019
Bringing together the Arm ecosystem. Linaro AI Initiative: provide best-in-class deep learning performance by leveraging neural-network acceleration in IP and SoCs from the Arm ecosystem, through collaborative engineering. An internal Jira project, restricted to Linaro members, tracks three sub-projects: Arm Compute Library, Arm NN, and Android NN Driver. The Arm Compute Library has been integrated by MATLAB Coder and ONNX Runtime. Upstream TVM target support (IP / target / hardware / options / codegen): CPU maps to target arm_cpu; hardware includes Pixel 2 (Snapdragon 835), Mate 10/Mate 10 Pro (Kirin 970), and P20/P20 Pro (Kirin 970); options -target=arm64-linux-android -mattr=+neon; codegen llvm. Also Firefly RK3399 ...
0 credits | 7 pages | 1.23 MB | 5 months ago
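For reference, a hedged sketch of compiling a toy Relay module with the arm64 Android options from that table; recent TVM spells the triple as -mtriple rather than -target, so adjust for your version, and the conv shapes are illustrative:

```python
import tvm
from tvm import relay

# Toy conv2d module to demonstrate targeting an arm64 Android CPU.
data = relay.var("data", shape=(1, 3, 224, 224), dtype="float32")
weight = relay.var("weight", shape=(16, 3, 3, 3), dtype="float32")
mod = tvm.IRModule.from_expr(relay.nn.conv2d(data, weight, padding=(1, 1)))

# Mirrors the table's options: -target=arm64-linux-android -mattr=+neon.
target = "llvm -mtriple=arm64-linux-android -mattr=+neon"
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target)
```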
TVM Meetup: Quantization
Compilation flow: a Relay Int8 graph passes through target-independent Relay passes, then target-dependent passes (including layout optimization), yielding a target-optimized Int8 graph. Schedule templates written in the TVM tensor IR exist per backend (Intel x86, ARM CPU, NVIDIA GPU, ARM GPU, and more targets), and AutoTVM tunes the ...
0 credits | 19 pages | 489.50 KB | 5 months ago
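A minimal sketch of that Int8 flow using Relay's built-in quantization pass; the toy conv and the qconfig values (global-scale calibration) are assumptions for illustration, not the talk's settings:

```python
import numpy as np
import tvm
from tvm import relay

# Small FP32 conv standing in for an imported model.
data = relay.var("data", shape=(1, 3, 56, 56), dtype="float32")
weight = relay.const(np.random.randn(8, 3, 3, 3).astype("float32"))
mod = tvm.IRModule.from_expr(relay.nn.conv2d(data, weight, padding=(1, 1)))

# Quantize to Int8, then build; the per-target schedule is selected
# by the target string at build time.
with relay.quantize.qconfig(calibrate_mode="global_scale", global_scale=8.0):
    qmod = relay.quantize.quantize(mod)

with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(qmod, target="llvm")
```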
TVM@Alibaba AI Labs
Alibaba AI Labs & TVM. Part 1: ARM32 CPU; Part 2: HiFi4 DSP; Part 3: PowerVR GPU. The ARM32 CPU work covers input resolution, quantization, and kernel optimization on AliOS; the quantized kernels compute int8 × int8 products widened to int16 and accumulate into int32. Benchmark setup: MTK8167S (ARM32 Cortex-A35 @ 1.5 GHz) running MobileNetV2_1.0_224. (Slide chart: latency comparison.)
0 credits | 12 pages | 1.94 MB | 5 months ago
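The widening arithmetic named on the slide can be illustrated with NumPy; a toy dot product, not Alibaba's actual kernel:

```python
import numpy as np

# int8 x int8 widened to int16, accumulated in int32, so products
# never overflow the 8-bit inputs during the reduction.
a = np.random.randint(-128, 128, size=64, dtype=np.int8)
b = np.random.randint(-128, 128, size=64, dtype=np.int8)
acc = np.sum(a.astype(np.int16) * b.astype(np.int16), dtype=np.int32)
print(acc)
```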
Yealink TVM Deployment (亿联TVM部署)
... performance gain from autotuning. 3. TVM supports many kinds of hardware platforms: Intel/Arm CPU, NVIDIA/Arm GPU, VTA, and more. Deployment workflow: 1. Get a .log file from AutoTVM on Ubuntu. 2. Use the .log ...
0 credits | 6 pages | 1.96 MB | 5 months ago
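Step 2 typically means wrapping the build in apply_history_best; a sketch with a placeholder log name and a toy module standing in for the real network:

```python
import tvm
from tvm import relay, autotvm

# Toy module in place of the deployed model.
x = relay.var("x", shape=(1, 3, 224, 224), dtype="float32")
mod = tvm.IRModule.from_expr(relay.nn.relu(x))

# "tuned_ops.log" is a placeholder for the AutoTVM log from step 1.
with autotvm.apply_history_best("tuned_ops.log"):
    with tvm.transform.PassContext(opt_level=3):
        lib = relay.build(mod, target="llvm")
lib.export_library("deploy.so")  # ship this library to the device
```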
TVM: Where Are We Going
Haichen Shen et al. µTVM brings TVM to bare-metal devices: it supports bare-metal JTAG-attached devices (Arm Cortex-M, RISC-V) with no OS needed (credit: Logan Weber). Upcoming for µTVM: a self-hosted runtime (credit: Logan Weber). A designed runtime JIT-compiles accelerator microcode, supports heterogeneous devices (10x better than the CPU on the same board), and moves hardware complexity into software: a HW-SW blueprint for flexible deep learning.
0 credits | 31 pages | 22.64 MB | 5 months ago
XDNN TVM - Nov 2019
(Slide chart: efficiency of VGG16, ResNet-50, and GoogleNet-V3 on the Aristotle 7020 FPGA vs. iPhone 8 Plus and Kirin 970; block diagram: memory controller, bus, data mover, image/weights write schedulers, smart memory fabric, image reader.) Efficiency > 50% for mainstream neural networks. Inference flow: an MxNet model is split between CPU layers and FPGA layers by the runtime; the image, model weights, and a calibration set feed a quantizer and compiler that emit a tensor graph. TVM partitioning: parallel subgraphs with pre- and post-processing placed on FPGA or CPU; partitioning is more than a supported/not-supported split, with pattern matching and graph colorization driving the choices of how ...
0 credits | 16 pages | 3.35 MB | 5 months ago
Bring Your Own Codegen to TVM
Build with the external codegen: mod = build_extern(mod, "dnnl"). 4. Run the inference: exe = relay.create_executor("vm", mod=mod, ctx=tvm.cpu(0)); data = np.random.uniform(size=(1, 3, 224, 224)).astype("float32"); out = exe.evaluate()(data, **params). Architecture: the Relay runtime (VM, graph runtime, or interpreter) calls your dispatcher, which routes subgraphs to your target device or to general devices (CPU/GPU/FPGA). Mark supported operators or subgraphs: 1. implement an operator-level annotator, OR 2. implement ...; then either 1. implement extern operator functions, OR 2. implement ...
0 credits | 19 pages | 504.69 KB | 5 months ago
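A self-contained variant of the slide's step 4, skipping the deck's build_extern("dnnl") step (that helper is specific to the BYOC demo) and running a plain Relay module on CPU; note that newer TVM renames ctx to device:

```python
import numpy as np
import tvm
from tvm import relay

# Plain Relay module in place of the externally-compiled one.
data_var = relay.var("data", shape=(1, 3, 224, 224), dtype="float32")
mod = tvm.IRModule.from_expr(relay.nn.relu(data_var))

# Step 4 from the slide: run inference through the Relay VM.
exe = relay.create_executor("vm", mod=mod, ctx=tvm.cpu(0))
data = np.random.uniform(size=(1, 3, 224, 224)).astype("float32")
out = exe.evaluate()(data)
```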
Dynamic Model in TVM
(Slide diagram: a generic function dispatches to a CPU strategy function or a GPU strategy function, keyed on "cpu" / "gpu"; each returns an OpStrategy holding a default implementation plus specialized implementations, e.g. winograd guarded by kernel_size <= 3, or a batch guard b < 8.) How to register a strategy? @conv2d_strategy.register("cpu") def conv2d_strategy_cpu(attrs, inputs, out_type, target): strategy = OpStrategy(); layout = attrs.data_layout ...
0 credits | 24 pages | 417.46 KB | 5 months ago
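A hedged completion of the truncated registration snippet, adding one NCHW implementation; the wrapper and topi names follow TVM's strategy module, but treat this as a sketch rather than the talk's exact code:

```python
from tvm import topi
from tvm.relay.op import OpStrategy
from tvm.relay.op.strategy.generic import (
    conv2d_strategy,
    wrap_compute_conv2d,
    wrap_topi_schedule,
)

@conv2d_strategy.register("cpu")
def conv2d_strategy_cpu(attrs, inputs, out_type, target):
    strategy = OpStrategy()
    layout = attrs.data_layout
    if layout == "NCHW":
        # Default x86 implementation; specialized ones (e.g. winograd)
        # would be added with extra guards, as in the slide's diagram.
        strategy.add_implementation(
            wrap_compute_conv2d(topi.x86.conv2d_nchw),
            wrap_topi_schedule(topi.x86.schedule_conv2d_nchw),
            name="conv2d_nchw.x86",
        )
    return strategy
```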
Facebook -- TVM AWS Meetup Talk
(~10 lines of Relay IR); a few days of work; the TVM sampling model runs in 30 µs on a single server CPU core; beats hand-written, highly optimized baselines (https://github.com/mozilla/LPCNet) by ~40%.
0 credits | 11 pages | 3.08 MB | 5 months ago
11 results in total













