Gluon Deployment
Deploy GluonCV models: the MXNet computational graph (a JSON acyclic graph) can be exported as-is or optimized with TVM.
8 pages | 16.18 MB | 5 months ago
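A minimal sketch of the "optimize with TVM" path for a GluonCV model, assuming mxnet and gluoncv are installed; the model name and input shape are illustrative, not taken from the deck.

```python
# Sketch: import a pretrained GluonCV model into TVM via the MXNet frontend
# and compile it. "resnet18_v1" and the 1x3x224x224 input are assumptions.
import tvm
from tvm import relay
from gluoncv import model_zoo

block = model_zoo.get_model("resnet18_v1", pretrained=True)   # Gluon HybridBlock
shape_dict = {"data": (1, 3, 224, 224)}
mod, params = relay.frontend.from_mxnet(block, shape_dict)    # MXNet graph -> Relay

with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)      # TVM-optimized module
```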
Bring Your Own Codegen to TVM
System overview: Relay IR → graph annotation with your annotator → graph partitioning → your codegen (alongside LLVM, CUDA, Metal, VTA) → serialized subgraph library → Relay runtime (VM, graph runtime, interpreter). To mark supported operators or subgraphs, either (1) implement an operator-level annotator, or (2) implement a graph-level annotator. Option 2, graph-level annotation: implement a Relay IR visitor to annotate a subgraph (module path: python/tvm/relay/op/contrib/…/graph_annotator.py) and apply the annotator.
19 pages | 504.69 KB | 5 months ago
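The graph-level option above amounts to walking the Relay IR and marking the call nodes a codegen can take. A minimal sketch using the generic Relay ExprVisitor, with a hypothetical supported-op set; the annotator interface in the actual deck may differ.

```python
# Sketch: collect the call nodes a hypothetical "mycodegen" backend supports.
import tvm
from tvm import relay
from tvm.relay.expr_functor import ExprVisitor

SUPPORTED_OPS = {"nn.conv2d", "nn.relu", "add"}  # assumption: ops our codegen handles

class SupportedOpCollector(ExprVisitor):
    """Walk a Relay function and record call nodes our codegen supports."""
    def __init__(self):
        super().__init__()
        self.supported_calls = []

    def visit_call(self, call):
        if isinstance(call.op, tvm.ir.Op) and call.op.name in SUPPORTED_OPS:
            self.supported_calls.append(call)
        super().visit_call(call)   # keep walking the arguments

# Usage: build a tiny graph and see which calls would be offloaded.
x = relay.var("x", shape=(1, 3, 224, 224))
w = relay.var("w", shape=(16, 3, 3, 3))
y = relay.nn.relu(relay.nn.conv2d(x, w, padding=(1, 1)))
collector = SupportedOpCollector()
collector.visit(relay.Function([x, w], y))
print(len(collector.supported_calls), "calls would be annotated for the codegen")
```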
TVM@AliOS
AliOS ("driving the intelligence of everything") — AliOS TVM on the Hexagon DSP: TensorFlow model → NNVM/Relay → graph optimization → compile → deploy.so / deploy.json / deploy.bin, executed against libtvm_hexagon_runtime.so.
27 pages | 4.86 MB | 5 months ago
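A minimal sketch of how a deploy.so / deploy.json / deploy.bin triple like the one above is typically produced with relay.build; the deck targets Hexagon, whereas this sketch uses a plain LLVM CPU target and a toy network.

```python
# Sketch: compile a Relay module and write out the three deployment artifacts.
# The conv+relu toy network and the "llvm" target are assumptions.
import tvm
from tvm import relay

x = relay.var("data", shape=(1, 3, 224, 224))
w = relay.var("weight", shape=(8, 3, 3, 3))
net = relay.nn.relu(relay.nn.conv2d(x, w, padding=(1, 1)))
mod = tvm.IRModule.from_expr(relay.Function([x, w], net))

with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm")

lib.export_library("deploy.so")                               # compiled kernels
with open("deploy.json", "w") as f:                           # execution graph
    f.write(lib.get_graph_json())
with open("deploy.bin", "wb") as f:                           # serialized weights
    f.write(tvm.runtime.save_param_dict(lib.get_params()))
```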
Tsinghua University: DeepSeek + DeepResearch — Making Research as Simple as Chatting
Example prompts: "Can you create a heatmap using this data?", "Can you segment this data and create a table?", "Can you create a graph using this data?", "Can you create a word cloud?", "Can you create a chart using this data…". Looking ahead, DeepSeek R1 may break through in several areas: improving general capability (it currently trails DeepSeek-V3 on function calling, multi-turn dialogue, complex role-play, and JSON output, which DeepSeek plans to address by exploiting long reasoning chains), resolving language mixing, and better prompt engineering (the model is sensitive to prompts, and few-shot prompting consistently degrades…).
85 pages | 8.31 MB | 8 months ago
TVM Meetup: Quantization
Automatic quantization: TVM ingests an FP32 graph and a small dataset, finds suitable quantization scales, and produces a quantized graph. Compiling pre-quantized models (QNN dialect): TVM ingests a pre-quantized graph in TFLite or … TVM overview: framework graph (MXNet, TF, …) → parsers → Relay graph → target-independent Relay passes → target-optimized graph → target-dependent Relay passes (Intel x86, ARM CPU, … targets) → AutoTVM for tuning the kernels → optimized binary via codegen (LLVM, CUDA, C, …); i.e. framework parsers, graph-level optimizations, tensor-level optimizations, machine code generation.
19 pages | 489.50 KB | 5 months ago
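A minimal sketch of the automatic-quantization flow described in this entry, using relay.quantize; the calibration mode and weight-scale settings are illustrative assumptions.

```python
# Sketch: post-training quantization of an FP32 Relay module with a small
# calibration set, as in the "automatic quantization" path above.
import tvm
from tvm import relay

def quantize_fp32_module(mod, params, calib_samples):
    """Quantize an FP32 Relay module using a small calibration dataset.

    `calib_samples` is an iterable of {"input_name": numpy array} dicts.
    """
    with relay.quantize.qconfig(calibrate_mode="kl_divergence",
                                weight_scale="max"):
        return relay.quantize.quantize(mod, params, dataset=calib_samples)
```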
XDNN TVM - Nov 2019
Stack components: runtime, image, model weights, calibration set, quantizer, compiler, tensor graph optimization, framework tensor graph to Xilinx tensor graph, frontend, deep learning frameworks (https://github.com/xilinx). TVM as a unified ML front end: Relay (and NNVM) graph parser, XIR compiler, quantizer, partitioner — e.g. @relay.transform.module_pass(opt_level=4) class AccelModule. Marking operators supported/not supported, pattern matching, graph colorization; choices on how to partition, especially for multi-branch networks (e.g. YOLOv3, SSD). Graph partitioning/fusion: subgraph …
16 pages | 3.35 MB | 5 months ago
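The AccelModule snippet quoted above uses TVM's pass infrastructure. A minimal sketch of that pattern follows, with a placeholder body instead of the Xilinx partitioning logic; the slide writes the decorator as relay.transform.module_pass, while current TVM exposes it as tvm.transform.module_pass.

```python
# Sketch: a module-level Relay pass in the style of the AccelModule example.
import tvm
from tvm import relay

@tvm.transform.module_pass(opt_level=4)
class AccelModule:
    """Module-level pass: runs once over the whole IRModule."""
    def transform_module(self, mod, ctx):
        # A real partitioner would annotate/extract accelerator subgraphs;
        # this placeholder returns the module unchanged.
        return mod

x = relay.var("x", shape=(1, 16))
mod = tvm.IRModule.from_expr(relay.Function([x], relay.nn.relu(x)))
mod = AccelModule()(mod)   # apply the pass like any other Relay pass
```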
Dynamic Model in TVM
…dependent ops: arange, nms, etc.; control flow, e.g. concatenate within a while loop. Limitation of the TVM graph runtime: it cannot compile and run dynamic models. … a new runtime for Relay; dynamic codegen (WIP): kernel dispatch for a single op, graph dispatch for a (sub-)graph. In collaboration with Jared Roesch, Zhi Chen, Wei Chen. Dispatching a whole graph: ResNet with data of shape (Any, 3, 224, 224) and a dispatch tree selecting Resnet_copy0, Resnet_copy1, … for 1 <= bs < 17, 17 …
24 pages | 417.46 KB | 5 months ago
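A minimal sketch of the dynamic-shape support this entry refers to: relay.Any() as a symbolic dimension plus the Relay VM runtime. The tiny dense layer stands in for the ResNet on the slides, and the batch sizes are illustrative.

```python
# Sketch: one compiled artifact serving multiple batch sizes via the Relay VM.
import numpy as np
import tvm
from tvm import relay
from tvm.runtime.vm import VirtualMachine

batch = relay.Any()                        # batch size unknown at compile time
x = relay.var("x", shape=(batch, 8), dtype="float32")
w = relay.var("w", shape=(4, 8), dtype="float32")
mod = tvm.IRModule.from_expr(relay.Function([x, w], relay.nn.dense(x, w)))

exe = relay.vm.compile(mod, target="llvm")   # the graph runtime cannot do this
machine = VirtualMachine(exe, tvm.cpu())

for bs in (1, 5, 16):                        # same executable, many shapes
    out = machine.invoke(
        "main",
        tvm.nd.array(np.random.rand(bs, 8).astype("float32")),
        tvm.nd.array(np.random.rand(4, 8).astype("float32")))
    print(out.shape)
```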
TVM@Alibaba AI Labs
Alibaba AI Labs (阿里巴巴人工智能实验室): PowerVR support by TVM. NNVM compiler: execution graph and model layer functions; computation graph optimizations; params; TVM tensor operators & …
12 pages | 1.94 MB | 5 months ago
TVM: Where Are We Going
ASIC optimization, AutoTVM, device fleet. Existing deep learning frameworks: high-level data flow graph; hardware; primitive tensor operators such as Conv2D; offload to heavily optimized … (e.g. cuDNN) … intensive. A machine-learning-based program optimizer — TVM as a learning-based learning system: high-level data flow graph and optimizations, directly generating optimized programs for new operator workloads and hardware.
31 pages | 22.64 MB | 5 months ago
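A compact sketch of the "machine-learning-based program optimizer" idea via the AutoTVM API: extract tunable tasks from a model and let a learned cost model search the schedule space. Trial counts and the log file name are illustrative assumptions.

```python
# Sketch: tune the kernels of a Relay module with AutoTVM's XGBoost tuner.
import tvm
from tvm import relay, autotvm

def tune(mod, params, target="llvm", trials=64, log_file="autotvm.log"):
    tasks = autotvm.task.extract_from_program(mod["main"], target=target,
                                              params=params)
    measure = autotvm.measure_option(builder=autotvm.LocalBuilder(),
                                     runner=autotvm.LocalRunner(number=5))
    for task in tasks:
        tuner = autotvm.tuner.XGBTuner(task)          # learned cost model
        tuner.tune(n_trial=min(trials, len(task.config_space)),
                   measure_option=measure,
                   callbacks=[autotvm.callback.log_to_file(log_file)])
    # Best schedules can later be applied with autotvm.apply_history_best(log_file).
    return log_file
```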
PAI & TVM Meetup - Shanghai 20191116
Computing platform: INT8 inference on PAI-Blade. PAI-Blade: model analysis, graph optimization (Blade Graph Optimizer), TensorRT, customized optimizer, TAO Compiler (XLA), cuBLAS/cuDNN/CUTLASS, Blade …
26 pages | 5.82 MB | 5 months ago
共 16 条
- 1
- 2














