PAI & TVM Meetup - Shanghai 20191116
Computing Platform Business Unit. TensorCore AutoCodeGen in TVM; FP16 Mixed-Precision Training on PAI; INT8 Inference on PAI-Blade. TensorCore AutoCodeGen Background. Inference on PAI-Blade: Model Analysis, Graph Optimization, Blade Graph Optimizer, TensorRT, Customized Optimizer, TAO Compiler (XLA), cuBLAS/cuDNN/CUTLASS, Blade Kernel Lib
0 码力 | 26 pages | 5.82 MB | 5 months ago
Bring Your Own Codegen to TVM
rights reserved. Amazon/Intel Confidential. Presenter: Zhi Chen, Cody Yu, Amazon SageMaker Neo, Deep Engine Science. Bring Your Own Codegen to TVM. AWS AI. © 2019, Amazon Web Services, Inc. or its Affiliates. Implement extern operator functions, OR 2. Implement a graph annotator. Generate binary/library/engine for the subgraph ● Implement an IR visitor for codegen ● Implement the build logic. Implement the Codegen ● Implement a codegen class to accept subgraphs and build binary/library/engine for runtime dispatching ● Codegen path: src/relay/backend/contrib/- /codegen.cc
0 码力 | 19 pages | 504.69 KB | 5 months ago
Trends Artificial Intelligence
Telegraph, Electrification, Mass Steel Production, Mass Production & Assembly Lines, Internal Combustion Engine, Flight, Synthetic Fertilizer, Transistors, PCs, Internet, Smartphones, Cloud … Technology Compounding … at the center of the AI hardware stack. Its GPUs (graphics processing units) became the default engine for training and inference, prized for their ability to handle highly parallel computations at … 'Perplexity Nears Funding at $14 Billion Value' (5/25). Perplexity is best described as an answer engine. You ask it a question, you get an answer. Except the difference is, all the answers are backed
0 码力 | 340 pages | 12.14 MB | 4 months ago
TVM@AliOS
FaceID, Multimodal Interaction, CPU (ARM, Intel), Accelerated Op Library / Others, Inference Engine, DSP (Qualcomm). AliOS: powering intelligence in all things. PART TWO: AliOS TVM @ ARM CPU. Support TFLite. PART FIVE: Misc. Nvidia GTX 1050. Integrate other inference engine (like TRT)
0 码力 | 27 pages | 4.86 MB | 5 months ago
Dynamic Model in TVM
its Affiliates. All rights reserved. Presenter: Haichen Shen, Yao Wang, Amazon SageMaker Neo, Deep Engine Science. Dynamic Model in TVM. AWS AI. © 2019, Amazon Web Services, Inc. or its Affiliates. All rights
0 码力 | 24 pages | 417.46 KB | 5 months ago
OpenAI - AI in the Enterprise
previous work experience makes the job a good fit. The Indeed team tested the previous job matching engine against the GPT-powered version with the new, customized context. The performance uplift was significant:
0 码力 | 25 pages | 9.48 MB | 5 months ago
Tsinghua University, Part 2: DeepSeek Empowers the Workplace
• Functional scope • Professional skills • Decision-making authority. Constraint layer: 3. Boundary System • Ethical norms • Safety limits • Resource constraints. Operation layer: 4. Operation Engine • Input processing • Execution flow • Output specification. How do you create visual charts with DeepSeek? Role: Mermaid chart code generator. Function: based on
0 码力 | 35 pages | 9.78 MB | 8 months ago
OpenAI 《A practical guide to building agents》
rule-based approaches fall short. Consider the example of payment fraud analysis. A traditional rules engine works like a checklist, flagging transactions based on preset criteria. In contrast, an LLM agent
0 码力 | 34 pages | 7.00 MB | 6 months ago
Google 《Prompt Engineering v7》
can face while crafting prompts. Prompt engineering: Remember how an LLM works; it's a prediction engine. The model takes sequential text as an input and then predicts what the following token should be
0 码力 | 68 pages | 6.50 MB | 6 months ago
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
this goal, we implement the following engineering optimizations. (1) Firstly, we propose a hybrid engine that adopts different parallel strategies for training and inference respectively to achieve higher
0 码力 | 52 pages | 1.23 MB | 1 year ago
10 results in total