Deep Learning Modeling Practice on Alibaba Cloud - Cheng Mengli (程孟力)
Going from FM to DeepFM, inference RT increased 10x: how do we optimize? The main challenges of deep learning applications: 1. complex solutions; 2. model quality is hard to optimize; 3. engineering optimization. Training optimization: data parallelism, model parallelism. Inference optimization: Blade. Recommendation model optimization: hundreds of billions of features. Engineering optimization, data parallelism: RingAllReduce + hierarchical cascading; EasyVision multi-node, multi-GPU performance comparison. Engineering optimization, model parallelism (Whale): M6 model; Transformer models (RapidFormer performance). Engineering optimization, Blade model inference: FP16 / Int8, model pruning, op fusion (FusionStitch), MLIR (BladeDISC), "Dynamic Shape Compiler for Machine Learning Workloads", EmbeddingVariable [No Hash …] (… disease prediction, etc.). Infrastructure: the PAI platform (Platform of Artificial Intelligence): • one-click deployment and elastic scaling • multiple frameworks and languages • Blade inference optimization • multi-dimensional monitoring and alerting • custom images • fully managed and semi-managed modes • distributed training optimization • very large resource pools. Intelligent labeling, visual modeling (Designer), distributed training (DLC), online serving (EAS).
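The inference-side items listed above (FP16 / Int8, pruning, op fusion) are Blade/PAI features; as a generic illustration of just the FP16 piece, and not the Blade API, half-precision inference in plain PyTorch looks roughly like this (recent torchvision and a CUDA GPU assumed):

    # Generic FP16 inference sketch (plain PyTorch, not PAI-Blade).
    import torch
    import torchvision

    model = torchvision.models.resnet18(weights=None).eval().cuda()
    x = torch.randn(1, 3, 224, 224, device="cuda")

    with torch.no_grad(), torch.autocast(device_type="cuda", dtype=torch.float16):
        y = model(x)  # convs/matmuls run in FP16, numerically sensitive ops stay FP32
    print(y.dtype, y.shape)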
PyTorch Release Notes
… with GPU support for NGC containers, when you run a container, the following occurs: ‣ The Docker engine loads the image into a container which runs the software. ‣ You define the runtime resources of the container … Deep Learning Framework containers are no longer tested on Pascal GPU architectures. ‣ Transformer Engine is a library for accelerating Transformer models on NVIDIA GPUs. It includes support for 8-bit floating point (FP8) precision, which provides better training and inference performance with lower memory utilization. Transformer Engine also includes a collection of highly optimized modules for popular Transformer architectures and …
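The FP8 support mentioned in these notes is exposed through drop-in modules; a minimal sketch of typical usage (assuming the transformer_engine package shipped in these containers and an FP8-capable GPU such as Hopper) might look like this:

    import torch
    import transformer_engine.pytorch as te
    from transformer_engine.common import recipe

    # te.Linear is a drop-in replacement for torch.nn.Linear with FP8 support.
    layer = te.Linear(768, 768, bias=True).cuda()
    x = torch.randn(16, 768, device="cuda")

    fp8_recipe = recipe.DelayedScaling()  # default delayed-scaling FP8 recipe
    with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
        y = layer(x)

    y.float().sum().backward()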
QCon Beijing 2018 - "From Keyboard Input to Neural Networks: Deep Learning Applications at Bloomberg" - Li Biye (李碧野)
Image credits: https://commons.wikimedia.org/wiki/File:Nvidia_logo.svg, …/wikipedia/commons/6/67/Kubernetes_logo.svg, and https://commons.wikimedia.org/wiki/File:Docker_(container_engine)_logo.png; may be re-distributed in accordance with the terms of the CC-SA 4.0 license (https://creativecommons…).
Ultra-Large-Scale Deep Learning at Meituan - Yu Jianping (余建平)
Provides platform-level tooling with an easy-to-use interface for users. MLX model capabilities; MLX platform architecture: • built on a Worker + PS (parameter server) architecture • a Worker contains the model computation engine (Engine) and the computation-graph framework (Graph) • the Engine handles the model structure, exchanges model parameters with the PS, and executes the computation graph • the Graph framework abstracts computation logic into ops, composes ops into model structures, and provides forward … (snippet truncated).
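As a purely hypothetical toy sketch of that op/graph idea (not Meituan MLX's actual API): ops hold the computation logic, and a graph composes ops into a model structure and runs the forward pass.

    # Hypothetical toy sketch, not MLX's real interface.
    import numpy as np

    class Op:
        def forward(self, x):
            raise NotImplementedError

    class Dense(Op):
        def __init__(self, in_dim, out_dim):
            # In a Worker+PS setup these parameters would be fetched from the PS.
            self.w = np.random.randn(in_dim, out_dim) * 0.01
        def forward(self, x):
            return x @ self.w

    class Relu(Op):
        def forward(self, x):
            return np.maximum(x, 0.0)

    class Graph:
        def __init__(self, ops):
            self.ops = ops
        def forward(self, x):
            for op in self.ops:  # ops composed in sequence form the model structure
                x = op.forward(x)
            return x

    model = Graph([Dense(8, 16), Relu(), Dense(16, 1)])
    print(model.forward(np.random.randn(4, 8)).shape)  # (4, 1)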
Qwen (通义千问) LLM documentation (Chinese) - 1.15.4 Retrieval-Augmented Generation (RAG)
You can now enter a query, and Qwen1.5 will answer based on the content of the indexed documents.

    query_engine = index.as_query_engine()
    your_query = "- "  # the actual query string is elided in the source
    print(query_engine.query(your_query).response)

1.16 Langchain: This guide helps …
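For context, the indexing step that produces `index` in the snippet is the standard LlamaIndex flow; a minimal end-to-end sketch (assuming llama-index >= 0.10, a local "./docs" folder, and an LLM/embedding model already configured via Settings as the surrounding guide does with Qwen1.5) is:

    from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

    # Build a vector index over local documents (the "./docs" path is an assumption).
    documents = SimpleDirectoryReader("./docs").load_data()
    index = VectorStoreIndex.from_documents(documents)

    # Query it, as in the snippet above.
    query_engine = index.as_query_engine()
    print(query_engine.query("What is Qwen1.5?").response)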
《Efficient Deep Learning Book》[EDL] Chapter 5 - Advanced Compression Techniques
… sparsifying activation maps to produce robust models. Rhu et al., through their work on the Compression DMA Engine [12], observed that a non-trivial fraction of activation values for the ReLU activation function are naturally … [12] Rhu, Minsoo, et al. "Compressing DMA engine: Leveraging activation sparsity for training deep neural networks." 2018 IEEE International Symposium …
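The observation is easy to reproduce: after a ReLU a large fraction of activation values are exactly zero, which is what an activation-compression scheme exploits. A small illustrative PyTorch sketch:

    import torch

    x = torch.randn(32, 64, 28, 28)                # a batch of pre-activation feature maps
    act = torch.relu(x)                            # ReLU zeroes out every negative value
    sparsity = (act == 0).float().mean().item()    # fraction of exact zeros
    print(f"activation sparsity: {sparsity:.1%}")  # roughly 50% for zero-mean inputs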
Amazon AWS AI Services Overview
… detection, sequence matching, regression analysis, network/tribe analysis. Netflix • recommendation engine. Pinterest • image recognition search. Fraud.net • detect online payment fraud. DataXu • leverage …
Keras tutorial
Recurrent neural networks (RNN). It is defined as shown below: keras.engine.base_layer.wrapped_fn(). It supports the following parameters: cell refers to a cell instance …
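The `cell` parameter referred to above is an RNN cell instance wrapped by keras.layers.RNN; a minimal sketch (tf.keras assumed, names from the public Keras API):

    import tensorflow as tf
    from tensorflow import keras

    cell = keras.layers.SimpleRNNCell(units=32)  # the `cell` instance the tutorial refers to
    rnn = keras.layers.RNN(cell)                 # wraps the cell and unrolls it over timesteps
    x = tf.random.normal((4, 10, 8))             # (batch, timesteps, features)
    y = rnn(x)                                   # last hidden state, shape (4, 32)
    print(y.shape)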
Keras: Deep Learning Library Based on Python (Chinese documentation)
… 32) model.add(Flatten()) # now: model.output_shape == (None, 65536). 5.2.5 Input [source]: keras.engine.topology.Input(). Input() is used to instantiate a Keras tensor. A Keras tensor is a tensor object from the underlying backend (Theano, TensorFlow or CNTK), augmented with certain attributes so that … If your layer changes the shape of its input tensor, you should define the shape-change logic here, which lets Keras automatically infer the shape of each layer.

    from keras import backend as K
    from keras.engine.topology import Layer
    import numpy as np

    class MyLayer(Layer):
        def __init__(self, output_dim, …
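The MyLayer skeleton above is cut off mid-signature; the canonical completion of this custom-layer example in the same (old Keras 2.x) docs looks roughly like the following. Treat it as a sketch of that legacy API rather than current tf.keras:

    from keras import backend as K
    from keras.engine.topology import Layer

    class MyLayer(Layer):
        def __init__(self, output_dim, **kwargs):
            self.output_dim = output_dim
            super(MyLayer, self).__init__(**kwargs)

        def build(self, input_shape):
            # Create the trainable weight once the input shape is known.
            self.kernel = self.add_weight(name='kernel',
                                          shape=(input_shape[1], self.output_dim),
                                          initializer='uniform',
                                          trainable=True)
            super(MyLayer, self).build(input_shape)  # marks the layer as built

        def call(self, x):
            return K.dot(x, self.kernel)

        def compute_output_shape(self, input_shape):
            # This is where the shape-change logic mentioned above is defined.
            return (input_shape[0], self.output_dim)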













