memory model - IT文库 (IT Library)


PyTorch Release Notes

... multi-threaded data loaders, the default shared memory segment size with which the container runs might not be enough. Therefore, you should increase the shared memory size by issuing one of the following commands: ‣ --ipc=host ‣ --shm-size=<memory size> in the command line to docker run --gpus all ... To pull data and model descriptions from locations outside the container for use by PyTorch or ... (FP8) precision on Hopper GPUs, which provides better training and inference performance with lower memory utilization. Transformer Engine also includes a collection of highly optimized modules for popular ...

0 points | 365 pages | 2.94 MB | 1 year ago
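The two options named in the PyTorch Release Notes excerpt above (`--ipc=host` or `--shm-size=<memory size>`) can be sketched as a small helper that assembles the `docker run` command line without executing it. This is only an illustration: the helper name `docker_run_args`, the image tag, and the `8g` value are hypothetical placeholders, not values taken from the release notes.

```python
import shlex


def docker_run_args(image, shm_size=None, use_host_ipc=False):
    """Build a `docker run` argument list that enlarges shared memory.

    Exactly one of `shm_size` (e.g. "8g") or `use_host_ipc` must be given,
    mirroring the "one of the following" wording in the excerpt.
    """
    if bool(shm_size) == bool(use_host_ipc):
        raise ValueError("choose exactly one of shm_size or use_host_ipc")
    args = ["docker", "run", "--gpus", "all"]
    if use_host_ipc:
        args.append("--ipc=host")  # share the host's IPC namespace
    else:
        args.append(f"--shm-size={shm_size}")  # explicit /dev/shm size
    args += ["--rm", "-it", image]
    return args


# Hypothetical image name; substitute the container tag you actually use.
IMAGE = "nvcr.io/nvidia/pytorch:xx.xx-py3"
print(shlex.join(docker_run_args(IMAGE, shm_size="8g")))
print(shlex.join(docker_run_args(IMAGE, use_host_ipc=True)))
```

Printing the command instead of running it keeps the sketch safe to try on a machine without Docker or a GPU.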
《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient Architectures

... in ANALOG magazine (1991) ... So far, we have discussed generic techniques which are agnostic to the model architecture. These techniques can be applied in NLP, vision, speech or other domains. However, owing ... challenges. What good is a model that cannot be deployed in practical applications! Efficient Architectures aim to improve model deployability by proposing novel ways to reduce model footprint and improve ... running on mobile and edge devices. We have also set up a couple of programming projects for a hands-on model optimization experience using these efficient layers and architectures. Let’s start our journey with ...

0 points | 53 pages | 3.92 MB | 1 year ago
Qwen (千问) Large AI Model: Chinese Documentation

Qwen Team, May 11, 2024 ... Qwen is the large language model and large multimodal model series of the Qwen Team, Alibaba Group. Now the large language models have been upgraded ... AutoModelForCausalLM, AutoTokenizer device = "cuda" # the device to load the model onto # Now you do not need to add "trust_remote_code=True" model = AutoModelForCausalLM.from_pretrained( "Qwen/Qwen1.5-7B-Chat", ... tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-7B-Chat") # Instead of using model.chat(), we directly use model.generate() # But you need to use tokenizer.apply_chat_template() to format your inputs ...

0 points | 56 pages | 835.78 KB | 1 year ago
《Efficient Deep Learning Book》[EDL] Chapter 1 - Introduction

... learning algorithms help build models, which as the name suggests is an approximate mathematical model of what outputs correspond to a given input. To illustrate, when you visit Netflix’s homepage, the ... might be popular with other users too. If we train a model to predict the probability based on your behavior and currently trending content, the model will assign a high probability to Seinfeld. While there ... the performance of the model scaled well with the number of labeled examples, since the network had a large number of parameters. Thus to extract the most out of the setup, the model needed a large number ...

0 points | 21 pages | 3.17 MB | 1 year ago
[PyTorch Deep Learning - Teacher Longlong] - Beta Edition 202112

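References ... Chapter 15: Custom Datasets. 15.1 The Pokémon dataset. 15.2 Custom dataset loading workflow. 15.3 The Pokémon dataset in practice. 15.4 Transfer learning. 15.5 Saved_model. 15.6 Model deployment. 15.7 References ... Preview edition 202112 ... Introduction to Artificial Intelligence. "What we want is a machine that can learn from experience." - Alan Turing. 1.1 ... The Sequential container makes it very convenient to build multi-layer networks. For a 3-layer network, we can quickly build it as follows. # Wrap 3 network layers in a Sequential container; the output of each layer is by default the input of the next layer model = nn.Sequential( # Create the first layer: input 784, output 256 nn.Linear(28*28, 256), nn.ReLU(), # activation function ... ) The first layer is designed with 256 output nodes, the second layer with 128, and the output layer with 10. Calling the model object directly as model(x) returns the output of the model's last layer. 3.8.2 Model training. After building the 3-layer network object, given an input x, calling model(x) produces the model output, and the F.mse_loss loss function then computes the current error ℒ: # Create the optimizer and pass it the list of parameters to optimize: [w1 ...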

0 points | 439 pages | 29.91 MB | 1 year ago
《Efficient Deep Learning Book》[EDL] Chapter 2 - Compression Techniques

... compression techniques. Compression techniques aim to reduce the model footprint (size, latency, memory etc.). We can reduce the model footprint by reducing the number of trainable parameters. However ... requires many trials and evaluations to reach a smaller model, if it is at all possible. Second, such an approach doesn’t generalize well because the model designs are subjective to the specific problem. In this chapter, we introduce Quantization, a model compression technique that addresses both these issues. We’ll start with a gentle introduction to the idea of compression. Details of quantization and ...

0 points | 33 pages | 1.96 MB | 1 year ago
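To make the idea of quantization in the excerpt above concrete, here is a minimal sketch of uniform 8-bit quantization of a list of weights in plain Python. It assumes a simple min/max affine scheme chosen for illustration; the book's own formulation is not visible in the excerpt and may differ. The function names and sample weights are hypothetical.

```python
def quantize(values, num_bits=8):
    """Map floats to integers in [0, 2**num_bits - 1] using a uniform step."""
    lo, hi = min(values), max(values)
    levels = 2 ** num_bits - 1
    # Guard against a zero range when all values are equal.
    scale = (hi - lo) / levels if hi > lo else 1.0
    quantized = [round((v - lo) / scale) for v in values]
    return quantized, scale, lo


def dequantize(quantized, scale, lo):
    """Recover approximate floats from the integer codes."""
    return [lo + scale * q for q in quantized]


weights = [-1.0, -0.25, 0.0, 0.4, 1.0]
codes, scale, lo = quantize(weights)
restored = dequantize(codes, scale, lo)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(codes)
print(max_error)  # bounded by roughly half a quantization step
```

Each weight is stored as one small integer plus a shared `scale` and `lo`, which is where the size reduction comes from; the price is a rounding error of at most about `scale / 2` per weight.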
《Efficient Deep Learning Book》[EDL] Chapter 5 - Advanced Compression Techniques

... Can we optimally prune the network connections, remove extraneous nodes, etc. while retaining the model’s performance? In this chapter we introduce the intuition behind sparsity, different possible methods of picking the connections and nodes to prune, and how to prune a given deep learning model to achieve storage and latency gains with a minimal performance tradeoff. Next, the chapter goes over weight ... learn about these techniques together! Model Compression Using Sparsity: Sparsity or Pruning refers to the technique of removing (pruning) weights during the model training to achieve smaller models. Such ...

0 points | 34 pages | 3.18 MB | 1 year ago
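The pruning idea in the excerpt above can be sketched with one common selection rule, magnitude pruning, where the weights with the smallest absolute values are set to zero. This is an assumption made for illustration: the excerpt mentions "different possible methods" without naming one, and it describes pruning during training, whereas this sketch prunes a fixed list once. The function name and sample weights are hypothetical.

```python
def prune_by_magnitude(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction zeroed.

    `sparsity` is the fraction of weights to remove, between 0.0 and 1.0.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0.0 and 1.0")
    num_to_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_indices = set(order[:num_to_prune])
    return [0.0 if i in pruned_indices else w for i, w in enumerate(weights)]


weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.1]
pruned = prune_by_magnitude(weights, sparsity=0.5)
print(pruned)  # [0.8, 0.0, 0.3, 0.0, -0.6, 0.0]
```

The zeroed entries can then be stored in a sparse format, which is the source of the storage gains the excerpt refers to.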
《Efficient Deep Learning Book》[EDL] Chapter 3 - Learning Techniques

... you'll go.” ― Dr. Seuss ... Model quality is an important benchmark to evaluate the performance of a deep learning model. A language translation application that uses a low quality model would struggle with consumer ... effectively with others who speak different languages. An application that employs a high quality model with a reasonable translation accuracy would garner better consumer support. In this chapter, our ... picked to benchmark learning techniques. It is followed by a short discussion on exchanging model quality and model footprint. An in-depth discussion of data augmentation and distillation follows right after ...

0 points | 56 pages | 18.93 MB | 1 year ago
Amazon AWS AI Services Overview

... memory (memory access bandwidth up to 240 GB/s), and 2,496 parallel processing cores ...
Instance Name | GPU Count | vCPU Count | Memory | Parallel Processing Cores | GPU Memory | Network Performance
p2.xlarge | 1 | 4 | 61 GiB | 2,496 | 12 GiB | High
p2.8xlarge | 8 | 32 | ...
... Understanding Book Flight London Utterances Flight booking London Heathrow Intent / Slot model London Heathrow Origin Destination Departure Date Flight Booking “Book a flight to London” ... Understanding Book Flight London Utterances Flight booking London Heathrow Intent / Slot model London Heathrow Location Location Seattle Origin Destination Departure Date Flight Booking ...

0 points | 56 pages | 4.97 MB | 1 year ago
Dive into Deep Learning (动手学深度学习) v2.0

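... determine the current "best set of parameters", the ones that achieve the best performance on the task according to some performance measure. So what exactly is a parameter? Parameters can be thought of as knobs; turning them adjusts the behavior of the program. A program after any such adjustment of its parameters is called a model. The set of all distinct programs (input-output mappings) that can be produced by manipulating the parameters is called a "family of models". The meta-program that uses a dataset to choose the parameters is called a learning algorithm. Before we start solving problems with machine learning algorithms ... in more detail. 1.2 Key Components of Machine Learning. First, we introduce some core components. Whatever the type of machine learning problem, these components appear: 1. the data we can learn from; 2. a model of how to transform the data; 3. an objective function that quantifies how well the model is doing; 4. an algorithm that adjusts the model's parameters to optimize the objective function. ... b. No matter what means we use to observe the features X and the labels y, a small amount of measurement error may occur. Therefore, even when we are confident that the underlying relationship between features and labels is linear, we add a noise term to account for the effect of measurement error. Before we start searching for the best model parameters w and b, we need two more things: (1) a measure of the model's quality; and (2) a way to update the model to improve the quality of its predictions. Loss function: before we start thinking about how to fit the data with the model ...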

0 points | 797 pages | 29.45 MB | 1 year ago
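The four components listed in the Dive into Deep Learning excerpt above (data, model, objective function, optimization algorithm) can be seen together in a small linear regression sketch. It assumes synthetic data generated as y = w·x + b plus Gaussian noise, a mean squared error objective, and full-batch gradient descent; the true parameters, noise level, learning rate, and step count are illustrative choices, not values from the book.

```python
import random

random.seed(0)

# 1. Data: features x and labels y from a linear relation plus noise.
TRUE_W, TRUE_B = 2.0, -1.0  # illustrative ground-truth parameters
xs = [random.uniform(-1.0, 1.0) for _ in range(200)]
ys = [TRUE_W * x + TRUE_B + random.gauss(0.0, 0.01) for x in xs]


# 2. Model: a linear map with parameters w and b.
def predict(w, b, x):
    return w * x + b


# 3. Objective: mean squared error between predictions and labels.
def mse(w, b):
    return sum((predict(w, b, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# 4. Algorithm: full-batch gradient descent on the objective.
w, b, learning_rate = 0.0, 0.0, 0.1
for _ in range(500):
    errors = [predict(w, b, x) - y for x, y in zip(xs, ys)]
    grad_w = sum(2 * e * x for e, x in zip(errors, xs)) / len(xs)
    grad_b = sum(2 * e for e in errors) / len(xs)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(w, b, mse(w, b))  # w and b should end up close to TRUE_W and TRUE_B
```

Because of the noise term, the fitted parameters land near the true values rather than exactly on them, which is the situation the excerpt describes.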

