动手学深度学习 v2.0

Help on built-in function ones in module torch:

ones(...)
    ones(*size, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) -> Tensor

    Returns a tensor filled with the scalar value 1, with the shape defined by the variable argument size.
    …
    layout (torch.layout, optional): the desired layout of returned Tensor. Default: torch.strided.
    device (torch.device, optional): the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
    requires_grad (bool, optional): If autograd should record operations on the returned tensor. Default: False.

    Example:: …
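For reference, a minimal sketch of the device and dtype arguments described in the help text above; the shapes and dtype choices are arbitrary, and the CUDA branch only runs when a GPU is present:

import torch

# All-ones tensor on the CPU with the default dtype (float32).
a = torch.ones(2, 3)
print(a.device, a.dtype)  # cpu torch.float32

# All-ones tensor created directly on the GPU, if one is available.
if torch.cuda.is_available():
    b = torch.ones(2, 3, device="cuda:0", dtype=torch.float16)
    print(b.device, b.dtype)  # cuda:0 torch.float16
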
AI大模型千问 qwen 中文文档

… the model, using Qwen1.5-7B-Chat as the example:

from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

# Now you do not need to add "trust_remote_code=True"
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen1.5-7B-Chat",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-7B-Chat")

# Instead of using model.chat(), we directly use model.generate().
…
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

# Directly use generate() and tokenizer.decode() to get the output.
# Use `max_new_tokens` to control the maximum output length.
…
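To round out the truncated snippet, a minimal sketch of the generation and decoding step the final comments allude to; it assumes the model, tokenizer, and model_inputs from the excerpt above, and max_new_tokens=512 is an illustrative choice:

generated_ids = model.generate(model_inputs.input_ids, max_new_tokens=512)

# Keep only the newly generated tokens (strip the prompt part of each sequence).
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
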
Machine Learning Pytorch Tutorial

Tensors – Device
● Tensors & modules will be computed with CPU by default. Use .to() to move tensors to the appropriate device.
● CPU: x = x.to('cpu')
● GPU: x = x.to('cuda')

Tensors – Device (GPU) …

model = MyModel().to(device)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), 0.1)
# read data via MyDataset; put dataset into DataLoader;
# construct model and move to device (cpu/cuda)

model.train()
for x, y in tr_set:
    optimizer.zero_grad()
    x, y = x.to(device), y.to(device)
    pred = model(x)
    loss = criterion(pred, y)
    loss.backward()
…
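A self-contained sketch of the device-agnostic training pattern the slides above outline, with a toy model and random data standing in for MyModel and MyDataset (the shapes and learning rate are arbitrary):

import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

model = nn.Linear(8, 1).to(device)                      # toy stand-in for MyModel
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(16, 8)   # one random mini-batch
y = torch.randn(16, 1)

model.train()
optimizer.zero_grad()
x, y = x.to(device), y.to(device)   # the batch must live on the same device as the model
loss = criterion(model(x), y)
loss.backward()
optimizer.step()                    # the update step that follows loss.backward()
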
《Efficient Deep Learning Book》[EDL] Chapter 1 - Introduction

… expenditure on their data centers, hence any efficiency gains are very significant.

Enabling On-Device Deployment
With the advent of smartphones, Internet-of-Things (IoT) devices (refer to Figure 1-5) …

… hence there is a need for on-device ML models (where the model inference happens directly on the device), which makes it imperative to optimize the models for the device they will run on.

Privacy & Data …
… less data collection is required. Similarly, enabling on-device models would imply that the model inference can be run completely on the user's device, without the need to send the input data to the server side.
《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient Architectures

… can be a bottleneck if the model is going to be deployed on-device (smartphones, IoT devices, etc.), where transmitting the model to the device is limited by the user's bandwidth and by the memory available …

… a smaller vocabulary, and see if the resulting quality is within the acceptable parameters. For on-device models, TFLite offers post-training quantization as described in chapter 2. We could also incorporate …

… shape (N, d). pQRNN demonstrated a model 140x smaller than an LSTM with pre-trained embeddings. An on-device friendly implementation of pQRNN is available in the TensorFlow repository here. We learnt about …
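Since the excerpt mentions TFLite post-training quantization only in passing, here is a minimal sketch of what that conversion step usually looks like; the Keras model below is a placeholder, not a model from the book:

import tensorflow as tf

# Placeholder model standing in for whatever model is being shrunk for on-device use.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10),
])

# Post-training (dynamic-range) quantization applied during conversion to TFLite.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("model_quantized.tflite", "wb") as f:
    f.write(tflite_model)
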
【PyTorch深度学习-龙龙老师】-测试版202112

print(n, cpu_a.device, cpu_b.device)
# create the two matrices to be computed on the GPU
gpu_a = torch.randn([1, n]).cuda()
gpu_b = torch.randn([n, 1]).cuda()
print(n, gpu_a.device, gpu_b.device)
Next we implement the CPU and …

print(x)  # print the tensor
print(x.shape, x.device, x.dtype)  # print the shape, device and precision
Out[2]:
tensor([1.0000, 2.0000, 3.3000])
torch.Size([3]), cpu, torch.float32
Here the shape attribute is the shape of the tensor, the device attribute is the name of the device the tensor lives on, and the dtype attribute is its numerical precision; a tensor …

… by default GPU memory is allocated on demand, and the amount currently allocated can be queried with the torch.cuda.memory_allocated function, as follows:
# total memory of GPU 0
t = torch.cuda.get_device_properties(0).total_memory
# reserved (cached) memory
r = torch.cuda.memory_reserved(0)
# currently allocated memory
a = torch.cuda.memory_allocated(0)
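The first snippet above sets up a CPU versus GPU matrix-multiplication comparison; a minimal sketch of one way to time it (not the book's exact code, and n and the repeat count are arbitrary):

import timeit
import torch

n = 10000
cpu_a = torch.randn([1, n])
cpu_b = torch.randn([n, 1])
cpu_time = timeit.timeit(lambda: torch.matmul(cpu_a, cpu_b), number=10)
print(f"cpu: {cpu_time:.6f}s")

if torch.cuda.is_available():
    gpu_a = torch.randn([1, n]).cuda()
    gpu_b = torch.randn([n, 1]).cuda()

    def gpu_run():
        torch.matmul(gpu_a, gpu_b)
        torch.cuda.synchronize()  # wait for the kernel, otherwise the timing is misleading

    gpu_run()  # warm-up run
    gpu_time = timeit.timeit(gpu_run, number=10)
    print(f"gpu: {gpu_time:.6f}s")
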
PyTorch OpenVINO 开发实战系列教程第一篇

for i in range(torch.cuda.device_count()):
    print(torch.cuda.get_device_name(i))
if gpu:
    print(x.cuda())
    y = torch.tensor([1, 2, 3, 4], device="cuda:0")
    print("y:", y)

… 1050 Ti
tensor([[ 2.,  3.,  4., 12.],
        [ 3.,  5.,  8.,  1.]], device='cuda:0')
y: tensor([1, 2, 3, 4], device='cuda:0')

Here x is CPU data by default, while y is created directly as GPU data. The above are some of the most basic and most frequently used PyTorch operations; …
全连接神经网络实战. pytorch 版

… to train the network. First, we define the device used to train the network:

device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(device)
# move the network model onto cuda
model = NeuralNetwork().to(device)
print(model)
If cuda …

… 'model' + str(9) + '.pth'
checkpoint = torch.load(path)
model2 = NeuralNetwork().to(device)
model2.load_state_dict(checkpoint['model'])
optimizer.load_state_dict(checkpoint …

… the bias connects to every neuron in the next layer, so the shape of bias is the number of neurons in the next layer. Calling it is also simple: after constructing the network object, call it directly:

model = NeuralNetwork().to(device)
model.weight_init()

We start training and find that accuracy already reaches 78% after the first epoch, and the final result reaches about 81%, which shows that initializing the weights sensibly is of great importance.
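The middle snippet above restores model and optimizer state from a checkpoint dictionary; for context, a self-contained sketch of the matching save/load pattern. The 'model' key is visible in the excerpt, while the 'optimizer' key, the toy model, and the file name below are assumptions for illustration:

import torch
import torch.nn as nn

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = nn.Linear(4, 2).to(device)                       # toy stand-in for NeuralNetwork()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Save: bundle model and optimizer state into one dictionary
# (key names are assumed, except 'model' which appears in the excerpt).
path = 'model' + str(9) + '.pth'                         # hypothetical file name
torch.save({'model': model.state_dict(),
            'optimizer': optimizer.state_dict()}, path)

# Load: restore both states, mapping tensors onto the current device.
checkpoint = torch.load(path, map_location=device)
model.load_state_dict(checkpoint['model'])
optimizer.load_state_dict(checkpoint['optimizer'])
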
《Efficient Deep Learning Book》[EDL] Chapter 3 - Learning Techniques

… home-automation device. Figure 3-4 shows the high-level workflow of such a device. The model continuously classifies audio signals into one of the four classes, three of which are the keywords that the device will …

… absence of an acceptable keyword in the input signal.

Figure 3-4: Workflow of a home-automation device which detects three spoken words: hello, weather and time. The output is none when none of the three …
《Efficient Deep Learning Book》[EDL] Chapter 2 - Compression Techniques

… shown in table 2-1.

Footprint Metrics
● Model Size
● Inference Latency on Target Device
● Training Time for Convergence
● Peak RAM Consumption

Quality Metrics
● Accuracy
● Precision
● Recall
● F1

… a model is useful if we want to deploy a model in a space-constrained environment like a mobile device. To summarize, compression techniques help to achieve an efficient representation of a layer or …

… or cheques using a deep learning system. We are targeting this system to run on a low-end Android device. The resource limitations are under 50 KB of model size and an upper limit of 1 millisecond per prediction …
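For reference, a minimal sketch of how two of the footprint metrics listed above (model size and inference latency on the target device) might be measured for a PyTorch model; the toy model, file name, batch shape, and repeat counts are placeholders:

import os
import time
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))
model.eval()

# Model size: serialize the weights and check the file size on disk.
torch.save(model.state_dict(), "model.pt")
print(f"model size: {os.path.getsize('model.pt') / 1024:.1f} KB")

# Inference latency: average wall-clock time per prediction on this device.
x = torch.randn(1, 64)
with torch.no_grad():
    for _ in range(10):                       # warm-up runs
        model(x)
    start = time.perf_counter()
    for _ in range(100):
        model(x)
    latency_ms = (time.perf_counter() - start) / 100 * 1000
print(f"latency: {latency_ms:.3f} ms per prediction")
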













