动手学深度学习 v2.0 (Dive into Deep Learning v2.0)

…shared imports used throughout the book:

```python
import time
import zipfile
from collections import defaultdict

import pandas as pd
import requests
from IPython import display
from matplotlib import pyplot as plt
from matplotlib_inline import backend_inline
import torchvision
from PIL import Image
from torch import nn
from torch.nn import functional as F
from torch.utils import data
from torchvision import transforms
```

…on macOS:

```sh
sh Miniconda3-py39_4.12.0-MacOSX-x86_64.sh -b
```

If we are using Linux, assuming the Python version is 3.9 (our tested version), download the bash script whose name contains the string "Linux" and execute the following:

```sh
# The file name may change
sh Miniconda3-py39_4.12.0-Linux-x86_64.sh -b
```

Next, initialize the terminal shell so that we can run conda directly.
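The snippet cuts off before the initialization command itself; a minimal sketch of that step, assuming Miniconda's default ~/miniconda3 install prefix and a bash shell (the book's exact commands and environment name may differ):

```sh
# Make `conda` available in future shell sessions
~/miniconda3/bin/conda init

# After restarting the shell (or running `source ~/.bashrc`),
# an environment can be created, e.g.:
conda create --name d2l python=3.9 -y
```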
《Efficient Deep Learning Book》[EDL] Chapter 6 - Advanced Learning Techniques - Technical Review

…new models to be trained from scratch. For models that share the same domain, it is likely that the first few layers learn similar features. Hence training new models from scratch for these tasks is likely… …the amount of labeled data required is large too. For the second limitation, training large models from scratch for every slightly different task is not efficient either. In many cases we might be limited by… …meaning. However, "The croissant was too sweet" should have a much different representation that is far from both the former sentences. Now notice how such a model would be useful across many different tasks…
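The premise above — that the first few layers of same-domain models learn similar features — is exactly what transfer learning exploits. A minimal Keras sketch of the idea (an assumed example, not the book's code; the backbone choice and the 5-class head are illustrative):

```python
import tensorflow as tf

# Reuse a pretrained feature extractor; its early layers already encode
# generic, domain-shared features.
base = tf.keras.applications.MobileNetV2(
    input_shape=(160, 160, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze the shared layers

# Only the new task-specific head is trained.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(5, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

This sidesteps both limitations the chapter mentions: far less labeled data is needed, and nothing is retrained from scratch except the small head.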
《Efficient Deep Learning Book》[EDL] Chapter 5 - Advanced Compression Techniques

…of precision to match the distribution of the data, which ensures the decoded value deviates less from the original value and can help improve the quality of our models. Did we get you excited yet? Let's…

```python
import gzip
import operator, random
import numpy as np
import tensorflow as tf
from functools import reduce
from matplotlib import pyplot as plt
```

We define two functions sparsify_smallest() and compress()… …that we can try? Let's introduce the concept of saliency scores to abstract the pruning strategies from the pruning process. The saliency scores are the scores assigned to the weights (edges / nodes to…
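A hedged sketch of what such a sparsify_smallest()/compress() pair could look like — magnitude-based pruning followed by gzip; the book's actual implementations may differ:

```python
import gzip
import numpy as np

def sparsify_smallest(weights, fraction):
    """Zero out the given fraction of weights with the smallest magnitude."""
    k = int(weights.size * fraction)
    if k == 0:
        return weights.copy()
    # The k-th smallest absolute value becomes the pruning threshold.
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    return np.where(np.abs(weights) <= threshold, 0.0, weights)

def compress(weights):
    """gzip-compress raw weight bytes; long runs of zeros compress well."""
    return gzip.compress(weights.astype(np.float32).tobytes())

w = np.random.randn(10_000).astype(np.float32)
print(len(compress(w)))                          # dense: nearly incompressible
print(len(compress(sparsify_smallest(w, 0.9))))  # 90% sparse: far smaller
```

Here the saliency score is simply the weight's magnitude; swapping in a different score changes the pruning strategy without touching the pruning mechanics.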
《Efficient Deep Learning Book》[EDL] Chapter 1 - Introduction

…model A, since it generates the prediction faster. Similarly, if you are training a large model from scratch with either limited or costly training resources, developing models that are designed for… …Training… …deploy pareto-optimal models that simply cost fewer resources to train and/or deploy. This means going from the red dots in Figure 3 to the green dots on the pareto-frontier. Having such a toolbox to make… …will primarily focus on efficiency for both training and deploying efficient deep learning models, from large servers to tiny microcontrollers. Let us start building a mental model of efficient deep learning…
《Efficient Deep Learning Book》[EDL] Chapter 7 - Automation

…parameters that influence the process of learning are called hyperparameters to differentiate them from model parameters. The performance of deep learning relies on a set of good hyperparameters. Some of… …assigned to one of the five target classes.

```python
import random
import tensorflow as tf
import numpy as np
from tensorflow.keras import layers, losses, optimizers

X = tf.random.uniform((20, 5))
Y = tf.squeeze(…
```

…hyperparameter trial set is ready. Let's go ahead and train the model, each time choosing one item from the trial set. Each model is trained for 2000 iterations. At the end of a trial, we record the minimum…
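A small sketch of the trial loop the fragment describes (assumed hyperparameters and model; the chapter's exact code differs): train once per hyperparameter combination and record each trial's minimum loss.

```python
import itertools
import tensorflow as tf

learning_rates = [1e-3, 1e-2, 1e-1]
batch_sizes = [4, 8]
trials = list(itertools.product(learning_rates, batch_sizes))

X = tf.random.uniform((20, 5))
Y = tf.random.uniform((20,), maxval=5, dtype=tf.int32)  # five target classes

results = {}
for lr, bs in trials:
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(5,)),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(5, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=lr),
                  loss="sparse_categorical_crossentropy")
    history = model.fit(X, Y, batch_size=bs, epochs=100, verbose=0)
    results[(lr, bs)] = min(history.history["loss"])  # record the minimum loss

print("best (lr, batch_size):", min(results, key=results.get))
```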
《Efficient Deep Learning Book》[EDL] Chapter 3 - Learning Techniques

…right after. Following the lead of the previous chapters, the theory is complemented with programming projects to assist readers in implementing these techniques from scratch. Our journey of learning techniques… …techniques. While data augmentation is concerned with samples and labels, distillation transfers knowledge from a large model or an ensemble of models to smaller models. The obvious question at this point is: why… …All cups have the same basic shape. One possible way to teach a child is to look at the same cup from different angles and rotations, in varying degrees of light. The same process can be repeated for…
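The cup analogy maps directly onto image augmentation: show the model the same sample under random rotations and lighting. A tiny illustrative sketch (assumed Keras preprocessing layers, not the book's project code):

```python
import tensorflow as tf

augment = tf.keras.Sequential([
    tf.keras.layers.RandomRotation(0.2),                         # different angles
    tf.keras.layers.RandomBrightness(0.3, value_range=(0, 1)),   # varying light
    tf.keras.layers.RandomZoom(0.1),                             # slight scale changes
])

image = tf.random.uniform((1, 64, 64, 3))   # stand-in for a photo of a cup
variant = augment(image, training=True)     # a fresh variant on every call
```

The label stays fixed while the sample varies, which is exactly the "samples and labels" side of the augmentation/distillation split described above.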
深度学习与PyTorch入门实战 - 63. 迁移学习-自定义数据集实战 (Deep Learning with PyTorch in Action - 63. Transfer Learning: Custom Dataset in Practice)

Steps:
▪ Load data
▪ Build model
▪ Train and test
▪ Transfer learning

Step 1. Load data — inherit from torch.utils.data.Dataset and implement __len__ and __getitem__ (a custom Dataset). Preprocessing:
▪ Image resize
▪ Data augmentation: rotate, crop
▪ Normalize: mean, std
▪ ToTensor

Step 2. Build model — inherit from the base class and define the forward graph.

Step 3. Train and test.

Step 4. Transfer learning. (References: https://slideplayer… …io/blog/exploring-computer-vision-transfer-learning/)

In conclusion:
▪ Load custom data
▪ Train from scratch
▪ Transfer learning

Next lesson. Thank you.
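A compact sketch of the Step 1 dataset class outlined above (file paths, sizes, and normalization constants are illustrative, not the course's exact values):

```python
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class CustomImageDataset(Dataset):
    """Custom dataset: implements __len__ and __getitem__ as the slides outline."""

    def __init__(self, image_paths, labels):
        self.image_paths = image_paths
        self.labels = labels
        self.transform = transforms.Compose([
            transforms.Resize((224, 224)),                    # image resize
            transforms.RandomRotation(15),                    # augmentation: rotate
            transforms.RandomCrop(224, padding=8),            # augmentation: crop
            transforms.ToTensor(),                            # ToTensor
            transforms.Normalize(mean=[0.485, 0.456, 0.406],  # normalize: mean, std
                                 std=[0.229, 0.224, 0.225]),
        ])

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        image = Image.open(self.image_paths[idx]).convert("RGB")
        return self.transform(image), self.labels[idx]
```

An instance can then be handed to torch.utils.data.DataLoader for batching and shuffling during training.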
《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient Architectures

"Any sufficiently advanced technology is indistinguishable from magic." — Arthur C. Clarke, "Hazards of Prophecy: The Failure of Imagination" (1962)

"Any technology that is distinguishable from magic is insufficiently advanced."

…enabled learning spatial features in the input. Recurrent Neural Nets (RNNs) facilitated learning from sequences and temporal data. These breakthroughs contributed to bigger and bigger models. Although… …a random animal like a chimp. Similarly, we know that we should maintain our distance from a snake, and definitely from a grizzly bear, if we ever accidentally cross paths. We build an associative memory…
PyTorch Release Notes (RN-08516-001_v23.07)

Chapter 2. Pulling A Container

About this task: before you can pull a container from the NGC container registry:
‣ Install Docker.
‣ For NVIDIA DGX™ users, see Preparing to use NVIDIA…

…--shm-size=<requested memory size> in the command line to docker run --gpus all. To pull data and model descriptions from locations outside the container for use by PyTorch, or to save results to locations outside the container…

The CUDA driver's compatibility package only supports particular drivers. Thus, users should upgrade from all R418, R440, R460, and R520 drivers, which are not forward-compatible with CUDA 12.1. For a complete…
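A hedged end-to-end example of the pull-and-run pattern these fragments describe (the 23.07-py3 tag matches this release's version; the host path is illustrative):

```sh
# Pull the PyTorch container from the NGC registry
docker pull nvcr.io/nvidia/pytorch:23.07-py3

# Run with GPU access, a larger shared-memory segment, and a host
# directory mounted so data and results can live outside the container
docker run --gpus all --shm-size=1g -it --rm \
    -v /path/on/host:/workspace/data \
    nvcr.io/nvidia/pytorch:23.07-py3
```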
AI大模型千问 qwen 中文文档 (Qwen large language model — Chinese documentation)

…for inference. Make sure you have transformers>=4.37.0 installed. Below is a very simple code snippet showing how to run a Qwen1.5-Chat model, here with a Qwen1.5-7B-Chat instance:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model
```

…"trust_remote_code=True"…

```python
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen1.5-7B-Chat",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-7B-Chat")

# Instead…
```

```python
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen1.5-7B-Chat",
    torch_dtype="auto",
    device_map="auto",
    attn_implementation="flash_attention_2",
)
```

To work around download problems, we suggest trying ModelScope: simply change the first line of the code above to the following:

```python
from modelscope…
```
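After loading, generation typically follows the standard transformers chat-template workflow; a sketch reusing the model, tokenizer, and device names from the snippet above (the prompt is illustrative):

```python
prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt},
]
# Render the chat into the model's prompt format
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

generated_ids = model.generate(model_inputs.input_ids, max_new_tokens=512)
# Strip the prompt tokens, keeping only the newly generated continuation
generated_ids = [
    output[len(inp):]
    for inp, output in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```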