AI大模型千问 qwen 中文文档 (Qwen LLM Chinese documentation)
    … AutoTokenizer
    device = "cuda"  # the device to load the model onto
    # Now you do not need to add "trust_remote_code=True"
    model = AutoModelForCausalLM.from_pretrained(
        "Qwen/Qwen1.5-7B-Chat", torch_dtype="auto" …
    … quantization …
    model_path = "your_model_path"
    quant_path = "your_quantized_model_path"
    quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}
    # Load your tokenizer and …
56 pages | 835.78 KB | 1 year ago
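A minimal sketch of what the snippet above appears to cover — loading Qwen1.5-7B-Chat with Hugging Face transformers and quantizing a checkpoint with the AWQ config shown. The transformers loading part follows the standard API; the AutoAWQ calls, placeholder paths, and device_map choice are illustrative assumptions, not text from the listed document.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    device = "cuda"  # the device to load the model onto
    # Recent Qwen releases no longer need trust_remote_code=True
    model = AutoModelForCausalLM.from_pretrained(
        "Qwen/Qwen1.5-7B-Chat", torch_dtype="auto", device_map="auto"
    )
    tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-7B-Chat")

    # AWQ quantization sketch (assumes the AutoAWQ package; API per its usual usage)
    from awq import AutoAWQForCausalLM

    model_path = "your_model_path"              # placeholder paths, as in the snippet
    quant_path = "your_quantized_model_path"
    quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

    awq_model = AutoAWQForCausalLM.from_pretrained(model_path)
    awq_tokenizer = AutoTokenizer.from_pretrained(model_path)
    awq_model.quantize(awq_tokenizer, quant_config=quant_config)  # runs AWQ calibration
    awq_model.save_quantized(quant_path)
    awq_tokenizer.save_pretrained(quant_path)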
【PyTorch深度学习-龙龙老师】-测试版202112 (PyTorch Deep Learning, by Longlong, draft 2021-12)
    … neural networks into the field of reinforcement learning: the DQN algorithm was proposed and reached human-level or better performance on 49 games of the Atari platform; in Go, DeepMind's AlphaGo and AlphaGo Zero successively defeated top human players such as Lee Sedol and Ke Jie; on the multi-agent Dota 2 platform, OpenAI's OpenAI Five beat the TI8 champion team OG in a restricted game setting, demonstrating a great deal of professional-level, high-level play.
    [Figure: timeline of deep-learning milestones, 2006–2019 — DBN (deep belief networks), ImageNet, AlexNet, GAN, VGG/GoogLeNet, ResNet, Batch Normalization, DQN, AlphaGo, AlphaGo Zero, OpenAI Five, Pluribus (Texas hold'em), machine translation]
    … can be trained serially and still give satisfactory results. But deep learning depends heavily on parallel accelerators: most neural networks today are trained on NVIDIA GPUs, Google TPUs, and similar chips. For example, the Go program AlphaGo Zero trained from scratch on 64 GPUs for 40 days before it surpassed every earlier AlphaGo version, and automatic neural-architecture search used 800 GPUs in parallel to find good architectures. At present, ordinary consumer …
439 pages | 29.91 MB | 1 year ago
《Efficient Deep Learning Book》[EDL] Chapter 5 - Advanced Compression Techniques
    … form of pruning is to zero out a certain, say p, percentage of the smallest absolute valued weights in each training epoch. The result of such a training process is p% weights with zero values. Sparse compressed … have fewer connections. Let's do an exercise to convince ourselves that setting parameter values to zero indeed results in a higher compression ratio. Figure 5-1: An illustration of pruning weights (connections) … compress(). The sparsify_smallest() sets the absolute smallest weights in the input weight matrix to zero. The number of the absolute smallest weights is computed based on the sparsity_rate parameter which …
34 pages | 3.18 MB | 1 year ago
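The excerpt describes magnitude pruning — zeroing out the fraction of weights with the smallest absolute values. A minimal NumPy sketch of that idea; the function name sparsify_smallest and the sparsity_rate parameter follow the excerpt, but the implementation itself is an assumption, not the book's code.

    import numpy as np

    def sparsify_smallest(weights: np.ndarray, sparsity_rate: float) -> np.ndarray:
        """Zero out the `sparsity_rate` fraction of weights with the smallest magnitude."""
        w = weights.copy()
        num_to_zero = int(sparsity_rate * w.size)
        if num_to_zero == 0:
            return w
        # Indices of the smallest-|w| entries in the flattened matrix.
        idx = np.argpartition(np.abs(w).ravel(), num_to_zero - 1)[:num_to_zero]
        w.ravel()[idx] = 0.0
        return w

    w = np.random.randn(4, 4)
    print(sparsify_smallest(w, sparsity_rate=0.5))  # half of the entries become exactly 0

The zeroed matrix compresses better because runs of identical values (zeros) are cheap to encode, and sparse storage formats only keep the non-zero entries.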
Lecture 6: Support Vector Machine
    For now, assume that the entire training data are correctly classified by (ω, b): zero loss on the training examples (non-zero loss later). … Margin Hyperplane: ωᵀx … Karush-Kuhn-Tucker (KKT) Conditions: let ω∗ and (α∗, β∗) be any primal and dual optimal points with zero duality gap (i.e., the strong duality holds); the following conditions should be satisfied — Stationarity: … (Contd.) Most αi's in the solution are zero (sparse solution). According to the KKT conditions, for the optimal αi's, αi [1 − y⁽ⁱ⁾(ωᵀx⁽ⁱ⁾ + b)] = 0, so αi is non-zero only if x⁽ⁱ⁾ lies on one of the two …
82 pages | 773.97 KB | 1 year ago
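For reference, a reconstruction in standard notation of the KKT facts the slide fragment cites, assuming the usual hard-margin SVM with constraints y⁽ⁱ⁾(ωᵀx⁽ⁱ⁾ + b) ≥ 1 (a sketch, not copied from the slides):

    \text{Stationarity:}\qquad \omega^{*} = \sum_i \alpha_i^{*}\, y^{(i)} x^{(i)}, \qquad \sum_i \alpha_i^{*}\, y^{(i)} = 0
    \text{Complementary slackness:}\qquad \alpha_i^{*}\,\bigl(1 - y^{(i)}(\omega^{*\top} x^{(i)} + b^{*})\bigr) = 0

Hence α∗ᵢ > 0 forces y⁽ⁱ⁾(ω∗ᵀx⁽ⁱ⁾ + b∗) = 1, i.e., x⁽ⁱ⁾ sits on one of the two margin hyperplanes — which is the sparsity claim quoted in the snippet.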
全连接神经网络实战. pytorch 版 (Fully connected neural networks in practice, PyTorch edition)
    … and loss
        pred = model(X)
        loss = loss_function(pred, y)
        # Backpropagation
        optimizer.zero_grad()   # zero the gradients
        loss.backward()
        optimizer.step()
        if batch % 100 == 0:
            loss, current … shape)
    # weight values follow a normal distribution
    m.weight.data.normal_(0.0, 1)
    # biases set to zero
    m.bias.data.zero_()
    Note: the bias is also a weight; because the current layer's bias connects to every neuron of the next layer, the bias shape equals the number of neurons in the next layer. Usage is simple — call it directly after constructing the network object:
        … (m, nn.Linear):
            m.weight.data.normal_(0.0, 1.0)  # .fill_(0.05)
            m.bias.data.zero_()
        def forward(self, x):
            # x = self.flatten(x)
            logits = self.linear_relu_stack …
29 pages | 1.40 MB | 1 year ago
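A small self-contained sketch of the initialization pattern the garbled snippet appears to show — drawing Linear weights from a normal distribution and zeroing the biases via module.apply. The network shape and layer sizes here are illustrative assumptions.

    import torch
    from torch import nn

    class MLP(nn.Module):
        def __init__(self):
            super().__init__()
            self.linear_relu_stack = nn.Sequential(
                nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10)
            )

        def forward(self, x):
            return self.linear_relu_stack(x)

    def init_weights(m):
        if isinstance(m, nn.Linear):
            m.weight.data.normal_(0.0, 1.0)  # weights ~ N(0, 1)
            m.bias.data.zero_()              # biases start at zero

    model = MLP()
    model.apply(init_weights)  # applies init_weights to every submodule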
pytorch 入门笔记-03- 神经网络 (PyTorch beginner notes 03 — neural networks)
    … -0.0056, -0.0597, 0.0184, -0.0300]], grad_fn=<SliceBackward0>)
    Zero the gradient buffers of all parameters, then backpropagate with random gradients:
        net.zero_grad()
        out.backward(torch.randn(1, 10))
    Note: torch.nn only supports mini-batch input; the whole torch.nn package accepts batches of samples, not single samples. … Existing gradients must be cleared before the call, otherwise the new gradients are accumulated onto them. Now we call loss.backward() and inspect conv1's bias gradients before and after the backward pass:
        net.zero_grad()
        print('conv1.bias.grad before backward')
        print(net.conv1.bias …
    … lr=0.01)
        # training iteration
        optimizer.zero_grad()                 # zero the gradients
        output = net(input)
        loss = criterion(output, target)      # compute the loss
        loss.backward()                       # backpropagate
        optimizer.step()                      # update the parameters
    Note how optimizer.zero_grad() is used to manually set the gradient buffers to zero.
7 pages | 370.53 KB | 1 year ago
动手学深度学习 v2.0 (Dive into Deep Learning v2.0)
    tensor([True, True, True, True])
    Now let us compute another function of x.
        # By default, PyTorch accumulates gradients, so we need to clear the previous values
        x.grad.zero_()
        y = x.sum()
        y.backward()
        x.grad
        tensor([1., 1., 1., 1.])
    2.5.2 Backpropagation for non-scalar variables: when y is not a scalar, the derivative of the vector y with respect to the vector … but rather the sum of the per-example partial derivatives in the batch.
        # Calling backward on a non-scalar requires a gradient argument that specifies the gradient of the differentiated function w.r.t. self.
        # Here we only want the sum of the partial derivatives, so passing a gradient of ones is appropriate.
        x.grad.zero_()
        y = x * x
        # equivalent to y.backward(torch.ones(len(x)))
        y.sum().backward()
        x.grad
        tensor([0., 2., 4 …
    … any information about how y was computed. In other words, the gradient does not flow backward through u to x. Therefore the backward pass below computes the partial derivative of z = u * x with respect to x, treating u as a constant, rather than the partial derivative of z = x * x * x with respect to x.
        x.grad.zero_()
        y = x * x
        u = y.detach()
797 pages | 29.45 MB | 1 year ago
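The fragments above come from a section on automatic differentiation. A short runnable sketch of the same two points — clearing accumulated gradients with x.grad.zero_() and detaching an intermediate value so it is treated as a constant — assembled here for illustration rather than copied from the book:

    import torch

    x = torch.arange(4.0, requires_grad=True)

    y = (x * x).sum()
    y.backward()
    print(x.grad)       # tensor([0., 2., 4., 6.])

    x.grad.zero_()      # gradients accumulate by default, so clear them first
    y = x * x
    u = y.detach()      # u has the same values as y but no gradient history
    z = (u * x).sum()
    z.backward()
    print(x.grad == u)  # all True: dz/dx = u, because u is treated as a constant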
Experiment 1: Linear Regression
    … efficiency. In your program, scale both types of inputs by their standard deviations and set their means to zero. In Matlab/Octave, this can be executed with
        sigma = std(x);
        mu = mean(x);
        x(:,2) = (x …
    …
        % technically, the first J starts at the zero-eth iteration
        % but Matlab/Octave doesn't have a zero index
        figure;
        plot(0:49, J(1:50), '-')
        xlabel …
7 pages | 428.11 KB | 1 year ago
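A NumPy equivalent of the standardization step described above — scale each feature column by its standard deviation and shift its mean to zero. The sample values are made up for illustration; the original assignment works in Matlab/Octave.

    import numpy as np

    # Column 0 is the intercept term; columns 1+ are the raw features.
    x = np.array([[1.0, 2104.0, 3.0],
                  [1.0, 1600.0, 3.0],
                  [1.0, 2400.0, 4.0]])

    mu = x[:, 1:].mean(axis=0)
    sigma = x[:, 1:].std(axis=0)
    x[:, 1:] = (x[:, 1:] - mu) / sigma   # zero mean, unit standard deviation per feature

    print(x[:, 1:].mean(axis=0))         # approximately [0, 0]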
Machine Learning Pytorch Tutorial
    torch.optim.SGD(model.parameters(), lr, momentum=0)
    ● For every batch of data:
      1. Call optimizer.zero_grad() to reset gradients of model parameters.
      2. Call loss.backward() to backpropagate gradients …
    Loop:
        for epoch in range(n_epochs):
            model.train()
            for x, y in tr_set:
                optimizer.zero_grad()
                x, y = x.to(device), y.to(device)
                pred = model(x)
                loss = criterion(pred …
                … step()
    Slide annotations: iterate n_epochs · set model to train mode · iterate through the dataloader · set gradient to zero · move data to device (cpu/cuda) · forward pass (compute output) · compute loss · compute gradient (backpropagation) …
48 pages | 584.86 KB | 1 year ago
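The tutorial snippet lists the canonical PyTorch training loop; below is a compact, self-contained version of the same steps. The toy linear model, loss, and random data are assumptions added so the sketch runs on its own.

    import torch
    from torch import nn
    from torch.utils.data import DataLoader, TensorDataset

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = nn.Linear(8, 1).to(device)
    criterion = nn.MSELoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0)

    tr_set = DataLoader(TensorDataset(torch.randn(64, 8), torch.randn(64, 1)), batch_size=16)

    n_epochs = 3
    for epoch in range(n_epochs):                 # iterate n_epochs
        model.train()                             # set model to train mode
        for x, y in tr_set:                       # iterate through the dataloader
            optimizer.zero_grad()                 # reset gradients of model parameters
            x, y = x.to(device), y.to(device)     # move data to device (cpu/cuda)
            pred = model(x)                       # forward pass (compute output)
            loss = criterion(pred, y)             # compute loss
            loss.backward()                       # compute gradients (backpropagation)
            optimizer.step()                      # update parameters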
《Efficient Deep Learning Book》[EDL] Chapter 3 - Learning Techniques
    … sample is longer, it is truncated. A shorter sample is padded with null words. The null words have zero word vectors. We will further explain the process of transformation of a sentence to a word vector representation …
        # vector of shape BATCH_SIZE x MAX_SEQ_LEN x WORD2VEC_LEN with zero values
        vector = np.zeros(shape=(len(text), MAX_SEQ_LEN, WORD2VEC_LEN))
        # Fill up zero vector with the actual word vectors from the language …
    … probabilities assigned to every class. Hence the soft-ness as compared to the hard labels, which are one and zero for the correct and incorrect classes respectively. ● The predicted probability for the cat image …
56 pages | 18.93 MB | 1 year ago
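A small sketch of the padding scheme the excerpt describes — allocate a zero-filled batch tensor of shape BATCH_SIZE x MAX_SEQ_LEN x WORD2VEC_LEN and copy in whatever word vectors exist, so shorter samples are implicitly padded with zero vectors and longer ones are truncated. The toy vocabulary and random embedding table are assumptions standing in for a real word2vec model.

    import numpy as np

    MAX_SEQ_LEN = 6
    WORD2VEC_LEN = 4
    vocab = {"the": 0, "cat": 1, "sat": 2}
    embeddings = np.random.randn(len(vocab), WORD2VEC_LEN)  # stand-in for a word2vec table

    def vectorize(text):
        # One zero-filled slot per (sample, position); missing/unknown words stay all-zero.
        vector = np.zeros(shape=(len(text), MAX_SEQ_LEN, WORD2VEC_LEN))
        for i, sample in enumerate(text):
            for j, word in enumerate(sample.split()[:MAX_SEQ_LEN]):  # truncate long samples
                if word in vocab:
                    vector[i, j] = embeddings[vocab[word]]
        return vector

    batch = vectorize(["the cat sat", "the cat"])
    print(batch.shape)  # (2, 6, 4); the second sample's trailing rows are zero vectors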
22 results in total