LSTM-Layer 使用 (LSTM layer usage, PyTorch course slides)
nn.LSTM — x: [seq, b, vec]; h/c: [num_layers, b, h]; out: [seq, b, h]. nn.LSTMCell — built with __init__ and stepped via LSTMCell.forward(): ht, ct = lstmcell(xt, [ht_1, ct_1]); xt: [b, vec]; ht/ct: [b, h]. Covers single-layer and two-layer stacks; ends with "next lesson" (下一课时).
11 pages | 643.79 KB | 1 year ago
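A minimal PyTorch sketch of the two APIs the slide compares; the sizes (seq=10, batch=3, vec=100, hidden=20, two layers) are illustrative assumptions, and note that PyTorch's LSTMCell takes the state as a tuple rather than the list shown on the slide:

```python
import torch
import torch.nn as nn

# nn.LSTM: consumes the whole sequence in one call.
lstm = nn.LSTM(input_size=100, hidden_size=20, num_layers=2)
x = torch.randn(10, 3, 100)                      # x: [seq, b, vec]
out, (h, c) = lstm(x)                            # out: [10, 3, 20]; h/c: [2, 3, 20]

# nn.LSTMCell: one time step per call, state managed by the caller.
cell = nn.LSTMCell(input_size=100, hidden_size=20)
ht, ct = torch.zeros(3, 20), torch.zeros(3, 20)  # ht/ct: [b, h]
for xt in x:                                     # xt: [b, vec]
    ht, ct = cell(xt, (ht, ct))
```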
RNN-Layer 使用 (RNN layer usage) — presenter: 龙良曲
Folded model: at each step the cell computes x_t @ W_xh + h_t @ W_hh, with the hidden state initialized to zeros. Shapes: x is [seq len, batch, feature len]; per step, [batch, feature len] @ [hidden len, feature len]^T + [batch, hidden len] @ [hidden len, hidden len]. nn.RNN returns h: [layers, b, h dim] and out: [seq len, b, h dim]. A 2-layer RNN stacks x_t @ W_xh^1 + h_t^1 @ W_hh^1 and h_t^1 @ W_xh^2 + h_t^2 @ W_hh^2, giving out: [T, b, h_dim] and h: [layers, b, h_dim]. Also covers nn.RNNCell.
15 pages | 883.60 KB | 1 year ago
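A short sketch of the same shapes with nn.RNN and nn.RNNCell; feature len = 100, hidden len = 20, and two stacked layers are assumed for illustration:

```python
import torch
import torch.nn as nn

# nn.RNN: h_t = tanh(x_t @ W_xh^T + h_{t-1} @ W_hh^T), applied over the whole sequence.
rnn = nn.RNN(input_size=100, hidden_size=20, num_layers=2)
x = torch.randn(10, 3, 100)      # [seq len, batch, feature len]
out, h = rnn(x)                  # out: [10, 3, 20]  ([T, b, h_dim])
                                 # h:   [2, 3, 20]   ([layers, b, h_dim])

# nn.RNNCell: a single layer stepped manually through time.
cell = nn.RNNCell(input_size=100, hidden_size=20)
ht = torch.zeros(3, 20)
for xt in x:                     # xt: [batch, feature len]
    ht = cell(xt, ht)
```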
《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient Architectures
"…saw earlier the points are linearly separable. We can train a model with a single fully connected layer followed by a softmax activation, since it is a binary classification task. An important caveat is …" … "…fourth step, we train a model which trains the embedding table along with it. We use a single hidden layer network with a softmax classification head for this task. The size of the softmax classification …" … "Step 1: Vocabulary Creation — In this step, we will use a TextVectorization layer from Tensorflow to create a vocabulary of the most relevant words. It finds the top N words in a dataset…"
53 pages | 3.92 MB | 1 year ago
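A hedged sketch of the vocabulary-creation step with TensorFlow's TextVectorization layer; the toy corpus and the max_tokens value are placeholders, not the book's actual dataset:

```python
import tensorflow as tf

# Placeholder corpus; the chapter builds the vocabulary from a real text dataset.
corpus = tf.data.Dataset.from_tensor_slices(
    ["the movie was great", "the plot was thin", "great acting overall"])

# Keep only the top N most frequent words (here N = 1000).
vectorizer = tf.keras.layers.TextVectorization(max_tokens=1000, output_mode="int")
vectorizer.adapt(corpus.batch(2))            # scans the data and builds the vocabulary

print(vectorizer.get_vocabulary()[:10])      # most frequent tokens first
print(vectorizer(["the movie was great"]))   # words mapped to integer ids
```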
机器学习课程-温州大学-13深度学习-Transformer (Machine Learning course, Wenzhou University, Lecture 13: Deep Learning – Transformer)
A sentence entering the encoder first passes through a self-attention layer, which lets the encoder look at the other words of the input sentence while encoding each word. The output of the self-attention layer feeds into a feed-forward network; the feed-forward network applied at each position is identical (it can also be read as a one-dimensional convolution whose window is a single word). The decoder likewise has self-attention and feed-forward layers, plus … Without positional embeddings the Transformer would reduce to a bag-of-words model. The core of the Transformer is the self-attention structure, in which the Q, K, V matrices are obtained by linear transformations of the layer's input. Multi-Head Attention stacks several self-attention heads, capturing attention scores between words along multiple dimensions. Section 4: BERT.
60 pages | 3.51 MB | 1 year ago
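A compact sketch of the scaled dot-product self-attention the slides describe, written in PyTorch; the single-head form and the toy dimensions (d_model = 64) are assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def self_attention(x, Wq, Wk, Wv):
    # Q, K, V are linear transformations of the layer's input x: [batch, seq, d_model]
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    scores = Q @ K.transpose(-2, -1) / Q.size(-1) ** 0.5   # attention score per word pair
    return F.softmax(scores, dim=-1) @ V                   # weighted sum of the values

x = torch.randn(2, 5, 64)                            # 2 sentences, 5 words, d_model = 64
Wq, Wk, Wv = (torch.randn(64, 64) for _ in range(3))
out = self_attention(x, Wq, Wk, Wv)                  # [2, 5, 64]
# Multi-head attention runs several such heads in parallel and concatenates their outputs.
```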
【PyTorch深度学习-龙龙老师】-测试版202112 (PyTorch Deep Learning by 龙龙老师, draft edition 2021-12)
"… -3.2164]]) Here the W and b tensors are both matrices; the code above implements a linear-transformation layer with no activation function. In general, a layer of the form σ(X @ W + b) is called a fully connected layer; in PyTorch it can be implemented directly with the Linear class, and when the activation σ is absent the fully connected layer is also called a linear layer. For example, the Linear class creates a layer with 4 input nodes and 3 output nodes, and through the fully connected layer's …" … "For sentence representation in Natural Language Processing (NLP), e.g. a network that classifies whether a sentence expresses positive sentiment, words are usually encoded by an embedding layer into fixed-length vectors; if 'a' is encoded as a vector of length 3, then two sentences of equal length (5 words each) can be represented as a 3-D tensor of shape [2, 5, 3], where 2 is the number of sentences, 5 the number of words, and 3 …" … "Creating a convolutional layer: layer = nn.Conv2d(3, 16, kernel_size=3); out = layer(x)  # forward pass; out.shape → torch.Size([4, 16, 30, 30]). The convolution kernel tensor W is also 4-D and can be accessed through the weight attribute: layer.weight.shape …"
439 pages | 29.91 MB | 1 year ago
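The same two layers from the snippet, reproduced as a small runnable PyTorch sketch; the batch sizes and the 32×32 image size are assumptions chosen to match the printed output shape [4, 16, 30, 30]:

```python
import torch
import torch.nn as nn

# Fully connected layer: sigma(X @ W + b); Linear with no activation is a linear layer.
fc = nn.Linear(in_features=4, out_features=3)   # 4 input nodes -> 3 output nodes
x = torch.randn(2, 4)
print(fc(x).shape)                              # torch.Size([2, 3])

# Convolutional layer: 3 input channels -> 16 output channels, 3x3 kernel.
layer = nn.Conv2d(3, 16, kernel_size=3)
imgs = torch.randn(4, 3, 32, 32)                # [batch, channels, height, width]
out = layer(imgs)                               # forward pass
print(out.shape)                                # torch.Size([4, 16, 30, 30])
print(layer.weight.shape)                       # 4-D kernel tensor: torch.Size([16, 3, 3, 3])
```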
keras tutorial
Table-of-contents excerpt: "… 11; Multi-Layer Perceptron … 17; Layer … 35; Dense Layer …"
98 pages | 1.57 MB | 1 year ago
《Efficient Deep Learning Book》[EDL] Chapter 7 - Automation
"…algorithm and the training batch size. Other aspects of the training pipeline like data augmentation, layer and channel configurations can also be parameterized using hyperparameters. For example, when using …" … "…additional parameters which could be searched as well. … transformation parameters in the data augmentation layer contribute to performance improvements while others like learning rate, batch size or momentum are …" … "Search — In this exercise, we will train a model with a pair of hyperparameters: layer size and learning rate. The layer size determines the model size and the learning rate is used by the model optimizer."
33 pages | 2.48 MB | 1 year ago
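A minimal grid-search sketch over the two hyperparameters named in the exercise (layer size and learning rate); the Keras model, the candidate values, and the commented-out dataset placeholders are assumptions, not the chapter's actual setup:

```python
import itertools
import tensorflow as tf

layer_sizes = [32, 64, 128]       # controls model size
learning_rates = [1e-3, 1e-2]     # used by the optimizer

def build_model(layer_size, learning_rate):
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(layer_size, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

for layer_size, lr in itertools.product(layer_sizes, learning_rates):
    model = build_model(layer_size, lr)
    # model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=5)
    # keep the (layer_size, lr) pair with the best validation accuracy
```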
Keras: 基于 Python 的深度学习库 (Keras: the Python deep learning library, Chinese documentation)
Contents excerpt: 4.2.3.10 predict_generator … 47; 4.2.3.11 get_layer … 48; 4.3 函数式 API (functional API) …; 4.3.3.10 predict_generator … 56; 4.3.3.11 get_layer … 57; 5 关于 Keras 网络层 (About Keras layers) … 58; 5.1 …
Snippet: "…a 'node', linking the input tensor to the output tensor. When the same layer is called several times, the layer owns multiple node indices (0, 1, 2, …). In earlier versions of Keras you could obtain the output tensor of a layer instance via layer.get_output(), or its output shape via layer.output_shape. You can still do this (except that get_output() has been replaced by the output attribute). But what if a layer is connected to several inputs?"
257 pages | 1.19 MB | 1 year ago
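A small sketch of the situation the snippet asks about — one layer connected to several inputs — using the Keras 2.x node-index API (get_output_at); the shapes are illustrative assumptions:

```python
from tensorflow import keras

# One LSTM layer applied to two different inputs creates two nodes (indices 0 and 1).
shared_lstm = keras.layers.LSTM(32)

a = keras.Input(shape=(10, 64))
b = keras.Input(shape=(10, 64))
out_a = shared_lstm(a)   # node 0
out_b = shared_lstm(b)   # node 1

# With several nodes, `shared_lstm.output` alone is ambiguous and raises an error;
# index the node explicitly instead.
print(shared_lstm.get_output_at(0).shape)   # output tensor of the first call
print(shared_lstm.get_output_at(1).shape)   # output tensor of the second call
```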
Jib Kubecon 2018 Talk
Slide-diagram residue, roughly: a Docker registry stores a set of layers, container configurations, and manifests. A plain build pushes large layers (e.g. a 100 MB and a 50 MB layer) to the registry; with layer caching only the small changed layers (e.g. 9 MB / 1 MB) are sent while the big dependency layers stay cached. "Jib does an optimized build like: FROM gcr.io/distroless/java / COPY target/dependencies /app/dependencies / COPY target/resources …" (github.com/GoogleContainerTools/jib)
90 pages | 2.84 MB | 1 year ago
Machine Learning
(Contd.) • The architecture of feedforward neural networks • Input layer, hidden layers (consisting of hidden units), and output layer. Neural Feedforward Networks (Contd.) • We approximate f*(x) by learning f(x) from the given training data • In the output layer, f(x) ≈ y for each training example, but the behavior of the other layers is not directly specified by the training data • Learning must arrange the intermediate layers so that the right results come out of the output layer, even though the training data do not say what each individual layer should do • The only thing we must provide to the neural network is …
19 pages | 944.40 KB | 1 year ago
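A tiny PyTorch sketch of the architecture the bullets describe — an input layer, hidden layers of hidden units, and an output layer; the sizes are arbitrary assumptions:

```python
import torch.nn as nn

# Feedforward network: the training data only constrain the output f(x) ~ y;
# what each hidden layer computes is learned indirectly via backpropagation.
model = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(),   # hidden layer 1 (64 hidden units)
    nn.Linear(64, 64), nn.ReLU(),   # hidden layer 2
    nn.Linear(64, 1),               # output layer: f(x) ≈ y
)
```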
281 results in total













