《Efficient Deep Learning Book》[EDL] Chapter 7 - Automation
…are just a small subset of the available techniques. It is often tedious, even for experts, to decide which ones would work for a given problem. The simplest approach is to try them and see which ones produce the best results. … Hyperparameters like the learning rate, batch size, or momentum are geared towards model convergence; however, they all work in conjunction to produce better models faster. Let's say that we are optimizing the validation loss … than the model with a 20% dropout rate, and achieves a better accuracy as well. Table 7-2 shows a breakdown of the trials for this run. Note that the bracket ids are in reverse order in contrast to the example…
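The mention of "bracket ids" suggests a Hyperband-style search. As a minimal sketch (assuming the KerasTuner library; the chapter's exact tooling and search space are not shown in this excerpt), a dropout and learning-rate search could look like:

    import keras_tuner as kt
    import tensorflow as tf

    def build_model(hp):
        # Search over dropout rate and learning rate (ranges are assumptions).
        model = tf.keras.Sequential([
            tf.keras.layers.Dense(64, activation='relu'),
            tf.keras.layers.Dropout(hp.Float('dropout', 0.1, 0.5, step=0.1)),
            tf.keras.layers.Dense(10, activation='softmax'),
        ])
        model.compile(
            optimizer=tf.keras.optimizers.Adam(
                hp.Choice('learning_rate', [1e-2, 1e-3, 1e-4])),
            loss='sparse_categorical_crossentropy',
            metrics=['accuracy'])
        return model

    # Hyperband runs successive-halving "brackets" of trials.
    tuner = kt.Hyperband(build_model, objective='val_loss',
                         max_epochs=30, factor=3)
    # tuner.search(x_train, y_train, validation_data=(x_val, y_val))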
《Efficient Deep Learning Book》[EDL] Chapter 3 - Learning Techniques
…contrast augmentation, color correction, hue augmentation, saturation, cutout, etc. Figure 3-7 shows a breakdown of the contributions of various transformations to the validation accuracy of a model trained on … print(val_ds.as_numpy_iterator().next()[0].shape) # (264, 264, 3) … Our dataset is ready. Let's work on the model. We use a pre-trained ResNet50 model with the top (softmax) layer replaced with a new one … we have multiple models, which also multiplies our deployment costs. Hinton et al. [18], in their seminal work, explored how smaller student networks can be taught to extract "dark knowledge" from single models or ensembles…
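A minimal sketch of the transfer-learning setup the excerpt describes, a pre-trained ResNet50 with its top layer replaced by a fresh softmax head (the class count and freezing strategy below are assumptions, not the book's exact code):

    import tensorflow as tf

    base = tf.keras.applications.ResNet50(
        include_top=False, weights='imagenet', pooling='avg',
        input_shape=(264, 264, 3))
    base.trainable = False  # train only the new head first
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.Dense(102, activation='softmax'),  # assumed class count
    ])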
《Efficient Deep Learning Book》[EDL] Chapter 5 - Advanced Compression Techniques
…[weights in a] large network could be safely removed with minimal performance deterioration. A random removal could work for removing a few weights. However, when pruning a large number of weights, say 60%, we risk … the matrix multiplication anyway. Structured sparsity, as the name suggests, incorporates some sort of structure into the process of pruning. One way to do this is by pruning blocks of weights together (block sparsity) … weights (the Train Weights step) until the model achieves its best performance. Frankle et al.'s work [9] on the Lottery Ticket Hypothesis took a different look at pruning. They postulated that within every…
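As a hedged sketch of pruning to a 60% sparsity target, using the tensorflow_model_optimization toolkit's magnitude pruning (the schedule values and layer sizes are assumptions; the chapter may use different settings):

    import tensorflow as tf
    import tensorflow_model_optimization as tfmot

    schedule = tfmot.sparsity.keras.PolynomialDecay(
        initial_sparsity=0.0, final_sparsity=0.6,   # remove 60% of weights
        begin_step=0, end_step=1000)
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(256, activation='relu', input_shape=(784,)),
        tf.keras.layers.Dense(10, activation='softmax'),
    ])
    pruned = tfmot.sparsity.keras.prune_low_magnitude(
        model, pruning_schedule=schedule)
    # Training requires the pruning callback:
    # pruned.fit(..., callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])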
深度学习下的图像视频处理技术 (Deep Learning for Image and Video Processing) - 沈小勇
…prior work on photo enhancement: …: [SIGGRAPH 17], White-Box: [ACM TOG 18], Distort-and-Recover: [CVPR 18], DPE: [CVPR 18]. (Comparison figure over: Input, WVM [CVPR'16], JieP [ICCV'17], HDRNet [SIGGRAPH'17], DPE [CVPR'18], White-Box [TOG'18], Distort-and-Recover.) … Prior work on video super-resolution: DESR [Liao et al., 2015], VSRNet [Kappeler et al., 2016], [Caballero et al., 2016], etc. Remaining challenge for effectiveness: how to make good use of multiple frames? (Example data from the Vid4 set.) … The image deblurring problem (图像去模糊问题): different blur assumptions; uniform blur: [Fergus et al., 2006], [Shan et al., 2009], [Cho et al., 2009], [Xu et al., 2010], etc. (Example data from [Xu et al., 2010].)…
《Efficient Deep Learning Book》[EDL] Chapter 2 - Compression Techniques
…v/s -5.0)? If we can tolerate some loss of precision, can we use b bits and save some space? Let us work out a scheme for going from this higher-precision domain (32 bits) to a quantized domain (b-bit values). … (Figure 2-4: mapping from a high-precision to a low-precision domain.) Visually inspecting figure 2-4, can you work out the formula for mapping a given floating-point value (x) to a quantized value (x_q)? Assume that … given x. … Logistics: we just wanted to take a moment to state that in this book we have chosen to work with TensorFlow 2.0 (TF) because it has exhaustive support for building and deploying efficient models…
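One standard answer to the excerpt's question is the affine (min-max) quantization map; a minimal sketch, assuming the float range [x_min, x_max] is split uniformly over 2^b levels (my derivation, not necessarily the book's exact formula):

    import numpy as np

    def quantize(x, x_min, x_max, b=8):
        # scale = width of one quantization bucket
        scale = (x_max - x_min) / (2 ** b - 1)
        return np.clip(np.round((x - x_min) / scale),
                       0, 2 ** b - 1).astype(np.int32)

    def dequantize(x_q, x_min, x_max, b=8):
        scale = (x_max - x_min) / (2 ** b - 1)
        return x_q * scale + x_min

    x = np.array([-5.0, -1.3, 0.0, 2.5, 5.0])
    print(dequantize(quantize(x, -5.0, 5.0), -5.0, 5.0))  # close to x, up to rounding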
《Efficient Deep Learning Book》[EDL] Chapter 6 - Advanced Learning Techniques
…relationships between inputs. In such pretext tasks, the model typically pretends that a part/structure of the input is missing and learns to predict the missing bit. It is similar to solving an almost … In this project we will demonstrate that self-supervised models provide both those efficiency gains. We will work on the AGNews dataset (the same one we used in chapter 4) for text classification using a pre-trained … Keras, and this guide for some optimizations to make it efficient. Also consider going through the work by Izsak et al. [11], which presents a collection of tweaks to achieve BERT-like quality but with a budget…
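A minimal sketch of the masked-prediction pretext idea described above, in the spirit of BERT-style masked language modeling (the mask token id and masking rate are illustrative assumptions):

    import numpy as np

    MASK_ID = 103  # e.g., BERT's [MASK] token id; an assumption here

    def mask_tokens(token_ids, mask_prob=0.15, seed=0):
        # Hide a random 15% of tokens; the model must predict the originals.
        rng = np.random.default_rng(seed)
        token_ids = np.array(token_ids)
        labels = np.full_like(token_ids, -100)   # -100 = ignored by the loss
        mask = rng.random(token_ids.shape) < mask_prob
        labels[mask] = token_ids[mask]           # targets: the hidden tokens
        token_ids[mask] = MASK_ID                # inputs: [MASK] placeholders
        return token_ids, labels

    inputs, targets = mask_tokens([2023, 2003, 1037, 7953, 6251, 2005, 17662])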
《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient Architectures
…Embeddings form a crucial part of modern deep-learning models, and we are excited to explain how they work. In the following section we will explain them through a toy example, but feel free to jump ahead … the high-dimensional representation. It is useful because it is often computationally infeasible to work with data that has a large number of features; however, not all features might be equally important … on what we thought was reasonable. The purpose of this toy example is to illustrate how embeddings work, and we encourage you to try to construct your own example to understand it better. …
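As a hedged sketch of the idea, a Keras Embedding layer maps sparse token ids into a small dense vector space (the vocabulary and embedding sizes below are illustrative assumptions):

    import tensorflow as tf

    embedding = tf.keras.layers.Embedding(input_dim=10000, output_dim=64)
    token_ids = tf.constant([[12, 507, 9001]])  # one batch of 3 token ids
    vectors = embedding(token_ids)              # dense output, shape (1, 3, 64)
    print(vectors.shape)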
Lecture 1: Overviewthe mathematical theories behind the machine learning algorithm. Practice what you have learned. Work hard! Feng Li (SDU) Overview September 6, 2023 7 / 57 What is Machine Learning ? A computer program aspects of the data Examples: Discovering clusters Discovering latent factor Discovering graph structure Matrix completion Feng Li (SDU) Overview September 6, 2023 28 / 57 Unsupervised Learning: Discovering0 码力 | 57 页 | 2.41 MB | 1 年前3
PyTorch Tutorialmany layers as Torch. • It includes lot of loss functions. • It allows building networks whose structure is dependent on computation itself. • NLP: account for variable length sentences. Instead of padding the sentence’s length. PyTorch • Fundamental Concepts of PyTorch • Tensors • Autograd • Modular structure • Models / Layers • Datasets • Dataloader • Visualization Tools like • TensorboardX (monitor training)0 码力 | 38 页 | 4.09 MB | 1 年前3
keras tutoriallibraries but difficult to understand for creating neural networks. Keras is based on minimal structure that provides a clean and easy way to create deep learning models based on TensorFlow or Theano It supports the following features: Consistent, simple and extensible API. Minimal structure - easy to achieve the result without any frills. It supports multiple platforms and backends chapter. Introduction A Keras layer requires shape of the input (input_shape) to understand the structure of the input data, initializer to set the weight for each input and finally activators to transform0 码力 | 98 页 | 1.57 MB | 1 年前3
共 21 条













