《Efficient Deep Learning Book》[EDL] Chapter 2 - Compression Techniques
technique depends on several factors like customer preference, consumption delay, or resource availability (extra hands needed for chopping). Personally, I like full apples. Let’s move on from apples some information, or do not necessarily care about the loss in quality. Figure 2-2: On the left is a high quality image of a cat. The cat on the right is a lower quality compressed image. Source Both the converting high precision continuous values to low precision discrete values. Take a look at figure 2-3. It shows a sine wave and an overlapped quantized sine wave. The sine wave is continuous, a high precision
0 points | 33 pages | 1.96 MB | 1 year ago | 3
《Efficient Deep Learning Book》[EDL] Chapter 1 - Introduction
predict the probability based on your behavior and currently trending content, the model will assign a high probability to Seinfeld. While there is no way of predicting with absolute certainty the exact content gradient (if any), and when there are a large number of layers the gradient essentially vanishes. Availability of labelled data Even if one has enough compute, and sophisticated algorithms, solving classical inference latency. Figure 1-8: An illustration of the quantization process: mapping of continuous high-precision values to discrete fixed-point integer values. Another example is Pruning (see Figure 1-9)
0 points | 21 pages | 3.17 MB | 1 year ago | 3
《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient Architectures
input numerically. It must fulfill the following goals: a) To compress the information content of high-dimensional concepts such as text, image, audio, video, etc. to a low-dimensional representation 1 Dimensionality reduction is the process of transforming high-dimensional data into low-dimension, while retaining the properties from the high-dimensional representation. It is useful because it is often 1.0. We manually picked these values for illustration. Going through table 4-1, cat and dog have high values for the ‘cute’ feature, and low values for the ‘dangerous’ feature. On the other hand, a snake
0 points | 53 pages | 3.92 MB | 1 year ago | 3
Lecture 1: Overview
/ 57 Example 2: Autonomous Driving-ALVINN A predecessor of Google car drives 70 mph on a public highway 30x32 weights into one out of four hidden unit 30 outputs for steering 4 hidden units 30x32 57 Unsupervised Learning: Discovering Latent Factors Dimensionality reduction When dealing with high dimensional data, it is often useful to reduce the dimensionality by projecting the data to a lower the “essence” of the data. The motivation behind this technique is that although the data may appear high dimensional, there may only be a small number of degrees of variability, corresponding to latent factors
0 points | 57 pages | 2.41 MB | 1 year ago | 3
《Efficient Deep Learning Book》[EDL] Chapter 3 - Learning Techniques
them communicate effectively with others who speak different languages. An application that employs a high quality model with a reasonable translation accuracy would garner better consumer support. In this this chapter, our focus will be on the techniques that enable us to achieve our quality goals. High quality models have an additional benefit in footprint constrained environments like mobile and edge devices distinct examples of objects (labels) you must show a child before they can learn to identify them with high accuracy. All cups have the same basic shape. One possible way to teach a child is to look at the
0 points | 56 pages | 18.93 MB | 1 year ago | 3
《Efficient Deep Learning Book》[EDL] Chapter 6 - Advanced Learning Techniques - Technical Review
model for a new task: 1. Data Efficiency: It relies heavily on labeled data, and hence achieving a high performance on a new task requires a large number of labels. 2. Compute Efficiency: Training for this fine-tuning stage is not being used for learning rudimentary features, but rather how to map the high-level representations it learned in the pretraining stage to solving our new task. Thus, the number API10 to build their own applications. Given the large number of possible uses for such models, the high costs of pre-training get spread over the number of applications using it. Project: Using Pre-trained
0 points | 31 pages | 4.03 MB | 1 year ago | 3
Lecture Notes on Support Vector Machine
feature space can be expensive (e.g., we have to store all the high-dimensional images of the data samples and computing inner products in the high-dimensional feature space is of considerable overhead). Fortunately help of kernels, the mapping does not have to be explicitly computed, and computations in the new high-dimensional feature space remain efficient. 9 (a) (b) Figure 5: Feature mapping for 2-dimensional • Sigmoid Kernel K(x, z) = tanh(α x^T z + c) Overall, kernel K(x, z) represents a dot product in some high-dimensional feature space F K(x, z) = (x^T z)^2 or (1 + x^T z)^2 Any learning algorithm in which data
0 points | 18 pages | 509.37 KB | 1 year ago | 3
Lecture 3: Logistic Regression
for pos. examples, and negative for neg. examples) High positive score: High probability of label 1 High negative score: Low probability of label 1 (high prob. of label 0) Feng Li (SDU) Logistic Regression
0 points | 29 pages | 660.51 KB | 1 year ago | 3
Lecture 6: Support Vector Machine
when the new space is very high dimensional Storing and using these mappings in later computations can be expensive (e.g., we may have to compute inner products in a very high dimensional space) Using the with the mapped features remain efficient Feng Li (SDU) SVM December 28, 2021 46 / 82 Kernels as High Dimensional Feature Mapping Let’s assume we are given a function K (kernel) that takes as inputs Kernels can turn a linear model into a nonlinear one Kernel K(x, z) represents a dot product in some high dimensional feature space F K(x, z) = (x^T z)^2 or (1 + x^T z)^2 Any learning algorithm in which examples
0 points | 82 pages | 773.97 KB | 1 year ago | 3
《Efficient Deep Learning Book》[EDL] Chapter 7 - Automation
hyperparameters and . The blue contours mark the positive results while the red ones indicate the trials with high losses. The density of trials is identical in both the regions which indicates that the search doesn't The next section dives into the search for neural architectures. Neural Architecture Search On a high level, Neural Architecture Search (NAS) is similar to Hyperparameter Search. In both cases, we search based on the performance of the child network which tunes it to search for cells that result in a high performance child network. NASNet has a much refined search space because it is predicting fewer overall
0 points | 33 pages | 2.48 MB | 1 year ago | 3
27 results in total