《Efficient Deep Learning Book》[EDL] Chapter 3 - Learning Techniques
... briefly introduced learning techniques such as regularization, dropout, data augmentation, and distillation to improve quality. These techniques can boost metrics like accuracy, precision, and recall, which are often our primary quality concerns. We have chosen two of them, namely data augmentation and distillation, to discuss in this chapter. This is because, firstly, regularization and dropout are fairly straightforward to enable in any modern deep learning framework. Secondly, data augmentation and distillation can bring significant efficiency gains during the training phase, which is the focus of this chapter.
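To illustrate the first point, here is a minimal sketch (my own illustration, not code from the book) of enabling L2 weight regularization and dropout in Keras; the layer sizes, dropout rate, and penalty strength are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

# A small classifier in which L2 regularization and dropout are enabled
# via a keyword argument and a drop-in layer, respectively.
# All sizes and rates below are illustrative, not values from the book.
model = tf.keras.Sequential([
    layers.Input(shape=(784,)),
    layers.Dense(256, activation='relu',
                 kernel_regularizer=regularizers.l2(1e-4)),  # L2 penalty on this layer's weights
    layers.Dropout(0.3),                                      # randomly zero 30% of activations during training
    layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```

Because both are one-line changes, they are usually the first techniques to try before the heavier-weight methods such as augmentation and distillation.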
《Efficient Deep Learning Book》[EDL] Chapter 6 - Advanced Learning Techniques - Technical Review
... To recap, learning techniques can help us meet our model quality goals. Techniques like distillation and data augmentation improve the model quality without increasing the footprint of the model. ... training compute budget, so this approach is a non-starter. While techniques like data augmentation and distillation, as introduced in Chapter 3, do help us achieve better quality with fewer labels and fewer ... techniques like distillation might not be as helpful in certain settings. Subclass distillation, covered in the next subsection, can help us in some of these cases. Let's find out how. Subclass Distillation: It can also ...
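To make the "better quality with fewer labels" point concrete, below is a minimal sketch of label-preserving image augmentation in Keras (my own illustration, not the book's code); the specific transforms and their strengths are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Label-preserving transforms: each epoch the model sees slightly different
# versions of the same labeled images, which acts like additional training data.
# The chosen transforms and their parameters are illustrative assumptions.
augment = tf.keras.Sequential([
    layers.RandomFlip('horizontal'),
    layers.RandomRotation(0.1),   # up to ±10% of a full turn (~36 degrees)
    layers.RandomZoom(0.1),
    layers.RandomContrast(0.2),
])

def preprocess(images, labels, training=True):
    # Apply augmentation only on the training path, never at evaluation time.
    if training:
        images = augment(images, training=True)
    return images, labels
```

Applied inside a tf.data input pipeline, this multiplies the effective diversity of a small labeled set without collecting any new labels.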
《Efficient Deep Learning Book》[EDL] Chapter 1 - Introduction
... efficient model by trimming the number of parameters if needed. An example of a learning technique is distillation (see Figure 1-10), which helps a smaller, deployable model (the student) learn from a larger pre-trained teacher model ... of probabilities for each of the possible classes according to the teacher model. Figure 1-10: Distillation of a smaller student model from a larger pre-trained teacher model. Both the teacher's weights ... way. In the original paper that proposed distillation, Hinton et al. replicated the performance of an ensemble of 10 models with a single model trained using distillation. For vision datasets like CIFAR-10, an accuracy ...
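As a concrete illustration of the teacher/student setup described above, here is a minimal sketch of the distillation loss from Hinton et al. (a generic formulation written for this note, not the book's code); the temperature and the weighting between soft and hard losses are illustrative assumptions.

```python
import tensorflow as tf

def distillation_loss(teacher_logits, student_logits, labels,
                      temperature=4.0, alpha=0.1):
    """Blend the soft-label (teacher) loss with the hard-label (ground truth) loss."""
    # Temperature-softened probabilities expose the teacher's knowledge about
    # how similar the classes are to one another.
    soft_teacher = tf.nn.softmax(teacher_logits / temperature)
    log_soft_student = tf.nn.log_softmax(student_logits / temperature)

    # Cross-entropy between the soft distributions (equivalent to KL divergence
    # up to a constant), scaled by T^2 as suggested by Hinton et al. so its
    # gradients stay comparable to the hard-label term.
    soft_loss = -tf.reduce_mean(
        tf.reduce_sum(soft_teacher * log_soft_student, axis=-1)) * temperature ** 2

    # Ordinary cross-entropy against the true labels.
    hard_loss = tf.reduce_mean(
        tf.keras.losses.sparse_categorical_crossentropy(
            labels, student_logits, from_logits=True))

    return alpha * hard_loss + (1.0 - alpha) * soft_loss
```

During training only the student's weights are updated; the teacher is kept fixed and only supplies the soft targets.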
《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient Architectures
... model using data augmentation to achieve higher performance, and subsequently apply compression or distillation to further reduce its footprint. With this chapter, we hope to have set the stage for your exploration ... your deep learning projects. They can often be combined with other approaches that we have already learned, like quantization, distillation, and data augmentation. In the next chapter we will explore some more advanced ...
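As one example of the "further reduce its footprint" step, here is a minimal sketch of post-training quantization with the TensorFlow Lite converter (my illustration, not the book's code); `trained_model` is assumed to be a Keras model that has already been trained, possibly with augmentation or distillation.

```python
import tensorflow as tf

def quantize_to_tflite(trained_model, output_path='model_quantized.tflite'):
    # Convert the trained Keras model to TensorFlow Lite with the default
    # optimizations, which apply post-training quantization to the weights
    # (typically shrinking a float32 model to roughly a quarter of its size).
    converter = tf.lite.TFLiteConverter.from_keras_model(trained_model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    tflite_model = converter.convert()
    with open(output_path, 'wb') as f:
        f.write(tflite_model)
    return output_path
```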
《动手学深度学习 v2.0》 (Dive into Deep Learning v2.0)
... Socher, R. (2018). A closer look at deep learning heuristics: learning rate restarts, warmup and distillation. arXiv preprint arXiv:1810.13243. [Graves, 2013] Graves, A. (2013). Generating sequences with ...













