PyTorch Release Notes: ‣ Jasper uses modified model architecture hyperparameters; our modifications were made to achieve better hardware usage and to take advantage of Tensor Cores. This model script is available on GitHub. | 365 pages | 2.94 MB | 1 year ago
《Efficient Deep Learning Book》[EDL] Chapter 1 - Introduction: …on efficient deep learning to be categorized in roughly four core areas, with infrastructure and hardware forming the foundation (see Figure 1-7). … Lossy compression techniques allow you to compress data … (…which comprises the core areas and relevant techniques as well as the foundation of infrastructure, hardware and tools.) Let us go over each of these areas individually. Compression Techniques: These are … Training & Inference stages, along with the constituent infrastructure components. Advances in hardware are significantly responsible for the deep learning revolution, specifically the GPU (Graphics Processing… | 21 pages | 3.17 MB | 1 year ago
《Efficient Deep Learning Book》[EDL] Chapter 5 - Advanced Compression Techniques: …here. To get improvement in latency (inference or training), we also need to make sure that the hardware can exploit the sparsity in our network and skip some computation. Most of our computation in deep … improvements by pruning connections such that there is a certain structure to the sparsity. This helps hardware implementations leverage that structure for faster inference. For instance, NVIDIA GPUs rely on … is 50% sparse, the matrix multiplication can be performed in half the time. With the advent of hardware support for sparsity and many industrial and academic use cases reporting significant improvements… | 34 pages | 3.18 MB | 1 year ago
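The Chapter 5 snippet above is truncated where it names the structure NVIDIA GPUs rely on; presumably this refers to the 2:4 structured-sparsity support in Ampere-class Sparse Tensor Cores (at most two nonzero values in every group of four consecutive weights), which is what lets a 50%-sparse matrix multiplication run in roughly half the time. The following is a minimal NumPy sketch of 2:4 magnitude pruning, written only to illustrate the pattern; it is not the book's code and not NVIDIA's kernel.

```python
import numpy as np

def prune_2_to_4(weights: np.ndarray) -> np.ndarray:
    """Zero the 2 smallest-magnitude entries in every group of 4 consecutive
    weights (the 2:4 structured-sparsity pattern). Illustrative sketch only."""
    assert weights.size % 4 == 0, "this sketch assumes a multiple of 4 elements"
    groups = weights.reshape(-1, 4)
    # Indices of the 2 smallest-magnitude weights in each group of 4.
    drop = np.argsort(np.abs(groups), axis=1)[:, :2]
    mask = np.ones_like(groups)
    np.put_along_axis(mask, drop, 0.0, axis=1)  # zero out the dropped positions
    return (groups * mask).reshape(weights.shape)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dense = rng.normal(size=(8, 16)).astype(np.float32)
    sparse = prune_2_to_4(dense)
    print("fraction of zeros:", (sparse == 0).mean())  # -> 0.5
```

The hardware win comes from the fixed structure: only the two surviving values per group plus small position indices need to be stored and multiplied, so the zeros can be skipped entirely.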
《Efficient Deep Learning Book》[EDL] Chapter 2 - Compression Techniques: …complexity. And since they are such a fundamental block of models, they have been optimized both in hardware and software. Let's take a look at how we can optimize a slightly easier version of this operation … require support from the underlying hardware. Moreover, multiplications and divisions are cheaper at lower precisions like b=8 and are well supported by the hardware technologies like the fixed-point SIMD… | 33 pages | 1.96 MB | 1 year ago
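The Chapter 2 snippet above notes that multiplications are cheaper at lower precisions such as b=8. A common way to get there is to map floating-point weights onto 8-bit integers. The sketch below is a minimal affine (scale and zero-point) quantizer in NumPy, given purely as an illustration under that assumption; the book's exact quantization scheme and code may differ.

```python
import numpy as np

def quantize_affine(x: np.ndarray, b: int = 8):
    """Map a float array onto b-bit unsigned integers (b <= 8 here) with an
    affine transform. Returns the codes plus (scale, zero_point) for decoding."""
    assert 1 <= b <= 8, "uint8 storage in this sketch only covers b <= 8"
    qmax = 2 ** b - 1
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / qmax if x_max > x_min else 1.0
    q = np.clip(np.round((x - x_min) / scale), 0, qmax).astype(np.uint8)
    return q, scale, x_min

def dequantize_affine(q: np.ndarray, scale: float, zero_point: float) -> np.ndarray:
    """Recover an approximation of the original floats from the b-bit codes."""
    return q.astype(np.float32) * scale + zero_point

if __name__ == "__main__":
    x = np.random.default_rng(0).normal(size=(4, 4)).astype(np.float32)
    q, scale, zp = quantize_affine(x, b=8)
    x_hat = dequantize_affine(q, scale, zp)
    print("max abs reconstruction error:", float(np.max(np.abs(x - x_hat))))
```

Storing weights as uint8 already gives roughly a 4x size reduction over float32; the latency benefit the snippet alludes to additionally depends on integer or fixed-point SIMD support in the underlying hardware.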
QCon Beijing 2018 - "From Keyboard Input to Neural Networks: Deep Learning at Bloomberg" (李碧野): … K80 … Back to 2018: Heterogeneous Hardware … | 64 pages | 13.45 MB | 1 year ago
Machine Learning Course (Wenzhou University) - 01 Deep Learning: Introduction: …(Tensor Processing Units) Google Cloud TPU. https://cloud.google.com/tpu … NVIDIA V100 (hardware architecture: NVIDIA Volta GPU; memory: 16GB / 32GB), TPU v2 (Google Cloud TPU; 64GB), TPU v3 (Google Cloud TPU; 128GB) | 80 pages | 5.38 MB | 1 year ago
《Efficient Deep Learning Book》[EDL] Chapter 6 - Advanced Learning Techniques - Technical Review: …receiving upgrades to your computer, except that these new upgrades don't make it slower because your hardware is older, but actually make it perform better than it used to. How nifty! In some cases you can… | 31 pages | 4.03 MB | 1 year ago
7 results in total













