PyTorch Release Notes
...language representations which obtains state-of-the-art results on a wide array of Natural Language Processing (NLP) tasks. This model is based on the BERT: Pre-training of Deep Bidirectional Transformers...
365 pages | 2.94 MB | 1 year ago
《Efficient Deep Learning Book》[EDL] Chapter 1 - Introduction
...number-crunching at the heart of deep learning. AlexNet [1] was one of the earliest models to rely on Graphics Processing Units (GPUs) for training, which could do linear algebra operations such as multiplying two matrices together... ...models over time. (Data Source) We have seen a similar effect in the world of Natural Language Processing (NLP) (see Figure 1-2), where the Transformer architecture significantly beat previous benchmarks...
[1] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." Advances in Neural Information Processing Systems 25 (2012): 1097-1105.
21 pages | 3.17 MB | 1 year ago
keras tutorial
...the algorithm that best fits the type of learning process (e.g., image classification, text processing) and the available input data. An algorithm is represented by a Model in Keras. Keras provides preprocessing modules for the data preparation phase of machine learning (a sketch of the first and last follows this entry):
- Text processing: functions to convert text into NumPy arrays suitable for machine learning.
- Image processing: functions to convert images into NumPy arrays suitable for machine learning.
- Sequence processing: functions to generate time-based data from the given input data.
98 pages | 1.57 MB | 1 year ago
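The text- and sequence-processing utilities mentioned above can be combined to turn raw strings into fixed-length integer arrays. A minimal sketch, assuming TensorFlow's bundled Keras; the tokenizer settings and sample sentences are illustrative, not taken from the tutorial:

    # Convert raw text to padded integer sequences for model input.
    from tensorflow.keras.preprocessing.text import Tokenizer
    from tensorflow.keras.preprocessing.sequence import pad_sequences

    sentences = ["deep learning is fun", "keras makes deep learning simple"]

    tokenizer = Tokenizer(num_words=1000, oov_token="<unk>")
    tokenizer.fit_on_texts(sentences)               # build the vocabulary
    seqs = tokenizer.texts_to_sequences(sentences)  # words -> integer ids
    padded = pad_sequences(seqs, maxlen=8, padding="post")  # NumPy array, shape (2, 8)
    print(padded)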
《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient Architectures
..."Big self-supervised models are strong semi-supervised learners." Advances in Neural Information Processing Systems 33 (2020): 22243-22255. [17] A head is a trainable sub-network that takes in the output of the network. ...The image on the left shows a recurrent cell processing the input sequence element at time step t; the image on the right shows the processing of the entire input sequence across n time steps... [22] Vaswani, Ashish, et al. "Attention is all you need." Advances in Neural Information Processing Systems 30 (2017). Mathematically, we are given a pair of sequences with shapes (n, d) and...
53 pages | 3.92 MB | 1 year ago
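The truncated last sentence above introduces the attention computation over a pair of sequences. A minimal NumPy sketch of scaled dot-product attention in the spirit of Vaswani et al.; the second sequence's shape (m, d) is an assumption, since the snippet is cut off:

    # Scaled dot-product attention: each of the n query positions
    # attends over the m key/value positions.
    import numpy as np

    n, m, d = 4, 6, 8
    rng = np.random.default_rng(0)
    Q = rng.normal(size=(n, d))   # queries, shape (n, d)
    K = rng.normal(size=(m, d))   # keys,    shape (m, d)
    V = rng.normal(size=(m, d))   # values,  shape (m, d)

    scores = Q @ K.T / np.sqrt(d)                        # (n, m) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # row-wise softmax
    out = weights @ V                                    # (n, d) attended output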
《Efficient Deep Learning Book》[EDL] Chapter 5 - Advanced Compression Techniques
...LeCun, Yann, John Denker, and Sara Solla. "Optimal brain damage." Advances in Neural Information Processing Systems 2 (1989). As you can deduce, the parameter changes the influence of the previous value... "Deconstructing lottery tickets: Zeros, signs, and the supermask." Advances in Neural Information Processing Systems 32 (2019). [10] Liu, Zhuang, et al. "Rethinking the value of network pruning." arXiv preprint... "Learning both weights and connections for efficient neural network." Advances in Neural Information Processing Systems 28 (2015). [7] Dettmers, Tim, and Luke Zettlemoyer. "Sparse networks from scratch: Faster..."
34 pages | 3.18 MB | 1 year ago
《Efficient Deep Learning Book》[EDL] Chapter 2 - Compression Techniques
...a compression technique that has been used across different parts of computer science, especially in signal processing. It is a process of converting high-precision continuous values to low-precision discrete values... [5] Hubara, Itay, et al. "Binarized neural networks." Advances in Neural Information Processing Systems 29 (2016). [4] Rastegari, Mohammad, et al. "XNOR-Net: ImageNet classification using binary..." ...Figure 2-11: A visualization of 100 samples from the MNIST dataset. Loading and Processing the MNIST Dataset: before we start, the code is available as a Jupyter notebook here. Now let's...
33 pages | 1.96 MB | 1 year ago
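The snippet above defines quantization as mapping high-precision continuous values onto low-precision discrete ones. A minimal NumPy sketch of uniform (affine) quantization; the bit width and input tensor are illustrative, and this is one common scheme rather than necessarily the book's exact recipe:

    # Uniform affine quantization of a float tensor to b-bit integers.
    import numpy as np

    def quantize(x, bits=8):
        qmin, qmax = 0, 2**bits - 1
        scale = (x.max() - x.min()) / (qmax - qmin)      # float step per integer level
        zero_point = int(qmin - np.round(x.min() / scale))  # integer mapped to 0.0
        q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
        return q, scale, zero_point

    def dequantize(q, scale, zero_point):
        return (q.astype(np.float32) - zero_point) * scale

    x = np.random.randn(5).astype(np.float32)
    q, s, z = quantize(x)
    print(x, dequantize(q, s, z))   # reconstruction error is at most ~scale/2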
《Efficient Deep Learning Book》[EDL] Chapter 3 - Learning Techniques
..."...perturbations." arXiv preprint arXiv:1903.12261 (2019). [11] Hendrycks, Dan, et al. "AugMix: A simple data processing method to improve robustness and uncertainty." arXiv preprint arXiv:1912.02781 (2019). [17] Goodfellow, Ian, et al. "Generative adversarial nets." Advances in Neural Information Processing Systems 27 (2014). [16] Chawla, Nitesh V., et al. "SMOTE: synthetic minority over-sampling technique..." ...We will install the pydub dependency required by the tensorflow_datasets package for processing audio data, and load the speech_commands dataset from TFDS: !pip install pydub ... data_ds = tfds...
56 pages | 18.93 MB | 1 year ago
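The code in the snippet is cut off after "data_ds = tfds". A minimal sketch of how such a load typically continues with tensorflow_datasets; the split name and with_info flag are assumptions, not taken from the chapter:

    # Load the speech_commands audio dataset from TensorFlow Datasets.
    # pydub must be installed first: pip install pydub
    import tensorflow_datasets as tfds

    data_ds, info = tfds.load(
        "speech_commands",   # short spoken-word clips with integer labels
        split="train",
        with_info=True,
    )
    print(info.features)     # audio waveform + label
    for example in data_ds.take(1):
        print(example["audio"].shape, example["label"])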
李东亮: Deep Learning Models and Applications for Cloud Image Technology (slide deck, SACC 2017)
...Three core challenges of image technology: small, fast, accurate — compact models, fast online inference, accurate prediction — under frequent remote upgrades and CPU-constrained, real-time cloud processing. Visual perception model: segmentation via stacked forward blocks of convolution and deconvolution layers. Model / data / engineering; model reduction and architecture evolution; single-scale vs. multi-scale convolution kernels...
26 pages | 3.69 MB | 1 year ago
动手学深度学习 v2.0 (Dive into Deep Learning v2.0)
...passing data through many expensive linear-algebra layers. This is also why, from the 1990s through the early 2000s, simple algorithms optimizing convex objectives were researchers' first choice. What changed this landscape was training neural networks on GPUs. Graphics Processing Units (GPUs) were originally used to accelerate graphics rendering, to the benefit of computer gamers. GPUs are optimized for high-throughput 4 × 4 matrix and vector multiplications in service of basic graphics tasks; fortunately, these operations are strikingly similar to the computations in convolutional layers... ...optimize GPUs, and even market them as general-purpose GPUs (GPGPUs). So where exactly do GPUs beat CPUs? First, consider the cores of the Central Processing Unit (CPU). Each CPU core runs at a high clock frequency and has up to several MB of L3 cache; cores are well suited to executing a wide variety of instructions, with branch predictors, deep pipelines, and other features that let the CPU... ...storage whose capacity and speed can be provisioned dynamically according to user needs; users are advised to raise the provisioned IOPS when latency is too high (for example, when training involves many small records). 12.4.4 CPU: The central processing unit (CPU) is the heart of any computer. It consists of many key components: processor cores that execute machine code, and a bus that connects the different components (note that the bus depends on the processor model...)
797 pages | 29.45 MB | 1 year ago
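The excerpt above contrasts CPU and GPU throughput on the matrix multiplications that dominate deep learning. A minimal PyTorch sketch of moving that work onto a GPU when one is available; the matrix sizes are illustrative:

    # Run a large matrix multiplication on the GPU if one is present.
    import torch

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    a = torch.randn(4096, 4096, device=device)
    b = torch.randn(4096, 4096, device=device)

    c = a @ b                       # dispatched to the GPU's BLAS kernels
    if device.type == "cuda":
        torch.cuda.synchronize()    # wait for the asynchronous kernel to finish
    print(c.shape, device)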
Machine Learning Pytorch Tutorial
...model.load_state_dict(ckpt) — More About PyTorch:
- torchaudio: speech/audio processing
- torchtext: natural language processing
- torchvision: computer vision
- skorch: scikit-learn + PyTorch
48 pages | 584.86 KB | 1 year ago
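The load_state_dict call in the snippet is the restore half of PyTorch's checkpointing pattern. A minimal sketch of both halves; the file name and toy architecture are illustrative, not from the tutorial:

    # Save and restore model weights via a state dict.
    import torch
    import torch.nn as nn

    model = nn.Linear(10, 2)                      # toy model
    torch.save(model.state_dict(), "model.ckpt")  # save: tensors only, no code

    ckpt = torch.load("model.ckpt")               # load the saved tensors
    model.load_state_dict(ckpt)                   # copy them into the model
    model.eval()                                  # switch to inference mode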
21 results in total