IT文库

Category: All · Cloud Computing & Big Data (25) · Machine Learning (25)

Language: All · English (13) · Simplified Chinese (12)

Format: All · PDF (25)

This search took 0.053 seconds and found about 25 matching results.
  • PDF 《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient Architectures

    …efficient models capable of running on mobile and edge devices. We have also set up a couple of programming projects for a hands-on model optimization experience using these efficient layers and architectures … equivalent). [16] Kaliamoorthi, P., Siddhant, A., Li, E., & Johnson, M. (2021). Distilling Large Language Models into Tiny and Effective Students using pQRNN. arXiv preprint arXiv:2101.08890. [15] Chung, H. W., Fevry, T., Tsai, H., Johnson, M., & Ruder, S. (2020). Rethinking embedding coupling in pre-trained language models. arXiv preprint arXiv:2010.12821. … A common solution for visual domains is to use a model …
    0 credits | 53 pages | 3.92 MB | 1 year ago
  • PDF 《Efficient Deep Learning Book》[EDL] Chapter 3 - Learning Techniques

    Model quality is an important benchmark to evaluate the performance of a deep learning model. A language translation application that uses a low-quality model would struggle with consumer adoption because … follows right after. Following the lead of the previous chapters, the theory is complemented with programming projects to help readers implement these techniques from scratch. Our journey of learning … sets up the modules, functions, and variables that will be used later on. It initializes the Natural Language Toolkit (NLTK) and creates a text sequence from a sentence. from random import choice, randint …
    0 credits | 56 pages | 18.93 MB | 1 year ago
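The excerpt above breaks off in the middle of the setup code. A minimal sketch of that kind of initialization, assuming NLTK's punkt tokenizer; the sample sentence and the 3-token window are illustrative stand-ins, not the book's actual project code:

```python
# Sketch of the setup the snippet describes: initialize NLTK and turn a
# sentence into a token sequence. Sentence and window size are assumptions.
from random import choice, randint

import nltk

nltk.download("punkt", quiet=True)  # fetch the punkt tokenizer models

sentence = "Model quality is an important benchmark for a deep learning model."
tokens = nltk.word_tokenize(sentence)
print(tokens)

# Draw a random contiguous window of tokens, plus one random token.
start = randint(0, max(0, len(tokens) - 3))
print(tokens[start:start + 3], choice(tokens))
```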
  • PDF 动手学深度学习 v2.0

    In short, instead of writing a wake-word recognizer by hand, we wrote a program that "learns". Given a huge labeled dataset, it can likely "learn" to recognize wake words. This approach of determining a program's behavior from a dataset can be viewed as programming with data. For example, we can design a "cat detector" by feeding a machine learning system many pictures of cats and dogs: the detector eventually learns to output a very large positive number if the input is a picture of a cat, and if the input is a picture of a dog … words or characters. Suppose the tokens of a text sequence of length T are x1, x2, …, xT. Then xt (1 ≤ t ≤ T) can be regarded as the observation or label of the text sequence at time step t. Given such a text sequence, the goal of a language model is to estimate the joint probability of the sequence, P(x1, x2, …, xT) (Eq. 8.3.1). For example, we need only draw one token at a time, xt ∼ P(xt | xt−1, …, … let us look at how to build a language model with a recurrent neural network. Let the minibatch size be 1 and the text sequence in the batch be "machine". To simplify training in later sections, we use a character-level language model, tokenizing text into characters rather than words. Figure 8.4.2 shows how a recurrent neural network doing character-level language modeling predicts the next character from the current and preceding characters. Figure 8.4.2: A character-level language model based on a recurrent neural network …
    0 credits | 797 pages | 29.45 MB | 1 year ago
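A side note on equation (8.3.1) quoted above: the token-at-a-time sampling xt ∼ P(xt | xt−1, …) works because the joint probability factorizes by the chain rule, a standard identity:

```latex
% Chain-rule factorization behind Eq. (8.3.1): the language model's joint
% probability decomposes into per-step conditionals, one per token.
P(x_1, x_2, \ldots, x_T) = \prod_{t=1}^{T} P(x_t \mid x_1, \ldots, x_{t-1})
```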
  • PDF 《Efficient Deep Learning Book》[EDL] Chapter 2 - Compression Techniques

    … is boiler-plate code. There is not much we can do to make it interesting. We are programming in the Python language. Naturally, it is possible to use other languages (like Java for Android or C++ for …
    0 credits | 33 pages | 1.96 MB | 1 year ago
  • PDF PyTorch Release Notes

    … paper. This model script is available on GitHub. ‣ TransformerXL model: this transformer-based language model has a segment-level recurrence and a novel relative positional encoding. The enhancements … Bidirectional Encoder Representations from Transformers (BERT) is a new method of pretraining language representations which obtains state-of-the-art results on a wide array of Natural Language Processing (NLP) tasks. This model is based on the BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding paper. The NVIDIA BERT implementation is an optimized version of the Hugging Face implementation that leverages …
    0 credits | 365 pages | 2.94 MB | 1 year ago
  • PDF Lecture 6: Support Vector Machine

    … equivalent to minimizing ∥ω∥² = ωᵀω: min over ω, b of ωᵀω s.t. y(i)(ωᵀx(i) + b) ≥ 1, ∀i. This is a quadratic programming (QP) problem! Interior-point method (https://en.wikipedia.org/wiki/Interior-point_method), active-set … … problem, so strong duality (p∗ = d∗) holds and the KKT conditions are respected … a quadratic programming problem in α. Several off-the-shelf solvers exist to solve such QPs. Some examples: quadprog (MATLAB) …
    0 credits | 82 pages | 773.97 KB | 1 year ago
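Since the excerpt points at off-the-shelf QP solvers, here is a hedged sketch of the standard route in Python: solving the hard-margin SVM dual with cvxopt (the toy data, tolerance, and solver choice are assumptions; the lecture itself names quadprog/MATLAB):

```python
# Solve the hard-margin SVM dual as a QP with cvxopt:
#   max  sum(a) - 1/2 * sum_ij a_i a_j y_i y_j <x_i, x_j>
#   s.t. a_i >= 0,  sum_i a_i y_i = 0
# The separable toy dataset below is an illustrative assumption.
import numpy as np
from cvxopt import matrix, solvers

X = np.array([[2.0, 2.0], [2.5, 3.0], [0.0, 0.5], [-0.5, 0.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
n = len(y)

K = X @ X.T                           # linear-kernel Gram matrix
P = matrix(np.outer(y, y) * K)        # quadratic term of the dual
q = matrix(-np.ones(n))               # maximizing sum(a) == minimizing -sum(a)
G = matrix(-np.eye(n))                # -a_i <= 0, i.e. a_i >= 0
h = matrix(np.zeros(n))
A = matrix(y.reshape(1, -1))          # equality constraint sum_i a_i y_i = 0
b = matrix(0.0)

solvers.options["show_progress"] = False
alpha = np.ravel(solvers.qp(P, q, G, h, A, b)["x"])

w = (alpha * y) @ X                   # recover the primal weight vector
sv = alpha > 1e-6                     # support vectors have nonzero alpha
b_hat = np.mean(y[sv] - X[sv] @ w)    # recover the bias from support vectors
print("w =", w, "b =", b_hat)
```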
  • PDF Lecture Notes on Support Vector Machine

    … for each training sample (x(i), y(i)), ωᵀx(i) + b ≥ 1 if y(i) = 1, and ωᵀx(i) + b ≤ −1 if y(i) = −1. This is a quadratic programming (QP) problem, and can be solved by existing generic QP solvers, e.g., the interior-point method, active-set …
    0 credits | 18 pages | 509.37 KB | 1 year ago
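Restating the QP this excerpt describes in LaTeX, with the two per-class constraints merged into one through the label y(i) ∈ {−1, +1}:

```latex
% Hard-margin SVM primal from the excerpt; since y^{(i)} \in \{-1, +1\},
% both per-class constraints collapse into a single inequality.
\begin{aligned}
\min_{\omega,\, b} \quad & \omega^{\top}\omega \\
\text{s.t.} \quad & y^{(i)}\bigl(\omega^{\top} x^{(i)} + b\bigr) \ge 1, \quad \forall i
\end{aligned}
```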
  • PDF AI大模型千问 qwen 中文文档

    Qwen is the large language model and large multimodal model series of the Qwen Team, Alibaba Group. Now the large language models have been upgraded to Qwen1.5. Both language models and multimodal … data and post-trained on quality data for aligning to human preferences. Qwen is capable of natural language understanding, text generation, vision understanding, audio understanding, tool use, role play, … apply_chat_template() to format your inputs as shown below: prompt = "Give me a short introduction to large language model." messages = [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user" …
    0 credits | 56 pages | 835.78 KB | 1 year ago
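The code in this excerpt is cut off mid-list. A completed sketch of the same call, assuming the Hugging Face transformers tokenizer and a Qwen1.5 chat checkpoint (the exact model name is an assumption drawn from the series the document covers):

```python
# Complete the excerpt's apply_chat_template() example: format a system +
# user conversation into the model's chat prompt string.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-7B-Chat")

prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt},
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,              # return the formatted string, not token ids
    add_generation_prompt=True,  # append the assistant-turn marker
)
print(text)
```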
  • PDF 《Efficient Deep Learning Book》[EDL] Chapter 6 - Advanced Learning Techniques - Technical Review

    … chapter by presenting self-supervised learning, which has been instrumental in the success of natural language models like BERT. Self-supervised learning helps models quickly achieve impressive quality with … We will describe the general principles of self-supervised learning, which apply to both language and vision. We will also demonstrate its efficacy through a colab. Finally, we introduce miscellaneous … this works shortly. For now, let's assume that we have such a general model that works for natural language inputs. Then by definition the model should be able to encode the given text in a sequence of embeddings …
    0 credits | 31 pages | 4.03 MB | 1 year ago
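To make the last sentence of this excerpt concrete: a pre-trained encoder maps text to one embedding per token. A minimal sketch, assuming Hugging Face transformers and the bert-base-uncased checkpoint (the checkpoint choice is an assumption, though BERT is the model the excerpt names):

```python
# Encode a sentence into a sequence of embeddings with a pre-trained BERT.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Self-supervised learning is data-efficient.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

embeddings = outputs.last_hidden_state  # one 768-dim vector per token
print(embeddings.shape)                 # torch.Size([1, seq_len, 768])
```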
  • PDF 《Efficient Deep Learning Book》[EDL] Chapter 1 - Introduction

    … Learning models have significantly beaten previous baselines in many tasks in computer vision, natural language understanding, speech, and so on. Their rise can be attributed to a combination of things: faster … effect in the world of Natural Language Processing (NLP) (see Figure 1-2), where the Transformer architecture significantly beat previous benchmarks such as the General Language Understanding Evaluation (GLUE) … Brown, Tom B., et al. "Language models are few-shot learners." arXiv preprint arXiv:2005.14165 (2020). [4] Devlin, Jacob, et al. "BERT: Pre-training of deep bidirectional transformers for language understanding." …
    0 credits | 21 pages | 3.17 MB | 1 year ago
25 results in total · pages 1 / 2 / 3