PyTorch Release Notes (RN-08516-001_v23.07): … /workspace/README.md for information about customizing your PyTorch image. For more information about PyTorch, including tutorials, documentation, and examples, see: ‣ PyTorch website ‣ PyTorch project. This document provides information about the key features, software enhancements and improvements, known issues, and how to run this container. … Chapter 2. Pulling A Container: … access and can log in to the NGC container registry. Refer to the NGC Getting Started Guide for more information. The deep learning frameworks, the NGC Docker containers, and the deep learning framework containers …
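The "Pulling A Container" step this excerpt references boils down to authenticating with NGC and pulling the image tag that matches the release (23.07 here). As a hedged sketch, assuming the Docker SDK for Python (`pip install docker`) and a prior `docker login nvcr.io` with your NGC API key:

```python
import docker

client = docker.from_env()
# The image tag mirrors the release notes version: 23.07 -> 23.07-py3.
image = client.images.pull("nvcr.io/nvidia/pytorch", tag="23.07-py3")
print(image.tags)  # e.g. ['nvcr.io/nvidia/pytorch:23.07-py3']
```

The equivalent CLI one-liner is `docker pull nvcr.io/nvidia/pytorch:23.07-py3`.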
《Efficient Deep Learning Book》[EDL] Chapter 3 - Learning Techniques: Such a shift has two side effects. First, part of the image “falls off” the top edge; that information is lost. Second, the lower part of the image doesn’t have any pixel data, because it … “Smaller, Faster, and Better.” arXiv preprint arXiv:2106.08962 (2021). It’s time for a hands-on project to apply our recent learnings and measure their impact. We will use the oxford_flowers102 dataset … performances with and without data augmentation to measure the benefits of the techniques we just learned. Project: Oxford Flowers Classification: The oxford_flowers102 dataset contains 1020 labeled examples, each …
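As a hedged sketch of the shift augmentation this excerpt describes (not the book's code), a Keras preprocessing layer can translate images and fill the vacated region:

```python
import tensorflow as tf

# Shift images by up to 10% in each direction. Pixels that "fall off"
# an edge are lost; the vacated strip is filled according to fill_mode
# (here, by reflecting the image at its border).
augment = tf.keras.layers.RandomTranslation(
    height_factor=0.1, width_factor=0.1, fill_mode="reflect")

images = tf.random.uniform((4, 224, 224, 3))  # dummy batch of images
augmented = augment(images, training=True)    # shifts apply only in training
```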
《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient Architectures: … represent aspects of an input numerically. It must fulfill the following goals: a) to compress the information content of high-dimensional concepts such as text, image, audio, video, etc. into a low-dimensional … animals occupy the top-left area of the plot. Note how we have compressed the high-dimensional information about animals into just two dimensions, and established a relationship between them purely using … after having read this in many textbooks throughout our lives. Hopefully, we have given enough information to make this actually straightforward. … as the number of features that our model learns for each …
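A minimal sketch of such an embedding table in Keras, with a two-dimensional embedding chosen purely so the vectors can be plotted (the vocabulary size and dimensions here are illustrative assumptions):

```python
import tensorflow as tf

vocab_size, embedding_dim = 10_000, 2  # 2-d vectors are easy to plot
embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)

token_ids = tf.constant([[1, 42, 7]])  # three items from the vocabulary
vectors = embedding(token_ids)         # shape (1, 3, 2): one 2-d vector per token
```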
《Efficient Deep Learning Book》[EDL] Chapter 5 - Advanced Compression Techniques: … your life materially. This is also how your brain naturally works: you can prune all this extra information and still drive well, eat, sleep, etc. To memorize something for the long term, you need to improve … pruning technique because of its simplicity and effectiveness. Later in this chapter, we have a project that relies on it for sparsifying a deep learning model. The authors of the Optimal Brain Damage … (2019). [1] LeCun, Yann, John Denker, and Sara Solla. “Optimal brain damage.” Advances in Neural Information Processing Systems 2 (1989). As you can deduce, the parameter changes the influence of the previous …
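The pruning technique the excerpt praises for simplicity is, in context, magnitude pruning; a hedged sketch with the TensorFlow Model Optimization toolkit (the toy model and the 50% sparsity target are assumptions, not the chapter's project code):

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10),
])

# Zero out the 50% of weights with the smallest magnitudes; the pruned
# model is then fine-tuned so the surviving weights compensate.
pruned = tfmot.sparsity.keras.prune_low_magnitude(
    model,
    pruning_schedule=tfmot.sparsity.keras.ConstantSparsity(
        target_sparsity=0.5, begin_step=0),
)
```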
《Efficient Deep Learning Book》[EDL] Chapter 2 - Compression Techniques: … quantization section delves into the implementation details using code samples. We finish with a hands-on project that walks you through the process of applying quantization in practical situations using popular … in the history of computing, scientists have worked tirelessly towards storing and transmitting information in as few bits as possible. Depending on the use case, we might be interested in compressing … are losing some information as a trade-off. It is especially applicable to multimedia (audio, video, image) data, where it is likely that either the humans who will consume the information will not notice …
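A minimal sketch of the kind of post-training quantization such a project typically applies, using the TFLite converter (the toy model is an assumption, not the chapter's code):

```python
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(10, input_shape=(784,))])

# Convert to TFLite with default optimizations, which stores the 32-bit
# float weights in 8 bits: a lossy step that trades a little accuracy
# for a roughly 4x smaller model.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_bytes = converter.convert()
print(f"{len(tflite_bytes)} bytes after conversion")
```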
《Efficient Deep Learning Book》[EDL] Chapter 6 - Advanced Learning Techniques - Technical Review: … such models, the high costs of pre-training get spread over the number of applications using them. Project: Using Pre-trained Language Models for News Classification: That was a lot of talk without any code … we need, and the number of training epochs we need to achieve our desired model quality. In this project we will demonstrate that self-supervised models provide both of those efficiency gains. We will work … quality and faster convergence than a BERT model trained from scratch. The code for this project is available here as a Jupyter notebook. We will not be explicitly demonstrating pre-training BERT …
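A hedged equivalent of that fine-tuning setup using Hugging Face Transformers (the project's own notebook may differ; the checkpoint name and the four-label head are assumptions):

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Reuse a pre-trained BERT encoder and attach a fresh classification
# head, so each downstream task pays only for fine-tuning.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=4)  # e.g. four news categories

inputs = tokenizer("Stocks rallied after the earnings report.",
                   return_tensors="pt")
logits = model(**inputs).logits  # shape (1, 4): one score per category
```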
keras tutorial: … [MSC v.1900 64 bit (AMD64)] on win32. Type "help", "copyright", "credits" or "license" for more information. >>> As of now, the latest version is ‘3.7.2’. If Python is not installed, then visit the official … environment while developing Python applications. Linux/macOS: go to your project root directory and type the command below to create a virtual environment: python3 -m venv kerasenv … Quit virtual environment: after finishing all your changes in your project, simply run the command below to quit the environment: deactivate … Anaconda Cloud: We believe …
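Once kerasenv is created and activated, a quick sanity check (assuming Keras has been pip-installed into it) confirms the interpreter is picking up packages from the virtual environment rather than the system install:

```python
import sys
import keras

# sys.prefix should point inside the kerasenv directory while the
# virtual environment is active.
print(sys.prefix)
print(keras.__version__)
```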
《Efficient Deep Learning Book》[EDL] Chapter 7 - Automation: … objective function at different points in the search space, and then spawning trials based on the information gathered so far. The objective function is estimated through a surrogate function that is initialized … oxford_flowers102 dataset. In the next section, we will retrain the same model, but with a twist! Project: Oxford Flower Classification With Hyperparameter Tuning: Recall that in Chapter 3, we trained a … dropout rate was 0.2. The model reached a top accuracy of 70% after training for 100 epochs. In this project, we will let HyperBand choose the best values for these hyperparameters and see if we can do …
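A hedged sketch of handing those hyperparameters to HyperBand via KerasTuner (the toy model and search ranges are illustrative, not the project's code):

```python
import keras_tuner as kt
import tensorflow as tf

def build_model(hp):
    # Let the tuner pick the dropout rate and the learning rate.
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dropout(hp.Float("dropout", 0.0, 0.5, step=0.1)),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(
            hp.Choice("learning_rate", [1e-2, 1e-3, 1e-4])),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"])
    return model

tuner = kt.Hyperband(build_model, objective="val_accuracy",
                     max_epochs=30, directory="tuning",
                     project_name="flowers")
# tuner.search(x_train, y_train, validation_split=0.2) would run the search.
```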
AI大模型千问 qwen 中文文档 (Qwen Chinese documentation): … creating and quantizing GGUF files in quantization/llama.cpp. You can refer to that document for more information. 1.4.4 PPL Evaluation: llama.cpp provides a way to evaluate the PPL (perplexity) performance of GGUF models. To do this, you need to prepare a dataset, such as the "wiki test" set. Here we show an example of running the test. … kubernetes … pip install "skypilot-nightly[aws,gcp]" Then confirm that the clouds are accessible with the following command: sky check. For more information, check the official document and see if you have set up your cloud accounts correctly. Alternatively … with DeepSpeed. # Check this issue https://github.com/huggingface/peft/issues/746 for more information. if ( list(pathlib.Path(training_args.output_dir).glob("checkpoint-*")) and not training_args …
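As a hedged illustration of consuming such a quantized GGUF file from Python, the llama-cpp-python bindings can load it directly (the file name below is an assumption; the Qwen docs drive llama.cpp's own CLI):

```python
from llama_cpp import Llama

# Load a quantized GGUF model produced by llama.cpp's quantize step.
llm = Llama(model_path="qwen1_5-7b-chat-q4_k_m.gguf", n_ctx=2048)

out = llm("Briefly introduce yourself.", max_tokens=64)
print(out["choices"][0]["text"])
```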
动手学深度学习 v2.0 (Dive into Deep Learning): … loss), one of the most commonly used losses for classification problems. In this section we introduce the basics of information theory in order to understand the cross-entropy loss. For more details on information theory, see the section on it in this book's appendix[52]. 3.4.7 Basics of Information Theory: Information theory deals with encoding, decoding, transmitting, and processing information or data as concisely as possible. Entropy: The core idea of information theory is to quantify the information content in data. In information theory, this quantity is called the entropy of a distribution P, and it is obtained from the following equation: … learning/distributions.html [52] https://d2l.ai/chapter_appendix-mathematics-for-deep-learning/information-theory.html … Amount of Information: What does compression have to do with prediction? Imagine we have a data stream to compress. If the next piece of data is easy to predict, then the data is easy to … most of them never occur, so a model that simply counts the frequencies of previously "seen" word sequences is bound to perform poorly on this problem.[101] [101] https://en.wikipedia.org/wiki/Project_Gutenberg 8.3.2 Markov Models and n-grams: Before discussing solutions that involve deep learning, we need to understand more concepts and terminology. Recall from Section 8.1 our discussion of Markov …
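The entropy equation that the excerpt cuts off is the standard definition; in the book's notation it reads:

```latex
H[P] = \sum_{j} -P(j) \log P(j)
```

That is, entropy is the expected "surprisal" of a value drawn from P itself; for example, a fair coin gives H = -2 \cdot 0.5 \log 0.5 = \log 2.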













