PyTorch Release Notes: cuBLAS 12.1.3.1 ‣ NVIDIA cuDNN 8.9.3 ‣ NVIDIA NCCL 2.18.3 ‣ NVIDIA RAPIDS™ 23.06 ‣ Apex ‣ rdma-core 39.0 ‣ NVIDIA HPC-X 2.15 ‣ OpenMPI 4.1.4+ ‣ GDRCopy 2.3 ‣ TensorBoard 2.9.0 ‣ Nsight Compute … For more information about AMP, see the Training With Mixed Precision Guide. Tensor Core Examples: The tensor core examples provided in GitHub and NGC focus on achieving the best performance and convergence … paper. This model script is available on GitHub. ‣ Transformer-XL model: This transformer-based language model has a segment-level recurrence and a novel relative positional encoding. The enhancements … | 365 pages | 2.94 MB | 1 year ago
《Efficient Deep Learning Book》[EDL] Chapter 3 - Learning Techniques: Model quality is an important benchmark to evaluate the performance of a deep learning model. A language translation application that uses a low-quality model would struggle with consumer adoption because … create_model(): # Initialize the core model core_args = dict(input_shape=(IMG_SIZE, IMG_SIZE, 3), include_top=False) core = apps.resnet50.ResNet50(**core_args) core.trainable = False # Create the full model with input, preprocessing, core and softmax layers. model = tf.keras.Sequential([ layers.Input([IMG_SIZE, IMG_SIZE, 3], dtype=tf.uint8), layers.Lambda(lambda x: tf.cast(x, tf.float32)) … | 56 pages | 18.93 MB | 1 year ago
AI大模型千问 qwen 中文文档 (Qwen large language model, Chinese documentation): Qwen is the large language model and large multimodal model series of the Qwen Team, Alibaba Group. Now the large language models have been upgraded to Qwen1.5. Both language models and multimodal … data and post-trained on quality data for aligning to human preferences. Qwen is capable of natural language understanding, text generation, vision understanding, audio understanding, tool use, role play, … apply_chat_template() to format your inputs as shown below: prompt = "Give me a short introduction to large language model." messages = [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user" … | 56 pages | 835.78 KB | 1 year ago
《Efficient Deep Learning Book》[EDL] Chapter 1 - Introduction: … establish our motivation behind seeking efficiency in deep learning models. We will also introduce core areas of efficiency techniques (compression techniques, learning techniques, automation, efficient … Learning models have beaten previous baselines significantly in many tasks in computer vision, natural language understanding, speech, and so on. Their rise can be attributed to a combination of things: Faster … effect in the world of Natural Language Processing (NLP) (see Figure 1-2), where the Transformer architecture significantly beat previous benchmarks such as the General Language Understanding Evaluation (GLUE) … | 21 pages | 3.17 MB | 1 year ago
《Efficient Deep Learning Book》[EDL] Chapter 7 - Automation: … dropout_rate=DROPOUT_RATE): # Initialize the core model core_args = dict(input_shape=(IMG_SIZE, IMG_SIZE, 3), include_top=False) core = apps.resnet50.ResNet50(**core_args) core.trainable = False # Setup the top … Lambda(lambda x: tf.cast(x, tf.float32)), layers.Lambda(lambda x: apps.resnet.preprocess_input(x)), core, layers.Flatten(), layers.Dropout(dropout_rate), layers.Dense(NUM_CLASSES, activation='softmax') … optimal neural architectures for image classification and language modeling. Their generated models exhibited strong performance on the image and language benchmark datasets. Moreover, their NAS model could … | 33 pages | 2.48 MB | 1 year ago
亚马逊 AWS AI Services Overview (Amazon AWS AI Services Overview): … frames/sec with 640x480 resolution … deployable everywhere … Beyond BlindTool by Joseph Paul Cohen, demo on Nexus 4 … Fit the core library with all dependencies into a single C++ source file. Easy to compile on … Departure Date, Flight Booking, “Book a flight to London”, Automatic Speech Recognition, Natural Language Understanding, Book Flight, London, Utterances, Flight booking, London Heathrow, Intent / Slot … | 56 pages | 4.97 MB | 1 year ago
Keras Tutorial: … Core Modules … intelligence (AI), audio & video recognition and image recognition. Artificial neural networks are the core of deep learning methodologies. Deep learning is supported by various libraries such as Theano, TensorFlow … Architecture of Keras: the Keras API can be divided into three main categories: Model, Layer, and Core Modules. In Keras, every ANN is represented by Keras Models. In turn, every Keras Model is a composition … | 98 pages | 1.57 MB | 1 year ago
《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient Architectures: … the hashing trick. It helps to reduce the vocabulary with little or no performance trade-off. The core idea of the hashing trick is as follows: 1. Choose the desired vocabulary size N, and the number … equivalent). … Kaliamoorthi, P., Siddhant, A., Li, E., & Johnson, M. (2021). Distilling Large Language Models into Tiny and Effective Students using pQRNN. arXiv preprint arXiv:2101.08890. … Chung … Fevry, T., Tsai, H., Johnson, M., & Ruder, S. (2020). Rethinking embedding coupling in pre-trained language models. arXiv preprint arXiv:2010.12821. A common solution for visual domains is to use a model … | 53 pages | 3.92 MB | 1 year ago
PyTorch Brand Guidelines: … intrigue and curiosity to our system. The symbol allows us to speak through a more graphic language, without resorting to cliché fire or data metaphors. … PyTorch Symbol … Pantone 171 C … Secondary Colors: When designing content for the overall PyTorch brand, leverage these core palettes. These colors work successfully for print and digital communications. When using … | 12 pages | 34.16 MB | 1 year ago
动手学深度学习 v2.0 (Dive into Deep Learning v2.0): … MF (Intel 80186); 1990: 10 K examples (optical character recognition), 10 MB, 10 MF (Intel 80486); 2000: 10 M examples (web pages), 100 MB, 1 GF (Intel Core); 2010: 10 G examples (advertising), 1 GB, 1 TF (Nvidia C2050); 2020: 1 T examples (social networks), 100 GB, 1 PF (Nvidia DGX-2). Clearly, random-access memory … words or characters. Suppose the tokens of a text sequence of length T are x1, x2, …, xT; then xt (1 ≤ t ≤ T) can be regarded as the observation or label of the text sequence at time step t. Given such a text sequence, the goal of a language model is to estimate the joint probability P(x1, x2, …, xT) (Eq. 8.3.1). For example, one need only draw one token at a time, xt ∼ P(xt | xt−1, …, … let us see how to use a recurrent neural network to build a language model. Let the minibatch size be 1 and the text sequence in the batch be "machine". To simplify training in later sections, we consider a character-level language model, tokenizing the text into characters rather than words. Figure 8.4.2 illustrates how a recurrent neural network for character-level language modeling uses the current and previous characters to predict the next character. (Figure 8.4.2: a character-level language model based on a recurrent neural network) … | 797 pages | 29.45 MB | 1 year ago