QCon Beijing 2018 - "Future Cities: Smart Cities and Deep-Learning-Based Machine Vision"
Speaker: 陈宇恒 (Chen Yuheng)

Agenda
• Who we are
• Machine vision applications in smart cities
• How we built a city-scale AI + smart-city system
• Practical lessons from large-scale deep learning systems

About the speaker
• Co-founder and architect, 商汤科技 (SenseTime)
• C++/Go/Rust/Ruby developer
• Contributor to multiple open source projects
• Author of a NIPS conference paper
• @chyh1990

[Company-timeline and scheduling slides: only the milestone dates 2017.6, 2016.3, 2015.11, 2014.6, 2013.3, mid-2011 and the word "scheduling" survived extraction.]

Practical experience with Go in high-performance systems
• Why Go
  - Easier than C++ for implementing common concurrency patterns (a minimal sketch follows this list)
  - More concise than Java, and easier to interoperate with C/C++
  - Type- and memory-safe, unlike scripting languages, which preserves refactoring speed and product quality
  - Mature tooling out of the box: go test, gofmt, golint, the race detector
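The slides contain no code, so the following is a minimal, hypothetical sketch (not from the talk) of the kind of concurrency pattern the first bullet refers to: a channel-based worker pool, a few lines in Go versus manual thread and queue management in C++.

// Minimal worker-pool sketch: goroutines plus channels replace the
// explicit threads, mutexes, and queues a C++ version would need.
package main

import (
	"fmt"
	"sync"
)

func main() {
	jobs := make(chan int)
	results := make(chan int)
	var wg sync.WaitGroup

	// A fixed pool of four workers; goroutines are cheap to spawn.
	for w := 0; w < 4; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs {
				results <- j * j // stand-in for real work, e.g. per-frame inference
			}
		}()
	}

	// Close results once every worker has drained the jobs channel.
	go func() {
		wg.Wait()
		close(results)
	}()

	// Produce jobs, then close the channel to signal "no more work".
	go func() {
		for i := 1; i <= 8; i++ {
			jobs <- i
		}
		close(jobs)
	}()

	for r := range results {
		fmt.Println(r)
	}
}

Running such code under go test -race (the race detector mentioned in the tooling bullet) verifies it is free of data races.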
Practical experience with Go in high-performance systems (continued)
• Go also has shortcomings for building high-performance applications
  - Parts of the standard library depend on reflect and perform poorly (a sketch of the cost follows this list)
  - GC overhead: building a cache of a million-plus objects on the Go heap requires careful optimization (a sketch of one mitigation also follows)
    Up to 100x slower than an equivalent C implementation!
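The slide does not name the specific package; encoding/json is a common example of a reflect-dependent standard-library API, so this hypothetical sketch (not from the talk) contrasts it with a hand-written encoder that avoids reflection.

// encoding/json inspects struct fields via reflection on every call;
// a hand-written encoder formats the fields directly.
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

type Detection struct {
	ID    int64   `json:"id"`
	Score float32 `json:"score"`
}

// marshalManual avoids reflection by appending each field by hand.
func marshalManual(d Detection) []byte {
	buf := make([]byte, 0, 48)
	buf = append(buf, `{"id":`...)
	buf = strconv.AppendInt(buf, d.ID, 10)
	buf = append(buf, `,"score":`...)
	buf = strconv.AppendFloat(buf, float64(d.Score), 'g', -1, 32)
	buf = append(buf, '}')
	return buf
}

func main() {
	d := Detection{ID: 42, Score: 0.97}
	viaReflect, _ := json.Marshal(d) // reflect-based path
	fmt.Println(string(viaReflect), string(marshalManual(d)))
}

A go test benchmark over the two paths would make the gap concrete; the manual path skips reflect's per-field type inspection on every call.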
 《Efficient Deep Learning Book》[EDL] Chapter 3 - Learning Techniquesmore that you read, the more things you will know. The more that you learn, the more places you'll go.” ― Dr. Seuss Model quality is an important benchmark to evaluate the performance of a deep learning transformed image show_image(transformed_image.astype(int)) image_path = 'file:///whalefin.png' Now, let’s go through the various image transformations with code examples. Rotation rotates the image pixels around train_ds, val_ds = make_dataset('oxford_flowers102') The dataset contains variable sized samples. Go ahead and resize them to 264x264 size. This is a required step because our model expects fixed-sized0 码力 | 56 页 | 18.93 MB | 1 年前3 《Efficient Deep Learning Book》[EDL] Chapter 3 - Learning Techniquesmore that you read, the more things you will know. The more that you learn, the more places you'll go.” ― Dr. Seuss Model quality is an important benchmark to evaluate the performance of a deep learning transformed image show_image(transformed_image.astype(int)) image_path = 'file:///whalefin.png' Now, let’s go through the various image transformations with code examples. Rotation rotates the image pixels around train_ds, val_ds = make_dataset('oxford_flowers102') The dataset contains variable sized samples. Go ahead and resize them to 264x264 size. This is a required step because our model expects fixed-sized0 码力 | 56 页 | 18.93 MB | 1 年前3
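The slides do not show the speaker's actual cache optimization; one common technique for million-entry caches, sketched hypothetically below, is to keep map keys and values pointer-free so the garbage collector does not have to trace the map's contents.

// A pointer-free map[int64]Feature lets the GC skip scanning its
// entries; map[int64]*Feature would hand the mark phase a million
// pointers to follow on every cycle.
package main

import (
	"fmt"
	"runtime"
)

// Feature is a fixed-size, pointer-free value (e.g. a face-feature vector).
type Feature [16]float32

func main() {
	const n = 1 << 20 // the "million-plus objects" scale from the slide
	cache := make(map[int64]Feature, n)
	for i := int64(0); i < n; i++ {
		cache[i] = Feature{}
	}

	var stats runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&stats)
	fmt.Printf("entries=%d, total GC pause=%d ns\n", len(cache), stats.PauseTotalNs)
}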
Recap
• Within smart cities, machine vision is seeing explosive adoption in the intelligent-security domain
• Using deep-learning-based machine vision, we built a very-large-scale, self-evolving distributed intelligent system
• In building a system at this scale, we made heavy use of popular technologies such as Kubernetes and Go, collecting our share of "pitfalls stepped in over the years"
 《Efficient Deep Learning Book》[EDL] Chapter 1 - Introductionand relevant techniques as well as the foundation of infrastructure, hardware and tools.) Let us go over each of these areas individually. Compression Techniques These are general techniques and algorithms that were more favorable. Source. As an extension to HPO, Neural Architecture Search (NAS) can help go beyond just learning hyper-parameters, and instead search for efficient architectures (layers, blocks where DL models beat the best humans as well as other computer bots in games like chess, shogi, and go. For the purpose of deployment in IoT and edge devices, both Google and NVidia have come up with accelerators0 码力 | 21 页 | 3.17 MB | 1 年前3 《Efficient Deep Learning Book》[EDL] Chapter 1 - Introductionand relevant techniques as well as the foundation of infrastructure, hardware and tools.) Let us go over each of these areas individually. Compression Techniques These are general techniques and algorithms that were more favorable. Source. As an extension to HPO, Neural Architecture Search (NAS) can help go beyond just learning hyper-parameters, and instead search for efficient architectures (layers, blocks where DL models beat the best humans as well as other computer bots in games like chess, shogi, and go. For the purpose of deployment in IoT and edge devices, both Google and NVidia have come up with accelerators0 码力 | 21 页 | 3.17 MB | 1 年前3
 Keras: 基于 Python 的深度学习库Recurrent 5.6.1 RNN [source] keras.layers.RNN(cell, return_sequences=False, return_state=False, go_backwards=False, stateful=False, unroll=False) 循环神经网络层基类。 参数 • cell: 一个 RNN 单元实例。RNN 单元是一个具有以下项目的类: 效的堆叠 RNN。 • return_sequences: 布尔值。是返回输出序列中的最后一个输出,还是全部序列。 • return_state: 布尔值。除了输出之外是否返回最后一个状态。 • go_backwards: 布尔值 (默认 False)。如果为 True,则向后处理输入序列并返回相反的序列。 • stateful: 布尔值 (默认 False)。如果为 True,则批次中索引 i bias_constraint=None, dropout=0.0, recurrent_dropout=0.0, return_sequences=False, return_state=False, go_backwards=False, stateful=False, unroll=False) 完全连接的 RNN,其输出将被反馈到输入。 参数 • units: 正整数,输出空间的维度。 •0 码力 | 257 页 | 1.19 MB | 1 年前3 Keras: 基于 Python 的深度学习库Recurrent 5.6.1 RNN [source] keras.layers.RNN(cell, return_sequences=False, return_state=False, go_backwards=False, stateful=False, unroll=False) 循环神经网络层基类。 参数 • cell: 一个 RNN 单元实例。RNN 单元是一个具有以下项目的类: 效的堆叠 RNN。 • return_sequences: 布尔值。是返回输出序列中的最后一个输出,还是全部序列。 • return_state: 布尔值。除了输出之外是否返回最后一个状态。 • go_backwards: 布尔值 (默认 False)。如果为 True,则向后处理输入序列并返回相反的序列。 • stateful: 布尔值 (默认 False)。如果为 True,则批次中索引 i bias_constraint=None, dropout=0.0, recurrent_dropout=0.0, return_sequences=False, return_state=False, go_backwards=False, stateful=False, unroll=False) 完全连接的 RNN,其输出将被反馈到输入。 参数 • units: 正整数,输出空间的维度。 •0 码力 | 257 页 | 1.19 MB | 1 年前3
 《Efficient Deep Learning Book》[EDL] Chapter 2 - Compression Techniquesstoring x. A b-bit unsigned integer will have 2b possible distinct values, ranging from 0 to 2b - 1. To go from a 32-bit floating point value to a b-bit integer, and back again, we need a mapping from one side bin 1, [xmin + 2s, xmin + 3s) will map to bin 2, and so on. Thus, to find which bin the given x will go to, we simply do the following: . We need the floor function ( ) so that the floating point value -5. -2.5 0. 2.5 5. 7.5 10. ] Now let’s quantize x. # Quantize the entire array in one go. x_q = quantize(x, -10.0, 10.0, 3) print(x_q) This returns the following result. [0 1 2 3 4 5 60 码力 | 33 页 | 1.96 MB | 1 年前3 《Efficient Deep Learning Book》[EDL] Chapter 2 - Compression Techniquesstoring x. A b-bit unsigned integer will have 2b possible distinct values, ranging from 0 to 2b - 1. To go from a 32-bit floating point value to a b-bit integer, and back again, we need a mapping from one side bin 1, [xmin + 2s, xmin + 3s) will map to bin 2, and so on. Thus, to find which bin the given x will go to, we simply do the following: . We need the floor function ( ) so that the floating point value -5. -2.5 0. 2.5 5. 7.5 10. ] Now let’s quantize x. # Quantize the entire array in one go. x_q = quantize(x, -10.0, 10.0, 3) print(x_q) This returns the following result. [0 1 2 3 4 5 60 码力 | 33 页 | 1.96 MB | 1 年前3
 《Efficient Deep Learning Book》[EDL] Chapter 5 - Advanced Compression Techniquesweights will have been removed. Now that we have presented a general algorithm for pruning, we should go over some examples of different ways we implement them. Concretely, a practitioner might want to experiment set. Our pruned model performed with an accuracy of 84.71%. It's a slight drop in performance. Let's go ahead and strip the pruning weights from the model that were added by the TFMOT library as shown below There is nothing novel going on there, so we are skipping listing the code here, but feel free to go through it in the Jupyter notebook. Regardless, this is the output we get from it. stats = comput0 码力 | 34 页 | 3.18 MB | 1 年前3 《Efficient Deep Learning Book》[EDL] Chapter 5 - Advanced Compression Techniquesweights will have been removed. Now that we have presented a general algorithm for pruning, we should go over some examples of different ways we implement them. Concretely, a practitioner might want to experiment set. Our pruned model performed with an accuracy of 84.71%. It's a slight drop in performance. Let's go ahead and strip the pruning weights from the model that were added by the TFMOT library as shown below There is nothing novel going on there, so we are skipping listing the code here, but feel free to go through it in the Jupyter notebook. Regardless, this is the output we get from it. stats = comput0 码力 | 34 页 | 3.18 MB | 1 年前3
 《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient ArchitecturesHowever, owing to their incremental nature, they offer limited gains. Sometimes, it can be rewarding to go back to the drawing board and experiment with another architecture that better suits the task. As an reduction. We will explain these techniques in further detail in chapter 6. A Petting Zoo for Kids Let’s go back to our example of cute and dangerous animals, and represent each animal using two features, say there a way to automate the embedding table generation? Turns out there is! In the next section, let's go over a real world example of embedding table generation by leveraging deep learning to do the grunge0 码力 | 53 页 | 3.92 MB | 1 年前3 《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient ArchitecturesHowever, owing to their incremental nature, they offer limited gains. Sometimes, it can be rewarding to go back to the drawing board and experiment with another architecture that better suits the task. As an reduction. We will explain these techniques in further detail in chapter 6. A Petting Zoo for Kids Let’s go back to our example of cute and dangerous animals, and represent each animal using two features, say there a way to automate the embedding table generation? Turns out there is! In the next section, let's go over a real world example of embedding table generation by leveraging deep learning to do the grunge0 码力 | 53 页 | 3.92 MB | 1 年前3
 《Efficient Deep Learning Book》[EDL] Chapter 7 - Automationactivation='softmax') ]) Our model, input data and the hyperparameter trial set is ready. Let's go ahead and train the model, each time choosing one item from the trial set. Each model is trained for halving, this is set to 2. For HyperBand, the recommended factor is 3. We will use the same. Now, let's go on and load the required modules and the dataset. import tensorflow as tf import tensorflow_datasets reader. The total number of epochs spent in the search are 144. That is about 1.5 training runs. If we go back and look at figure 7-X for BOS, it took 16 runs to converge to the optimum hyperparameters. However0 码力 | 33 页 | 2.48 MB | 1 年前3 《Efficient Deep Learning Book》[EDL] Chapter 7 - Automationactivation='softmax') ]) Our model, input data and the hyperparameter trial set is ready. Let's go ahead and train the model, each time choosing one item from the trial set. Each model is trained for halving, this is set to 2. For HyperBand, the recommended factor is 3. We will use the same. Now, let's go on and load the required modules and the dataset. import tensorflow as tf import tensorflow_datasets reader. The total number of epochs spent in the search are 144. That is about 1.5 training runs. If we go back and look at figure 7-X for BOS, it took 16 runs to converge to the optimum hyperparameters. However0 码力 | 33 页 | 2.48 MB | 1 年前3
共 20 条
- 1
- 2













