IT文库
Category: All | Cloud Computing & Big Data (36) | Machine Learning (36)

Language: All | English (23) | Chinese (Simplified) (13)

Format: All | PDF (36)

The search took 0.040 seconds and found about 36 matching results.
  • PDF document: 《Efficient Deep Learning Book》[EDL] Chapter 7 - Automation

    … these choices are boolean, others have discrete parameters, and still there are the ones with continuous parameters. Some choices even have multiple parameters. For example, horizontal flip is a boolean choice … augment requires multiple parameters. Figure 7-1: The plethora of choices that we face when training a deep learning model in the computer vision domain. A Search Space for n parameters is an n-dimensional region such that a point in such a region is a set of well-defined values for each of those parameters. The parameters can take discrete or continuous values. It is called a "search" space because we are searching …
    0 码力 | 33 pages | 2.48 MB | 1 year ago
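    The snippet defines a search space as a region whose points assign a well-defined value to every parameter. As a hedged illustration (the parameter names below are invented, not taken from the chapter), one point can be drawn from such a space like this:

      import math
      import random

      # Hypothetical search space: each entry maps a hyperparameter to its domain.
      # Boolean and discrete choices are lists; a continuous range is a (low, high) tuple.
      search_space = {
          "horizontal_flip": [True, False],      # boolean choice
          "num_layers": [2, 3, 4, 5],            # discrete parameter
          "learning_rate": (1e-4, 1e-1),         # continuous parameter (sampled log-uniformly)
      }

      def sample_point(space):
          # A "point" in the search space assigns one well-defined value to every parameter.
          point = {}
          for name, domain in space.items():
              if isinstance(domain, tuple):
                  low, high = domain
                  point[name] = 10 ** random.uniform(math.log10(low), math.log10(high))
              else:
                  point[name] = random.choice(domain)
          return point

      print(sample_point(search_space))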
  • PDF document: 《Efficient Deep Learning Book》[EDL] Chapter 1 - Introduction

    … model scaled well with the number of labeled examples, since the network had a large number of parameters. Thus to extract the most out of the setup, the model needed a large number of labeled examples … trailblazing work, there has been a race to create deeper networks with an ever larger number of parameters and increased complexity. In Computer Vision, several model architectures such as VGGNet, Inception … intelligence and statistics. JMLR Workshop and Conference Proceedings, 2011. Figure 1-2: Growth of parameters in Computer Vision and NLP models over time. (Data Source) We have seen a similar effect in the …
    0 码力 | 21 pages | 3.17 MB | 1 year ago
  • PDF document: Lecture 1: Overview

    … to estimate parameters of it. Use these parameters to make predictions for the test data. Such approaches save computation when we make predictions for test data. That is, estimate parameters once, use them … remember all the training data. Linear regression, after getting parameters, can forget the training data, and just use the parameters. They are also opposite w.r.t. statistical properties. NN makes …ting into trouble. Optimization and Integration: usually involve finding the best values for some parameters (an optimization problem), or average over many plausible values (an integration problem). How …
    0 码力 | 57 pages | 2.41 MB | 1 year ago
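    To make the parametric-model point concrete, here is a small numpy sketch (not from the lecture) in which linear regression estimates its parameters once, discards the training data, and predicts from the parameters alone:

      import numpy as np

      rng = np.random.default_rng(0)
      X_train = rng.normal(size=(100, 3))
      y_train = X_train @ np.array([2.0, -1.0, 0.5]) + 3.0 + rng.normal(scale=0.1, size=100)

      # Estimate the parameters once (least squares with an appended bias column) ...
      A = np.hstack([X_train, np.ones((len(X_train), 1))])
      theta, *_ = np.linalg.lstsq(A, y_train, rcond=None)

      # ... then forget the training data; predictions need only theta.
      del X_train, y_train
      predict = lambda x_new: np.append(x_new, 1.0) @ theta
      print(predict(np.array([1.0, 0.0, -1.0])))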
  • PDF document: 《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient Architectures

    … often straightforward to scale the model quality up or down by increasing or decreasing these two parameters respectively. The exact sweet spot of embedding table size and model quality needs to be determined … vocabulary size, embedding dimension size, the initializing tensor for the embeddings, and several other parameters. It crucially also supports fine-tuning the table to the task by setting the layer as trainable … on-disk: we can use a smaller vocabulary, and see if the resulting quality is within the acceptable parameters. For on-device models, TFLite offers post-training quantization as described in Chapter 2. We could …
    0 码力 | 53 pages | 3.92 MB | 1 year ago
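    The chapter's own examples are TensorFlow/TFLite based; as a rough, library-swapped illustration of the two knobs it names (vocabulary size and embedding dimension), the sketch below uses torch.nn.Embedding with invented sizes:

      import torch.nn as nn

      vocab_size, embedding_dim = 10_000, 64          # illustrative sizes, not the book's
      table = nn.Embedding(num_embeddings=vocab_size, embedding_dim=embedding_dim)

      # The footprint scales as vocab_size * embedding_dim trainable weights.
      print(sum(p.numel() for p in table.parameters()))   # 640000

      # Freeze the table (the analogue of trainable=False in Keras), or leave it
      # trainable to fine-tune the embeddings for the task.
      table.weight.requires_grad_(False)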
  • PDF document: Lecture Notes on Gaussian Discriminant Analysis, Naive

    … them share the same denominator P(X = x). Therefore, to perform Bayesian inference, the parameters we have to compute are only P(X = x | Y = y) and P(Y = y). Recalling that, in linear regression … vector x and label y, while we now rely on Bayes' theorem to characterize the relationship through parameters θ = {P(X = x | Y = y), P(Y = y)}_{x,y}. 2 Gaussian Discriminant Analysis: In Gaussian Discriminant … \ell(\psi, \mu_0, \mu_1, \Sigma) = \sum_{i=1}^{m} \log p_{X|Y}(x^{(i)} \mid y^{(i)}; \mu_0, \mu_1, \Sigma) + \sum_{i=1}^{m} \log p_Y(y^{(i)}; \psi) \quad (8), where ψ, µ0, and Σ are parameters. Substituting Eq. (5)∼(7) into Eq. (8) gives us a full expression of ℓ(ψ, µ0, µ1, Σ) …
    0 码力 | 19 pages | 238.80 KB | 1 year ago
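    The standard closed-form maximum-likelihood estimates for these parameters (what maximizing ℓ(ψ, µ0, µ1, Σ) yields for binary labels with a shared covariance) can be sketched as follows; the function name is illustrative:

      import numpy as np

      def fit_gda(X, y):
          # X: (m, n) feature matrix; y: (m,) labels in {0, 1}.
          psi = y.mean()                               # MLE of P(Y = 1)
          mu0 = X[y == 0].mean(axis=0)                 # class-conditional means
          mu1 = X[y == 1].mean(axis=0)
          centered = X - np.where((y == 1)[:, None], mu1, mu0)
          sigma = centered.T @ centered / len(y)       # shared covariance matrix
          return psi, mu0, mu1, sigma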
  • PDF document: 《Efficient Deep Learning Book》[EDL] Chapter 2 - Compression Techniques

    … model footprint by reducing the number of trainable parameters. However, this approach has two drawbacks. First, it is hard to determine the parameters or layers that can be removed without significantly … layers, and the number of parameters (assuming that the models are well-tuned). If we naively reduce the footprint, we can reduce the number of layers and the number of parameters, but this could hurt the quality … function with an input … and parameters … such that …. In the case of a fully-connected layer, … is a 2-D matrix. Further, assume that we can train another network with far fewer parameters (…) such that the outputs …
    0 码力 | 33 pages | 1.96 MB | 1 year ago
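    Purely as an illustration of the parameter-count arithmetic, and not necessarily the chapter's own technique, one way to obtain a network with far fewer parameters than a fully-connected layer is to factor its weight matrix through a low-rank bottleneck:

      import torch.nn as nn

      dense = nn.Linear(1024, 1024)      # W is a 1024x1024 matrix (+ bias): 1,049,600 parameters
      low_rank = nn.Sequential(          # factor W through a rank-64 bottleneck
          nn.Linear(1024, 64, bias=False),
          nn.Linear(64, 1024),
      )

      count = lambda m: sum(p.numel() for p in m.parameters())
      print(count(dense), count(low_rank))   # 1049600 vs 132096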
  • PDF document: PyTorch Tutorial

    … weights • Imagine updating 100k parameters! • An optimizer takes the parameters we want to update, the learning rate we want to use (and possibly many other hyper-parameters as well!) and performs the updates … Two components: • __init__(self): it defines the parts that make up the model — in our case, two parameters, a and b • forward(self, x): it performs the actual computation, that is, it outputs a prediction … model.state_dict() - returns a dictionary of trainable parameters with their current values • model.parameters() - returns a list of all trainable parameters in the model • model.train() or model.eval() … Putting …
    0 码力 | 38 pages | 4.09 MB | 1 year ago
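    A minimal sketch of what the slides describe: a model whose __init__ registers two parameters a and b, a forward() that outputs a prediction, an optimizer that receives the parameters and a learning rate, plus the bookkeeping calls listed above. The class name and values are invented for illustration:

      import torch
      import torch.nn as nn

      class LinearModel(nn.Module):
          def __init__(self):
              super().__init__()
              # The parts that make up the model: two trainable parameters, a and b.
              self.a = nn.Parameter(torch.zeros(1))
              self.b = nn.Parameter(torch.ones(1))

          def forward(self, x):
              # The actual computation: output a prediction for x.
              return self.a + self.b * x

      model = LinearModel()
      # The optimizer takes the parameters to update and the learning rate to use.
      optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

      print(model.state_dict())         # dictionary of trainable parameters and their values
      print(list(model.parameters()))   # list of all trainable parameters in the model
      model.train()                     # switch between training and evaluation modes
      model.eval()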
  • PDF document: Machine Learning Pytorch Tutorial

    [figure: a fully-connected layer computing y = Wx + b, with x of size 32, y of size 64, and W of shape 64×32] torch.nn – Network Parameters ● Linear Layer (Fully-connected Layer) >>> layer = torch.nn.Linear(32, 64) >>> layer.weight … algorithms that adjust network parameters to reduce error. (See Adaptive Learning Rate lecture video) ● E.g. Stochastic Gradient Descent (SGD): torch.optim.SGD(model.parameters(), lr, momentum = 0) … optimizer = torch.optim.SGD(model.parameters(), lr, momentum = 0) ● For every batch of data: 1. Call optimizer.zero_grad() to reset gradients of model parameters. 2. Call loss.backward() to backpropagate …
    0 码力 | 48 pages | 584.86 KB | 1 year ago
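    A minimal sketch of the per-batch steps enumerated in the snippet (zero_grad, backward, step), using the same layer and optimizer calls; the batch data is made up for illustration:

      import torch
      import torch.nn as nn

      model = nn.Linear(32, 64)            # fully-connected layer: W has shape 64x32, y = Wx + b
      criterion = nn.MSELoss()
      optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0)

      x = torch.randn(8, 32)               # dummy batch, not from the slides
      target = torch.randn(8, 64)

      optimizer.zero_grad()                # 1. reset gradients of model parameters
      loss = criterion(model(x), target)
      loss.backward()                      # 2. backpropagate to compute gradients
      optimizer.step()                     # 3. adjust parameters to reduce the error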
  • PDF document: 《Efficient Deep Learning Book》[EDL] Chapter 3 - Learning Techniques

    … to the model performance. They are also likely to boost the performance of smaller models (fewer parameters / layers, etc.). Concretely, we want to find the smallest model, which when trained with the learning … training process. The train() is simple. It takes the model, training set and validation set as parameters. It also has two hyperparameters: batch_size and epochs. We use a small batch size because our … hard labels, and … denotes the distillation loss function which uses the soft labels. … and … are hyper-parameters that weigh the two loss functions appropriately. When … and …, the student model is trained with …
    0 码力 | 56 pages | 18.93 MB | 1 year ago
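    The dropped symbols in this snippet are the two loss terms and their weights. A common (Hinton-style) formulation, which may differ in detail from the book's own, combines a hard-label cross-entropy with a temperature-softened soft-label term; alpha and T below are illustrative hyperparameters:

      import torch
      import torch.nn.functional as F

      def distillation_loss(student_logits, teacher_logits, hard_labels, alpha=0.5, T=2.0):
          # Hard-label term: ordinary cross-entropy against the ground-truth labels.
          hard = F.cross_entropy(student_logits, hard_labels)
          # Soft-label term: KL divergence to the teacher's temperature-softened distribution.
          soft = F.kl_div(
              F.log_softmax(student_logits / T, dim=-1),
              F.softmax(teacher_logits / T, dim=-1),
              reduction="batchmean",
          ) * (T * T)
          # alpha weighs the two terms; alpha = 1 recovers plain hard-label training.
          return alpha * hard + (1.0 - alpha) * soft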
  • PDF document: 【PyTorch深度学习-龙龙老师】-测试版202112

    … # Create the optimizer and pass it the list of parameters to optimize: [w1, b1, w2, b2, w3, b3] # set the learning rate lr=0.001 optimizer = optim.SGD(model.parameters(), lr=0.01) train_loss = [] for epoch in range(5): # train for 5 epochs for batch_idx, (x, … the class's parameters function returns the list of tensors to be optimized, as follows: In [5]: for p in fc.parameters(): print(p.shape) Out[5]: # list of parameters to be optimized torch.Size([512, 784]) torch.Size([512]) In fact, besides storing the list of trainable tensors (parameters), some layers also contain tensors that do not take part in gradient-based optimization … the named_buffers function returns all tensors that do not need to be optimized. Besides obtaining the anonymous list of trainable tensors via the parameters function, the member function named_parameters returns the names and objects of the trainable tensors. For example: In [6]: # return all named parameters for name, p in fc.named_parameters(): print(name, p.shape) Out[6]: …
    0 码力 | 439 pages | 29.91 MB | 1 year ago
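    A runnable version of the book's inspection snippets, extended with a BatchNorm layer (an assumption chosen because it actually owns buffers) so that named_buffers() has something to return:

      import torch.nn as nn

      fc = nn.Linear(784, 512)
      for p in fc.parameters():                 # anonymous list of trainable tensors
          print(p.shape)                        # torch.Size([512, 784]), torch.Size([512])

      for name, p in fc.named_parameters():     # names plus the trainable tensors
          print(name, p.shape)                  # weight ..., bias ...

      bn = nn.BatchNorm1d(512)                  # BatchNorm keeps non-trainable running statistics
      for name, buf in bn.named_buffers():      # running_mean, running_var, num_batches_tracked
          print(name, buf.shape)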