Effektivität - IT文库 (IT Library): free downloads of programming e-books and documents for programmers


Lecture 5: Gaussian Discriminant Analysis, Naive Bayes

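Warm Up (Contd.): Suppose we have $n$ features $X = [X_1, X_2, \cdots, X_n]^T$. The features are independent of each other: $P(X = x \mid Y = y) = P(X_1 = x_1, \cdots, X_n = x_n \mid Y = y)$ … $\frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)\right)$, with mean vector $\mu \in \mathbb{R}^n$ and covariance matrix $\Sigma \in \mathbb{R}^{n \times n}$. Mahalanobis distance: $r^2 = (x - \mu)^T \Sigma^{-1} (x - \mu)$. $\Sigma$ is symmetric and positive semidefinite: $\Sigma = \Phi \Lambda \Phi^T$, where $\Phi$ is an orthonormal … $X$ given $Y = 0$: $p_{X|Y=0}(x) = \frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu_0)^T \Sigma^{-1} (x - \mu_0)\right)$, or equivalently $p_{X|Y}(x \mid 0) = \frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu_0)^T \Sigma^{-1} (x - \mu_0)\right)$. (Feng Li (SDU), GDA, NB and EM, September 27, 2023)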

0 points | 122 pages | 1.35 MB | 1 year ago
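The density and Mahalanobis distance quoted in the snippet above can be checked numerically. The following is a minimal sketch restricted to $n = 2$ so that the inverse and determinant of $\Sigma$ can be written out by hand; the function names `mahalanobis_sq_2d` and `gaussian_pdf_2d` are illustrative and do not come from the lecture slides, and $\Sigma$ is assumed to be invertible.

```python
import math


def mahalanobis_sq_2d(x, mu, sigma):
    """Squared Mahalanobis distance r^2 = (x - mu)^T Sigma^{-1} (x - mu) for n = 2."""
    (a, b), (c, d) = sigma
    det = a * d - b * c  # |Sigma|, assumed non-zero
    dx0, dx1 = x[0] - mu[0], x[1] - mu[1]
    # For a 2x2 matrix, Sigma^{-1} = (1 / det) * [[d, -b], [-c, a]].
    return (dx0 * (d * dx0 - b * dx1) + dx1 * (-c * dx0 + a * dx1)) / det


def gaussian_pdf_2d(x, mu, sigma):
    """Multivariate Gaussian density for n = 2."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    n = 2
    normaliser = (2 * math.pi) ** (n / 2) * math.sqrt(det)
    return math.exp(-0.5 * mahalanobis_sq_2d(x, mu, sigma)) / normaliser


if __name__ == "__main__":
    identity = ((1.0, 0.0), (0.0, 1.0))
    print(mahalanobis_sq_2d((1.0, 2.0), (0.0, 0.0), identity))
    print(gaussian_pdf_2d((0.0, 0.0), (0.0, 0.0), identity))
```

With the identity covariance the Mahalanobis distance reduces to the ordinary squared Euclidean distance, which makes the first call easy to verify by hand.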
Lecture Notes on Gaussian Discriminant Analysis, Naive

… example. Our aim is to identify whether there is a cat in a given image. We assume $X = [X_1, X_2, \cdots, X_n]^T$ is a random variable representing the feature vector of the given image, and $Y \in \{0, 1\}$ is a random variable representing whether there is a cat in the given image. Now, given an image $x = [x_1, x_2, \cdots, x_n]^T$, our goal is to calculate $P(Y = y \mid X = x) = \frac{P(X = x \mid Y = y) P(Y = y)}{P(X = x)}$ (2), where $y \in \{0, 1\}$ … probability density function (PDF) is defined as $p_{X|Y}(x \mid 0) = \frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu_0)^T \Sigma^{-1} (x - \mu_0)\right)$ (6). • A3: $X \mid Y = 1 \sim \mathcal{N}(\mu_1, \Sigma)$: The conditional probability of continuous random variable …

0 points | 19 pages | 238.80 KB | 1 year ago
Lecture Notes on Support Vector Machine

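… is defined by $\omega^T x + b = 0$ (1), where $\omega \in \mathbb{R}^n$ is the outward-pointing normal vector and $b$ is the bias term. The hyperplane separates the $n$-dimensional space into two half-spaces, $H^+ = \{x \in \mathbb{R}^n \mid \omega^T x + b \ge 0\}$ and $H^- = \{x \in \mathbb{R}^n \mid \omega^T x + b < 0\}$, so that we can classify a given point $x_0 \in \mathbb{R}^n$ according to $\mathrm{sign}(\omega^T x_0 + b)$. Specifically, given a point $x_0 \in \mathbb{R}^n$, its label $y_0$ is defined as $y_0 = \mathrm{sign}(\omega^T x_0 + b)$, i.e. $y_0 = \begin{cases} 1, & \omega^T x_0 + b \ge 0 \\ -1, & \text{otherwise} \end{cases}$ (2). Given any $x_0 \in \mathbb{R}^n$, we can calculate the signed distance from $x_0$ to the hyperplane as $d_0 = \frac{\omega^T x_0 + b}{\|\omega\|} = \left(\frac{\omega}{\|\omega\|}\right)^T x_0 + \frac{b}{\|\omega\|}$ (3). The sign of …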

0 points | 18 pages | 509.37 KB | 1 year ago
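Equations (2) and (3) in the snippet above translate almost directly into code. The sketch below uses plain Python lists for vectors; the function names `classify` and `signed_distance` and the sample hyperplane are illustrative assumptions, not taken from the lecture notes.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def classify(omega, b, x0):
    """Label y0 = sign(omega^T x0 + b), with sign(0) taken as +1 as in equation (2)."""
    return 1 if dot(omega, x0) + b >= 0 else -1


def signed_distance(omega, b, x0):
    """Signed distance d0 = (omega^T x0 + b) / ||omega|| as in equation (3)."""
    return (dot(omega, x0) + b) / math.sqrt(dot(omega, omega))


if __name__ == "__main__":
    omega, b = (3.0, 4.0), -5.0  # hypothetical hyperplane 3*x1 + 4*x2 - 5 = 0
    for point in [(3.0, 4.0), (0.0, 0.0)]:
        print(point, classify(omega, b, point), signed_distance(omega, b, point))
```

Points in $H^+$ get a non-negative distance and the label 1, while points in $H^-$ get a negative distance and the label -1, so the sign of the distance and the label always agree.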
Machine Learning Course - Wenzhou University - 11 Machine Learning - Dimensionality Reduction

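… features; removing redundant features does not affect the results of machine learning computations. 1. Overview of dimensionality reduction: data visualization. t-distributed Stochastic Neighbor Embedding (t-SNE): t-SNE (TSNE) converts the similarities between data points into probabilities. Similarities in the original space are represented by Gaussian joint probabilities, and similarities in the embedding space by Student's t-distribution. Dimensionality reduction and visualization methods such as Isomap, LLE and their variants are better suited to unfolding a single continuous low-dimensional manifold. When the goal is to visualize the similarity relationships between samples accurately, however, as for the S-curve shown in the figure (different colours denote different classes of data), t-SNE performs better, because it focuses mainly on the local structure of the data. 1. Overview of dimensionality reduction: advantages and disadvantages. Advantages: • reducing the number of feature dimensions also reduces the space needed to store the dataset and the computation time needed for training; • reducing the dimensionality of the dataset's features helps visualize the data quickly; … matrix. SVD factorizes the matrix, and we define the SVD of the matrix $A$ as $A = U \Sigma V^T$, where $A$ is $m \times n$, $U$ is $m \times m$, $\Sigma$ is $m \times n$ and holds the singular values, and $V^T$ is $n \times n$. 2. SVD (singular value decomposition), notation: $A = U \Sigma V^T = \sigma_1 u_1 v_1^T + \cdots + \sigma_r u_r v_r^T$, where $U$ is an $m \times m$ matrix and each of its eigenvectors $u_i$ is called a left singular vector of $A$. $V$ is an …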

0 points | 51 pages | 3.14 MB | 1 year ago
Dive into Deep Learning v2.0 (动手学深度学习)

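… help, discuss the book, and find answers to your questions by engaging with the authors and the community. Discussions: https://discuss.d2l.ai/t/2086 (forum: https://discuss.d2l.ai/). Installation: We need to set up an environment to run Python, Jupyter Notebook, the relevant libraries, and the code this book requires, so that you can get started quickly and gain hands-on experience. Installing Miniconda … run conda activate d2l to activate the runtime environment. To exit the environment, run conda deactivate. Discussions: https://discuss.d2l.ai/t/2083. Notation: The notation used in this book is summarized below. Numbers: • $x$: scalar • $\mathbf{x}$: vector • $\mathbf{X}$: matrix • $\mathsf{X}$: tensor • $\mathbf{I}$: identity matrix • $x_i$, $[\mathbf{x}]_i$: the $i$-th element of vector $\mathbf{x}$ … correlation • $H(X)$: entropy of the random variable $X$ • $D_{\mathrm{KL}}(P \| Q)$: KL divergence of $P$ and $Q$. Complexity: • $\mathcal{O}$: big-O notation. Discussions: https://discuss.d2l.ai/t/2089. 1 Introduction: To this day, almost all the computer programs people commonly use have been written from scratch by software developers. Suppose, for example, that developers need to write a program to manage an online store. After some thought, they might come up with the following …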

0 points | 797 pages | 29.45 MB | 1 year ago
Fully Connected Neural Networks in Practice: PyTorch Edition

… multiplication between them: @ and the .matmul function denote matrix multiplication, while * and .mul denote element-wise multiplication (Chapter 1. Preparation):
    y = data_tensor @ data_tensor.T
    print(y)
    y = data_tensor * data_tensor
    print(y)
The outputs are, respectively: [[5, 11], [11, 25]] and [[5, … functionality, that is, it can export batch_size samples at a time. Note that Python packages already imported earlier are not imported again.
    from torch.utils.data import Dataset
    from torch.utils.data import DataLoader
As mentioned earlier, Dataset can store custom data; we can subclass Dataset and implement some fixed …
    download=True,  # download if not present in the root directory
    transform=ToTensor()
    )
    # display the data
    labels_map = {0: "T-Shirt", 1: "Trouser", 2: "Pullover", 3: "Dress", 4: "Coat", 5: "Sandal", 6: "Shirt" …

0 points | 29 pages | 1.40 MB | 1 year ago
Experiment 1: Linear Regression

… Linux from Octave-Forge). 2 Linear Regression: Recall that the linear regression model is $h_\theta(x) = \theta^T x = \sum_{j=0}^{n} \theta_j x_j$ (1), where $\theta$ is the parameter which we need to optimize and $x$ is the $(n + 1)$-dimensional … every example. To do this in Matlab/Octave, the command is
    m = length(y);        % store the number of training examples
    x = [ones(m, 1), x];  % add a column of ones to x
From this point on … implement linear regression for this problem. The linear regression model in this case is $h_\theta(x) = \theta^T x = \sum_{i=0}^{1} \theta_i x_i = \theta_0 + \theta_1 x_1$ (with $x_0 = 1$) (4). (1) Implement gradient descent using a learning rate of $\alpha = 0.07$ …

0 points | 7 pages | 428.11 KB | 1 year ago
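The gradient-descent step requested in the exercise can be sketched as follows. This is not the exercise's reference solution: it assumes the usual batch update $\theta_j := \theta_j - \alpha \frac{1}{m} \sum_i (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$ for the squared-error cost, which the snippet itself does not spell out, and it runs on made-up data rather than the exercise's dataset. The function name `gradient_descent` and the iteration count are likewise assumptions.

```python
def gradient_descent(xs, ys, alpha=0.07, iterations=5000):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent on squared error."""
    m = len(ys)  # number of training examples
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        # Prediction errors h(x_i) - y_i under the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m  # gradient w.r.t. theta0 (x_0 = 1)
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m  # gradient w.r.t. theta1
        # Update both parameters simultaneously.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


if __name__ == "__main__":
    # Hypothetical noise-free data generated from y = 1 + 2x.
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [3.0, 5.0, 7.0, 9.0]
    print(gradient_descent(xs, ys))
```

Because the sample data lie exactly on a line, the fitted parameters should approach the intercept 1 and slope 2; with real data the same loop applies, but the learning rate may need adjusting to the scale of the features.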
Machine Learning Course - Wenzhou University - 12 Machine Learning - Association Rules

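… Set k = k + 1 and go to step 2. 2. Apriori algorithm, worked example. First iteration: assume a support threshold of 2, create the itemsets of size 1, and compute their support.
Transactions (order ID: items): T1: 1 3 4; T2: 2 3 5; T3: 1 2 3 5; T4: 2 5; T5: 1 3 5.
C1 (itemset: support): {1}: 3; {2}: 3; {3}: 4; {4}: 1; {5}: 4.
F2 (itemset: support): {1,3}: 3; {1,5}: 2; {2,3}: 2; {2,5}: 3; {3,5}: 3.
C2 (itemset: support): {1,2}: 1; {1,3}: 3; {1,5}: 2; {2 … Again eliminate the itemsets whose support is less than 2, in this example {1,2}. Now let us look at what pruning is and how it makes Apriori one of the best algorithms for finding frequent itemsets. Pruning: we split the itemsets in C3 into subsets and eliminate those whose subsets have a support value less than 2. Is the itemset in F2? …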

0 points | 49 pages | 1.41 MB | 1 year ago
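The worked example above can be reproduced with a short script. The sketch below uses the five transactions and the support threshold of 2 from the snippet; the join and prune steps follow the standard textbook form of Apriori, which may differ in detail from the slides, and the names `apriori` and `TRANSACTIONS` are illustrative.

```python
from itertools import combinations

# The five transactions from the worked example.
TRANSACTIONS = [
    frozenset({1, 3, 4}),     # T1
    frozenset({2, 3, 5}),     # T2
    frozenset({1, 2, 3, 5}),  # T3
    frozenset({2, 5}),        # T4
    frozenset({1, 3, 5}),     # T5
]


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    items = sorted({item for t in transactions for item in t})
    candidates = {frozenset([item]) for item in items}  # C1
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}  # F_k
        frequent.update(level)
        k += 1
        previous = set(level)
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        joined = {a | b for a in previous for b in previous if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in previous for s in combinations(c, k - 1))
        }
    return frequent


if __name__ == "__main__":
    result = apriori(TRANSACTIONS, min_support=2)
    for itemset in sorted(result, key=lambda s: (len(s), sorted(s))):
        print(sorted(itemset), result[itemset])
```

The prune step is what removes candidates such as {1,2,3} before any counting: its subset {1,2} has support 1 and is therefore not in F2.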
Machine Learning Course - Wenzhou University - 09 Machine Learning - Support Vector Machine

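Any hyperplane can be described by the linear equation $w^T x + b = 0$. In two-dimensional space, the distance from a point $(x, y)$ to the line $Ax + By + C = 0$ is $\frac{|Ax + By + C|}{\sqrt{A^2 + B^2}}$. Extended to $n$-dimensional space, the distance from a point $x = (x_1, x_2, \ldots, x_n)$ to the hyperplane $w^T x + b = 0$ is $\frac{|w^T x + b|}{\|w\|}$, where $\|w\| = \sqrt{w_1^2 + \cdots + w_n^2}$. (Figure: the hyperplanes $w^T x + b = 0$, $w^T x + b = 1$ and $w^T x + b = -1$.) 1. Overview of support vector machines, background: As the figure shows, by the definition of a support vector, the distance from a support vector to the hyperplane is $d$, and the distance from any other point to the hyperplane is greater than $d$. The distance from each support vector to the hyperplane can be written as $d = \frac{|w^T x + b|}{\|w\|}$. This gives $\frac{w^T x + b}{\|w\|} \ge d$ for $y = 1$ and $\frac{w^T x + b}{\|w\|} \le -d$ for $y = -1$. For now we normalize so that $\|w\| d = 1$ (this is for convenience of derivation and optimization and does not affect the optimization of the objective function). Combining the two inequalities, we can write them compactly as $y(w^T x + b) \ge 1$. This gives the two hyperplanes above and below the maximum-margin hyperplane. 2. Linearly separable support vector machine …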

0 points | 29 pages | 1.51 MB | 1 year ago
Lecture 6: Support Vector Machine

The margin $\gamma^{(i)}$ is the signed distance between $x^{(i)}$ and the hyperplane: $\omega^T \left(x^{(i)} - \gamma^{(i)} \frac{\omega}{\|\omega\|}\right) + b = 0 \Rightarrow \gamma^{(i)} = \left(\frac{\omega}{\|\omega\|}\right)^T x^{(i)} + \frac{b}{\|\omega\|}$. Margin (Contd.): Geometric margin $\gamma^{(i)} = y^{(i)} \left(\left(\frac{\omega}{\|\omega\|}\right)^T x^{(i)} + \frac{b}{\|\omega\|}\right)$. Scaling $(\omega, b)$ does not change $\gamma^{(i)}$. With respect to the whole training set, the margin … (Feng Li (SDU), SVM, December 28, 2021)

0 points | 82 pages | 773.97 KB | 1 year ago
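The claim that scaling $(\omega, b)$ leaves the geometric margin unchanged is easy to verify numerically. In the sketch below, the training points and the hyperplane are made up for illustration. The snippet breaks off before defining the margin of the whole training set, so `dataset_margin` assumes the common convention of taking the smallest per-example margin.

```python
import math


def geometric_margin(omega, b, x, y):
    """gamma = y * ((omega / ||omega||)^T x + b / ||omega||)."""
    norm = math.sqrt(sum(w * w for w in omega))
    return y * (sum(w * xi for w, xi in zip(omega, x)) + b) / norm


def dataset_margin(omega, b, points, labels):
    """Assumed convention: the margin of a training set is the minimum over examples."""
    return min(geometric_margin(omega, b, x, y) for x, y in zip(points, labels))


if __name__ == "__main__":
    # Hypothetical hyperplane x1 + x2 - 1 = 0 and three labelled points.
    omega, b = (1.0, 1.0), -1.0
    points = [(2.0, 2.0), (0.0, 0.0), (3.0, 1.0)]
    labels = [1, -1, 1]
    scaled_omega = tuple(10.0 * w for w in omega)
    print(dataset_margin(omega, b, points, labels))
    print(dataset_margin(scaled_omega, 10.0 * b, points, labels))
```

Both calls print the same value, because the common scale factor cancels between the numerator and $\|\omega\|$.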
