The Support Vector Machine: Basic Concepts
The Support Vector Machine is one of the dual-purpose supervised machine learning algorithms: it serves both as a regression algorithm (the Support Vector Regressor) and as a classification algorithm (the Support Vector Classifier). This article will primarily focus on how SVM works as a classifier.
The Support Vector Machine, also known as a ‘Support Vector Network’, is a discriminative machine learning classification algorithm. It classifies data points into two classes at a time (this does not mean it is only a binary classifier; it simply separates data points two classes at a time), using a decision boundary, which in this case is a hyperplane. The primary objective of the Support Vector Classifier is to find the ‘Optimal Separating Hyperplane’ (decision boundary).
I have an article that briefly explains the difference between Generative and Discriminative Algorithms. Find the link embedded here and at the end of the article as well.
The Support Vector Machine (SVM) as a classifier can conveniently handle both linearly separable and non-linearly separable data points, using its superpower: the kernel trick.
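To make this concrete, here is a minimal sketch of what using the Support Vector Classifier looks like in code. It assumes scikit-learn is available; the toy dataset and parameter choices are my own illustration, not part of the original article.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# A small synthetic dataset standing in for real data.
X, y = make_classification(n_samples=200, n_features=4, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = SVC(kernel="rbf")        # the Support Vector Classifier; "rbf" is scikit-learn's default kernel
clf.fit(X_train, y_train)      # learns a separating boundary from the training vectors
print("Test accuracy:", clf.score(X_test, y_test))
```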
This unique algorithm was first introduced in the 1960s and later improved upon in the 1990s by Vapnik et al. It was the first algorithm at that time to beat neural networks at handwritten digit classification. It hence gained popularity and has since been used for several other use cases. It finds applications in areas such as:
- Face detection
- Bioinformatics (cancer classification, diabetes prediction, protein classification, et cetera)
- Text and hypertext classification
- Anomaly detection
- Clustering (known as Support Vector Clustering (SVC))
- Satellite image classification, et cetera.
So ‘Support Vector + Machine’ huh. What does it really mean? Let’s break down the basic terminologies.
SUPPORT VECTOR MACHINE BASIC TERMINOLOGY
Vector: These are simply the training examples/data points, also known as ‘feature vectors’ in machine learning.
Support + Vector: This is simply a subset of the data closest to the hyperplane/decision boundary.
These vectors are intuitively called ‘support vectors’ because they support the hyperplane/decision boundary and act as pillars for it.
The Hyperplane: In geometry, a hyperplane is an n-dimensional generalization of a plane: a subspace with one dimension less (n-1) than its ambient space. In one-dimensional space it is a point, in two-dimensional space it is a line, in three-dimensional space it is an ordinary plane, and in four or more dimensions it is called a ‘Hyperplane’. Take note of this, because it is really how the Support Vector Machine works behind the scenes; the dimensions are the features represented in the data. For example, say we want to carry out a classification problem and be able to tell whether a product gets purchased or not (a binary classification). If there is just one feature (say Gender) in the dataset, then we are in one-dimensional space and the separating boundary will be (n-1 = 0) a 0-dimensional object, just a point showing the separation of the classes (purchased or not). If there are two features (Age and Gender), we are in a two-dimensional space (2D), with Age and Gender on the x and y axes, and the decision boundary will be a simple line. Similarly, if there are three features (Age, Gender, Income), the decision boundary will be a two-dimensional plane inside the three-dimensional space (n-1). Furthermore, if the data points live in a four or more dimensional space, the boundary is called a ‘Hyperplane’ (with n-1 dimensions). It is important to note that the number of features for a given machine learning problem can be reduced using a technique called ‘feature selection’, as not all features are necessarily useful; some can be redundant and create unnecessary noise in the data.
The hyperplane is simply a concept that separates an n-dimensional space into two groups/halves. In machine learning terms, it is a form of decision boundary that algorithms like the Support Vector Machine use to classify or separate data points. It has two sides, a negative side and a positive side, and data points/instances can lie on either side, signifying the group/class they belong to.
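Concretely, a fitted linear SVM exposes its hyperplane as a weight vector w and a bias b, and the sign of w·x + b tells you which side of the boundary a point x falls on. A small sketch, assuming scikit-learn and a toy two-feature dataset of my own:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated clusters in two dimensions.
X, y = make_blobs(n_samples=100, centers=2, n_features=2, random_state=0)

clf = SVC(kernel="linear").fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]   # the hyperplane is the set of points where w . x + b = 0

scores = X @ w + b                       # positive on one side of the hyperplane, negative on the other
print(np.array_equal((scores > 0).astype(int), clf.predict(X)))  # the sign reproduces the predictions
```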
Margin: This is the distance between the decision boundary and the nearest/closest support vectors on either side. It can also be defined as the shortest distance between the hyperplane, with weights w and bias b, and the support vectors.
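In the usual textbook formulation (not spelled out in the original article), where the hyperplane is w·x + b = 0 and the support vectors satisfy |w·x + b| = 1, the distance of a point from the hyperplane and the margin come out as:

$$
d(\mathbf{x}) = \frac{\lvert \mathbf{w}^\top \mathbf{x} + b \rvert}{\lVert \mathbf{w} \rVert},
\qquad
\text{margin} = \frac{2}{\lVert \mathbf{w} \rVert}
$$

so maximizing the margin is equivalent to minimizing the norm of the weight vector w.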
Linearly Separable Data Points: Data points are said to be linearly separable if a separating boundary/hyperplane can easily be drawn that distinctly shows the different class groups. Linearly separable data points mostly require only linear machine learning classifiers, such as Logistic Regression.
Non-Linearly Separable Data Points: This is the exact opposite of linearly separable data points. Looking at the image below, notice that no matter how one tries to draw a straight line, some data points will be misclassified one way or the other. SVM has a special way of classifying this type of data: it uses kernel functions to represent the data points in a higher-dimensional space and then finds the optimal separating hyperplane.
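As a quick illustration (a sketch assuming scikit-learn; the two-moons dataset is my stand-in for the article's image of non-linearly separable points), a linear SVM struggles on such data while an RBF-kernel SVM handles it comfortably:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two interleaving half-moons: no single straight line separates them cleanly.
X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel).fit(X_train, y_train)
    print(kernel, "accuracy:", round(clf.score(X_test, y_test), 3))
# The RBF kernel typically scores noticeably higher on this data.
```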
Hard Margin: This is the type of margin used for linearly separable data points in the Support Vector Machine. Just as the name ‘Hard Margin’ suggests, it is very rigid in classification and can therefore result in overfitting. It works best when the data are linearly separable, without outliers or a lot of noise.
Soft Margin: This is the type of margin used for non-linearly separable data points. As the name implies, it is less rigid than the hard margin. It is robust to outliers and allows some misclassifications; however, it can also result in underfitting in some cases.
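In scikit-learn there is no separate hard-margin switch; a very large C value approximates a hard margin, while a small C gives a softer one. A sketch under that assumption, with parameter values of my own choosing:

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two clusters with some overlap, so a perfectly rigid boundary is a poor fit.
X, y = make_blobs(n_samples=100, centers=2, cluster_std=1.5, random_state=7)

for C in (1e6, 0.01):                     # huge C ~ hard margin, small C ~ soft margin
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # A softer margin tolerates more points inside or on the wrong side of it,
    # so it ends up relying on more support vectors.
    print(f"C={C}: {len(clf.support_vectors_)} support vectors")
```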
Maximum Margin Hyperplane: As mentioned earlier, the primary objective of the Support Vector Classifier is to find the optimal separating hyperplane, one that classifies the data points efficiently with as few errors as possible. The maximum margin hyperplane is the hyperplane drawn such that it gives the ‘largest margin’, and it is the one used for classification. It is the simplest way to classify data when they are linearly separable. The chosen hyperplane has the largest separating distance to the support vectors on both sides. Take a look at the image below: three separating hyperplanes are drawn; one does not accurately separate the data points (h1), while the other two (h2 and h3) do. The maximum margin hyperplane in this example, however, is h3.
Notice the difference in margin distance between h2 and h3.
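Once a linear SVM is fitted, both the support vectors and the width of the maximum margin can be read off the model. A sketch assuming scikit-learn, using the 2/||w|| relation mentioned under ‘Margin’ above:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=60, centers=2, random_state=3)

clf = SVC(kernel="linear", C=1e6).fit(X, y)   # near-hard margin so the geometry is clean
w = clf.coef_[0]

print("Number of support vectors:", len(clf.support_vectors_))
print("Margin width (2 / ||w||):", 2 / np.linalg.norm(w))
```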
The Kernel Trick: Kernels, or kernel functions, are the methods linear classifiers such as SVM use to classify non-linearly separable data points. This is done by representing the data points in a higher-dimensional space than the original one. For example, 1D data can be represented as 2D data in space, 2D data can be represented as 3D data, et cetera. So why is it called a ‘kernel trick’? SVM cleverly re-represents non-linear data points using any of the kernel functions in a way that makes it seem as though the data have been transformed, then finds the optimal separating hyperplane. In reality, however, the data points remain the same; they have not actually been transformed. This is why it is called a ‘kernel trick’.
A trick indeed. Don’t you agree?
The kernel trick offers a way to calculate relationships between data points using kernel functions, and represent the data in a more efficient way with less computation. Models that use this technique are called ‘kernelized models’.
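A small numerical sketch of the ‘trick’ (my own illustration, not from the original article): for two 2D points, the degree-2 polynomial kernel value equals an ordinary dot product taken after an explicit quadratic feature mapping, yet the kernel never has to build that mapping.

```python
import numpy as np

def phi(v):
    """Explicit quadratic feature map for a 2D point: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = v
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])

explicit = phi(x) @ phi(y)   # dot product in the higher-dimensional space
kernel   = (x @ y) ** 2      # homogeneous degree-2 polynomial kernel on the original points

print(explicit, kernel)      # both print 121.0: same value, no explicit transformation needed
```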
There are several functions SVM uses to perform this task. Some of the most common ones are:
Polynomial Kernel Function: This transforms the data points using the dot product raised to a chosen degree n, where n can be any value from 2, 3, et cetera, i.e. the transformation will be a squared product or higher. The data are therefore represented in a higher-dimensional space through the new, transformed points.
The Radial Basis Function (RBF): This function behaves like a ‘weighted nearest neighbour model’. It transforms the data by representing it in infinite dimensions, then uses the weighted nearest neighbours (the observations with the most influence on the new data point) for classification. The radial function can be either Gaussian or Laplacian, depending on a hyperparameter known as gamma. This is the most commonly used kernel.
The Sigmoid Function: Also known as the hyperbolic tangent function (tanh), it finds more application in neural networks as an activation function. This kernel is used in image classification.
The Linear Kernel: Used for linear data. It simply represents the data points using a linear relationship.
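The formula images from the original article did not survive; the common textbook forms of these kernels, written with the symbols the next paragraphs refer to (my reconstruction, so the exact notation may differ from the original), are roughly:

$$
\begin{aligned}
\text{Linear:}\quad & K(x, y) = x \cdot y \\
\text{Polynomial:}\quad & K(x, y) = (x \cdot y + k)^{p} \\
\text{Sigmoid:}\quad & K(x, y) = \tanh(\gamma \, x \cdot y + r)
\end{aligned}
$$

Here k, p, gamma and r are constants chosen by the user (or tuned via cross-validation).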
For the polynomial kernel, x and y represent pairs of observations (feature vectors), k represents the polynomial coefficient (a constant) and p represents the degree of the polynomial. Both k and p are chosen using cross-validation.
For the Radial Basis kernel, the most commonly used form is the Gaussian RBF.
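A standard way to write the Gaussian RBF (a reconstruction, since the original presents it as an image; gamma is the hyperparameter mentioned earlier) is:

$$
K(x, y) = \exp\!\left(-\gamma \,\lVert x - y \rVert^{2}\right), \qquad \gamma = \frac{1}{2\sigma^{2}}
$$

A larger gamma makes the kernel more local, so each training point influences only its close neighbours, while a smaller gamma produces a smoother decision boundary.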
The C-parameter: This is a regularization parameter used to prevent overfitting. It is inversely related to the margin: if a larger C value is chosen, the margin is smaller, and if a smaller C value is chosen, the margin is larger. It helps with the trade-off between bias and variance, which SVM, just like most machine learning algorithms, has to deal with as well.
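Because C trades bias against variance, in practice it is usually tuned rather than guessed. A sketch assuming scikit-learn's GridSearchCV; the dataset and grid values are arbitrary examples of mine:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=6, random_state=1)

# Search a coarse logarithmic grid of C values with 5-fold cross-validation.
search = GridSearchCV(SVC(kernel="rbf"), param_grid={"C": [0.01, 0.1, 1, 10, 100]}, cv=5)
search.fit(X, y)

print("Best C:", search.best_params_["C"], "| CV accuracy:", round(search.best_score_, 3))
```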
END NOTE
The aim of this article was to explain some basic terminologies associated with the Support Vector Machine. Please reference the links below for further study.
Thank you for reading!
CONNECT ON SOCIAL MEDIA
LinkedIn: www.linkedin.com/in/aminah-mardiyyah-rufa-i
Twitter: @diyyah92
REFERENCES AND RESOURCES
Translated from: https://medium.com/swlh/the-support-vector-machine-basic-concept-a5106bd3cc5f