Hyperparameter Tuning Using GridSearchCV and RandomizedSearchCV
In Machine Learning, a hyperparameter is a parameter whose value is used to control the learning process.
Hyperparameters can be classified as model hyperparameters, which cannot be inferred while fitting the machine to the training set because they refer to the model selection task, or algorithm hyperparameters, which in principle have no influence on the performance of the model but affect the speed and quality of the learning process. An example of a model hyperparameter is the topology and size of a neural network. Examples of algorithm hyperparameters are learning rate and mini-batch size.
Different model training algorithms require different hyperparameters; some simple algorithms (such as ordinary least squares regression) require none. Given these hyperparameters, the training algorithm learns the parameters from the data. For instance, LASSO is an algorithm that adds a regularization hyperparameter to ordinary least squares regression, which has to be set before estimating the parameters through the training algorithm.
A model hyperparameter is a configuration that is external to the model and whose value cannot be estimated from data.
- They are often used in processes to help estimate model parameters.
- They are often specified by the practitioner.
- They can often be set using heuristics.
- They are often tuned for a given predictive modeling problem.
We cannot know the best value for a model hyperparameter on a given problem. We may use rules of thumb, copy values used on other problems, or search for the best value by trial and error.
When a machine learning algorithm is tuned for a specific problem, such as when you are using a grid search or a random search, you are tuning the hyperparameters of the model in order to discover the parameters of the model that result in the most skilful predictions.
Here we are going to use the popular Iris flower dataset. So let's import our dataset.
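The original code was shown as a screenshot, so this is a rough sketch; the DataFrame name `df` and the target column name are assumptions:

```python
# Load the Iris dataset from scikit-learn into a pandas DataFrame.
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['flower'] = iris.target  # 0 = setosa, 1 = versicolor, 2 = virginica
```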
Let's take a first glance at our data.
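Something along these lines, assuming the `df` from the previous sketch:

```python
# Peek at the shape and the first few rows.
print(df.shape)
print(df.head())
```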
Before optimizing hyperparameters, let's import and train our model and see what the score is.
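A minimal sketch, reusing `iris` from above; the 70/30 split and the random seed are assumptions, since the original screenshot is not reproduced here:

```python
# Train a default SVC (kernel='rbf') and check accuracy on held-out data.
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=1)

model = SVC()
model.fit(X_train, y_train)
print(model.score(X_test, y_test))
```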
The results are really nice, but this won't be the case for every dataset.
Hyperparameter Tuning
In this blog we are going to tune 3 hyperparameters: kernel, C, and gamma.
Kernel
kernel: {‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’, ‘precomputed’}, default=’rbf’
Linear Kernel:
The most basic way to use an SVC is with a linear kernel, which means the decision boundary is a straight line (or a hyperplane in higher dimensions). Linear kernels are rarely used in practice; however, I wanted to show it here since it is the most basic version of SVC. As can be seen below, it is not very good at classifying because the data is not linear.
RBF Kernel:
Radial Basis Function is a commonly used kernel in SVC:
K(x, x′) = exp(−γ ‖x − x′‖²), where ‖x − x′‖² is the squared Euclidean distance between two data points x and x′. If this doesn't make sense, Sebastian's book has a full description. However, for this tutorial, it is only important to know that an SVC classifier using an RBF kernel has two parameters: gamma and C.
Gamma
Gamma is a parameter of the RBF kernel and can be thought of as the ‘spread’ of the kernel and therefore the decision region. When gamma is low, the ‘curve’ of the decision boundary is very low and thus the decision region is very broad. When gamma is high, the ‘curve’ of the decision boundary is high, which creates islands of decision-boundaries around data points. We will see this very clearly below.
C
C is a parameter of the SVC learner and is the penalty for misclassifying a data point. When C is small, the classifier is okay with misclassified data points (high bias, low variance). When C is large, the classifier is heavily penalized for misclassified data and therefore bends over backwards to avoid any misclassified data points (low bias, high variance).
Let's try different parameters and calculate the cross-validation score.
First: Kernel → Linear ; C → 10 ; Gamma → auto.
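A sketch of this first run, reusing `iris` and `SVC` from above; the fold count (cv=5) is an assumption:

```python
# 5-fold cross-validation: linear kernel, C=10, gamma='auto'.
from sklearn.model_selection import cross_val_score

print(cross_val_score(SVC(kernel='linear', C=10, gamma='auto'),
                      iris.data, iris.target, cv=5))
```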
Second: Kernel → RBF ; C → 10 ; Gamma → auto.
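Likewise for the second configuration:

```python
# Same data: rbf kernel, C=10.
print(cross_val_score(SVC(kernel='rbf', C=10, gamma='auto'),
                      iris.data, iris.target, cv=5))
```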
Third: Kernel → RBF ; C → 20 ; Gamma → auto.
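And the third:

```python
# Same data: rbf kernel, C=20.
print(cross_val_score(SVC(kernel='rbf', C=20, gamma='auto'),
                      iris.data, iris.target, cv=5))
```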
If you take the mean of all the scores, I think the first one did best of the three. Checking the score of each and every hyperparameter combination like this would take us a long time. Let's do this same exact thing in a loop.
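A sketch of that loop; the candidate values for kernel and C are assumptions inferred from the conclusion below:

```python
# Average cross-validation score for every kernel/C combination.
import numpy as np

avg_scores = {}
for kval in ['rbf', 'linear']:
    for cval in [1, 10, 20]:
        cv_scores = cross_val_score(SVC(kernel=kval, C=cval, gamma='auto'),
                                    iris.data, iris.target, cv=5)
        avg_scores[f'{kval}_C{cval}'] = np.mean(cv_scores)
print(avg_scores)
```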
From the above results we can say that rbf with C=1 or 10, or linear with C=1, will give the best performance.
GridSearchCV
GridSearchCV does exactly the same thing as the for loop above, but in a single line of code.
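A sketch, assuming the same parameter grid as the loop above:

```python
# GridSearchCV exhaustively tries every combination in the grid.
from sklearn.model_selection import GridSearchCV

clf = GridSearchCV(SVC(gamma='auto'),
                   {'C': [1, 10, 20], 'kernel': ['rbf', 'linear']},
                   cv=5, return_train_score=False)
clf.fit(iris.data, iris.target)
print(clf.cv_results_)  # a large dict of per-combination results
```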
This output is a lot to take in, but we can convert this data into a pandas DataFrame. Let's see how.
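For example:

```python
# cv_results_ is a dict of lists, so it converts directly to a DataFrame.
results = pd.DataFrame(clf.cv_results_)
print(results.head())
```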
We don't need each and every piece of information; we just want 'param_C', 'param_kernel', and 'mean_test_score'.
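Selecting just those columns:

```python
# Keep only the hyperparameter values and the mean test score.
print(results[['param_C', 'param_kernel', 'mean_test_score']])
```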
We can see our first 3 rows are performing best.
GridSearchCV also gives many handy attributes to analyse our scores.
best_params_ gives the best parameters for our model.
best_score_ gives the highest score.
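For instance, continuing with the fitted `clf` from above:

```python
print(clf.best_params_)  # the winning kernel/C combination
print(clf.best_score_)   # the best mean cross-validated score
```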
There are many more attributes you can get from the command below.
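A quick way to see them all:

```python
# dir() lists every attribute and method on the fitted search object.
print(dir(clf))
```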
RandomizedSearchCV
Use RandomizedSearchCV to reduce the number of iterations by trying random combinations of parameters. This is useful when you have too many parameters to try and your training time is long. It helps reduce the cost of computation.
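A sketch with the same grid; n_iter (how many random combinations to sample) and the seed are assumptions:

```python
# RandomizedSearchCV samples n_iter combinations instead of trying all of them.
from sklearn.model_selection import RandomizedSearchCV

rs = RandomizedSearchCV(SVC(gamma='auto'),
                        {'C': [1, 10, 20], 'kernel': ['rbf', 'linear']},
                        cv=5, n_iter=2, random_state=0)
rs.fit(iris.data, iris.target)
print(pd.DataFrame(rs.cv_results_)[['param_C', 'param_kernel', 'mean_test_score']])
```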
You can access the full source code here or click this link.
Translated from: https://medium.com/jovianml/hyperparameter-tuning-using-gridsearchcv-and-randomizedsearchcv-1777f42af54c