Machine Learning in Action - KNN Algorithm - 20
Machine Learning in Action - KNN - Iris Classification
```python
# Import the algorithm package and the datasets
from sklearn import neighbors
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
import random

# Load the data
iris = datasets.load_iris()
print(iris)

# Shuffle the data, then split it
# Alternatively, in one step (20% test, 80% training):
# x_train, x_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2)

# Shuffle the data
data_size = iris.data.shape[0]
index = [i for i in range(data_size)]
random.shuffle(index)
iris.data = iris.data[index]
iris.target = iris.target[index]

# Split into training and test sets
test_size = 40
x_train = iris.data[test_size:]
x_test = iris.data[:test_size]
y_train = iris.target[test_size:]
y_test = iris.target[:test_size]

# Build the model
model = neighbors.KNeighborsClassifier(n_neighbors=3)
model.fit(x_train, y_train)
prediction = model.predict(x_test)

print(classification_report(y_test, prediction))
```
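The manual index shuffle above can be replaced by the commented-out `train_test_split` call, which shuffles before splitting. A minimal sketch on the same iris data, reusing the `test_size=0.2` and `n_neighbors=3` values from the code above (the `random_state=0` seed is an assumption added for reproducibility):

```python
from sklearn import datasets, neighbors
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

iris = datasets.load_iris()

# train_test_split shuffles the rows before splitting, so no manual
# index shuffle is needed; random_state makes the split reproducible.
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=0)

model = neighbors.KNeighborsClassifier(n_neighbors=3)
model.fit(x_train, y_train)
print(classification_report(y_test, model.predict(x_test)))
```

This gives the same kind of 80/20 split in three lines and avoids mutating `iris.data` and `iris.target` in place.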
Machine Learning in Action - KNN - Fruit Classification

```python
from sklearn.neighbors import KNeighborsClassifier
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

data = pd.read_csv('fruit_data.csv')

# Encode the fruit-name labels in the first column as integers
labelencoder = LabelEncoder()
data.iloc[:, 0] = labelencoder.fit_transform(data.iloc[:, 0])
print(labelencoder.classes_)

# Split the data. stratify=y keeps the class proportions in the train and
# test sets the same as in y before the split: if zeros and ones in y are
# 1:2, they are also 1:2 in y_train and y_test.
# Setting random_state fixes the random split so it is reproducible.
x_train, x_test, y_train, y_test = train_test_split(
    data.iloc[:, 1:], data.iloc[:, 0],
    test_size=0.3, stratify=data.iloc[:, 0], random_state=20)

# Test-set accuracy for each k
test_scores = []
# Training-set accuracy for each k
train_scores = []

# Try k = 1 ... 29
k = 30
for i in range(1, k):
    knn = KNeighborsClassifier(n_neighbors=i)
    knn.fit(x_train, y_train)
    # Record test-set accuracy
    test_scores.append(knn.score(x_test, y_test))
    # Record training-set accuracy
    train_scores.append(knn.score(x_train, y_train))

plt.title('k-NN Varying number of neighbors')
plt.plot(range(1, k), test_scores, label="Test")
plt.plot(range(1, k), train_scores, label="Train")
plt.legend()
plt.xticks(range(1, k))
plt.xlabel('k')
plt.ylabel('accuracy')
plt.show()

# Pick the best k as the model parameter
k = np.argmax(test_scores) + 1
knn = KNeighborsClassifier(n_neighbors=k)
knn.fit(x_train, y_train)
print(k)
print(knn.score(x_test, y_test))
```
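Picking k by test-set accuracy, as above, lets the test set leak into model selection, so the final score is optimistic. A common alternative is to choose k by cross-validation on the training data and only then touch the test set. A sketch of this, using the built-in iris data as a stand-in since `fruit_data.csv` is not included here (the split parameters mirror the fruit example):

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = datasets.load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3,
    stratify=iris.target, random_state=20)

# 5-fold cross-validated accuracy on the training set for each k
ks = range(1, 30)
cv_scores = []
for i in ks:
    knn = KNeighborsClassifier(n_neighbors=i)
    cv_scores.append(cross_val_score(knn, x_train, y_train, cv=5).mean())

# Choose k by CV score, then report accuracy on the untouched test set
best_k = ks[int(np.argmax(cv_scores))]
knn = KNeighborsClassifier(n_neighbors=best_k).fit(x_train, y_train)
print(best_k, knn.score(x_test, y_test))
```

The test set here is used exactly once, so its accuracy is an honest estimate of how the chosen k generalizes.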
Summary

That is all of Machine Learning in Action - KNN Algorithm - 20; hopefully it helps you solve the problems you have run into.