KNN in Python: A Code Implementation
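KNN (k-nearest neighbors) is a supervised learning method in machine learning. Its core idea is "birds of a feather flock together": a sample is assigned the label that is most common among its K nearest neighbors.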
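Basic workflow of a supervised learning algorithm
1. Normalize the sample dataset.
2. Split the dataset into a training set and a test set.
3. Use the training set as the reference for the algorithm and the test set to test it.
4. Evaluate accuracy as the ratio of correctly predicted labels to the total number of test samples.
5. Tune the parameters to find the best-performing setting.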
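Steps 1 to 4 of the workflow above can be sketched in a few lines. This is a minimal illustration, not the article's own code: the choice of `MinMaxScaler` for normalization is an assumption (the text does not name a method), and the scaler is fitted on the training split only so that test-set statistics do not leak into training, which means the split is done before the scaling here.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler

X, y = load_iris(return_X_y=True)

# Step 2: split into training and test sets (default: 25% test, 75% training)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2003)

# Step 1: normalize each feature to [0, 1]
# (assumption: min-max scaling, fitted on the training data only)
scaler = MinMaxScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Step 3: the training set is the reference; the test set checks the model
clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X_train_s, y_train)

# Step 4: accuracy = correctly predicted labels / number of test samples
acc = accuracy_score(y_test, clf.predict(X_test_s))
print("Accuracy is: %.3f" % acc)
```

Because KNN relies on distances, features with larger numeric ranges would otherwise dominate the neighbor search, which is why normalization comes first in the workflow.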
Code implementation
1. Classification with the built-in KNN function (the data is the classic iris dataset, a three-class problem)
# Import the required libraries
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
import numpy as np
# Load the data X, y
iris = datasets.load_iris()
X = iris.data
y = iris.target
print(X, y)
# Split the data into training and test sets (by default 25% test data, 75% training data).
# random_state=2003 is the random seed; for its role see https://www.jianshu.com/p/4deb2cb2502f
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2003)
# Build the KNN model with K (n_neighbors) set to 3, then train it (fit)
clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X_train, y_train)
# Compute the accuracy
from sklearn.metrics import accuracy_score
correct = np.count_nonzero(clf.predict(X_test) == y_test)
print("Accuracy is: %.3f" % (correct / len(X_test)))
# Equivalent: accuracy_score(y_test, clf.predict(X_test))
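The classifier above fixes K at 3. The last step of the workflow, tuning parameters, could be sketched with a cross-validated grid search over `n_neighbors`. The candidate values and the 5-fold setting below are illustrative assumptions, not values taken from the article.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2003)

# Candidate K values (assumption: small odd numbers to reduce voting ties)
param_grid = {"n_neighbors": [1, 3, 5, 7, 9, 11]}

# 5-fold cross-validation on the training set only, so the test set
# stays untouched until the final evaluation
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
search.fit(X_train, y_train)

best_k = search.best_params_["n_neighbors"]
test_accuracy = search.score(X_test, y_test)
print("Best K: %d" % best_k)
print("Test accuracy: %.3f" % test_accuracy)
```

Selecting K by cross-validation rather than by test-set accuracy keeps the reported test accuracy an honest estimate of performance on unseen data.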
2. Writing a KNN algorithm from scratch
# Import the required libraries and load the data
from sklearn import datasets
from collections import Counter  # used for majority voting
from sklearn.model_selection import train_test_split
import numpy as np
# Load the iris data
iris = datasets.load_iris()
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2003)
# Euclidean distance function and classification function
def euc_dis(instance1, instance2):
    """
    Compute the Euclidean distance between two samples, instance1 and instance2.
    instance1: the first sample, array type
    instance2: the second sample, array type
    """
    dist = np.sqrt(np.sum((instance1 - instance2) ** 2))
    return dist
def knn_classify(X, y, testInstance, k):
    """
    Given a test sample testInstance, predict its label with the KNN algorithm.
    X: features of the training data
    y: labels of the training data
    testInstance: the test data, assumed here to be a single sample of array type
    k: how many neighbors to use
    """
    # Return the predicted label of testInstance, one of {0, 1, 2}
    distances = [euc_dis(x, testInstance) for x in X]
    # Indices of the k training samples with the smallest distances
    kneighbors = np.argsort(distances)[:k]
    # Majority vote over the labels of those neighbors
    count = Counter(y[kneighbors])
    return count.most_common()[0][0]
# Predict the test set and compute the accuracy
predictions = [knn_classify(X_train, y_train, data, 3) for data in X_test]
correct = np.count_nonzero(np.array(predictions) == y_test)
print("Accuracy is: %.3f" % (correct / len(X_test)))
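The from-scratch classifier computes distances one training row at a time in a Python list comprehension. As a possible variation, the same Euclidean distance and majority vote can be written with NumPy broadcasting so that all distances are computed in one array operation. The function name `knn_classify_vectorized` and the toy data are hypothetical and used only for illustration.

```python
from collections import Counter

import numpy as np


def knn_classify_vectorized(X, y, test_instance, k):
    """Predict the label of one sample by majority vote among its k nearest rows of X."""
    # Broadcasting subtracts test_instance from every row of X at once
    distances = np.sqrt(np.sum((X - test_instance) ** 2, axis=1))
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Majority vote over the neighbors' labels
    return Counter(y[nearest].tolist()).most_common(1)[0][0]


# Toy data: two well-separated clusters with labels 0 and 1
X_toy = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                  [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

pred_a = knn_classify_vectorized(X_toy, y_toy, np.array([0.5, 0.5]), 3)
pred_b = knn_classify_vectorized(X_toy, y_toy, np.array([5.5, 5.5]), 3)
print(pred_a, pred_b)  # prints: 0 1
```

The result for each query point is the same as with the row-by-row loop; only the way the distances are computed changes.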