4. A Simple Neural Network (MLP Classification Basics)
Contents
1. Neural networks (Artificial Neural Network)
2. Introduction to MLP
3. The MLP method
4. A simple MLP binary classification code example
1. Neural networks (Artificial Neural Network)
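The full name is artificial neural network: a mathematical or computational model that imitates the structure and function of a biological neural network (the brain).
Biological nerve cells:
Nerve cells are the basic units that make up the nervous system. They are called biological neurons, or neurons for short.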
# Simple neural network
A single neuron with an S-shaped (sigmoid) function.
The simplest neural network is logistic regression.
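To make the link between a single neuron and logistic regression concrete, here is a minimal sketch in plain Python. The weights, bias and input values are hypothetical and chosen only for illustration; they do not come from the article's dataset.

```python
import math

def sigmoid(z):
    """S-shaped (logistic) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """One neuron: weighted sum of the inputs plus a bias, passed through sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

# Hypothetical parameters, for illustration only.
weights = [2.0, -1.0]
bias = 0.5

# z = 2.0 * 1.0 + (-1.0) * 3.0 + 0.5 = -0.5
probability = neuron([1.0, 3.0], weights, bias)

# Logistic regression turns the probability into a class with a 0.5 threshold.
label = 1 if probability >= 0.5 else 0
print(probability, label)
```

This is exactly the prediction step of logistic regression; training would adjust `weights` and `bias`, which this sketch leaves out.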
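# Binary is the link between biology and computer science.
# More hidden layers are not always better. As the number of layers grows, the model reaches a threshold beyond which the score barely changes or may even drop.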
2. Introduction to MLP
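An MLP is often used for classification, where each output corresponds to a different binary classification (for example spam / normal mail, urgent / not urgent).
When the classes are mutually exclusive, the output layer is usually replaced by a single shared softmax function.
Example: classifying images of digits.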
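The shared softmax output layer mentioned above can be sketched in a few lines of plain Python. The raw scores are hypothetical; in a real network they would be the outputs of the last layer before softmax is applied.

```python
import math

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing; the result is unchanged.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw outputs for three mutually exclusive classes.
scores = [1.0, 2.0, 0.5]
probs = softmax(scores)

# The predicted class is the one with the highest probability.
predicted = probs.index(max(probs))
print(probs, predicted)
```

Because the probabilities sum to 1, raising the score of one class lowers the probability of the others, which is what makes softmax suitable for mutually exclusive classes such as the digits 0 to 9.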
3. The MLP method
sklearn.neural_network.MLPClassifier
MLPClassifier(solver='sgd', activation='relu', alpha=1e-4, hidden_layer_sizes=(50, 50), random_state=1, max_iter=10, verbose=10, learning_rate_init=.1)
Parameters:
1. hidden_layer_sizes: for example, hidden_layer_sizes=(50, 50) means two hidden layers, the first with 50 neurons and the second also with 50 neurons.
2. activation: the activation function, one of {'identity', 'logistic', 'tanh', 'relu'}, default 'relu'
- identity: f(x) = x
- logistic: this is simply the sigmoid, f(x) = 1 / (1 + exp(-x))
- tanh: f(x) = tanh(x)
- relu: f(x) = max(0, x)
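3. solver: one of {'lbfgs', 'sgd', 'adam'}, default 'adam'; the optimizer used to fit the weights
- lbfgs: an optimizer from the family of quasi-Newton methods
- sgd: stochastic gradient descent
- adam: a stochastic gradient-based optimizer proposed by Diederik Kingma and Jimmy Ba
Note: the default solver 'adam' works well on relatively large datasets (a few thousand samples or more). On small datasets, 'lbfgs' can converge faster and perform better.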
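4. alpha: float, optional, default 0.0001; strength of the L2 regularization term
5. batch_size: int, optional, default 'auto'; size of the minibatches for the stochastic optimizers. With 'auto', batch_size=min(200, n_samples). If solver is 'lbfgs', the classifier does not use minibatches.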
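6. learning_rate: the learning-rate schedule for weight updates, used only when solver is 'sgd'; one of {'constant', 'invscaling', 'adaptive'}, default 'constant'
- 'constant': a constant learning rate given by 'learning_rate_init'
- 'invscaling': gradually decreases the learning rate at each time step t using an inverse scaling exponent 'power_t': effective_learning_rate = learning_rate_init / pow(t, power_t)
- 'adaptive': keeps the learning rate at 'learning_rate_init' as long as the training loss keeps decreasing. Each time two consecutive epochs fail to decrease the training loss by at least tol, or fail to increase the validation score by at least tol, the current learning rate is divided by 5.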
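The formulas in the parameter list above can be written out directly. The sketch below is plain Python for illustration, not scikit-learn's internal code; the helper names `ACTIVATIONS`, `invscaling_rate` and `auto_batch_size` are made up for this example, and `power_t=0.5` is assumed as the default exponent.

```python
import math

# The four activation functions, keyed by the names MLPClassifier accepts.
ACTIVATIONS = {
    "identity": lambda x: x,
    "logistic": lambda x: 1.0 / (1.0 + math.exp(-x)),
    "tanh": math.tanh,
    "relu": lambda x: max(0.0, x),
}

def invscaling_rate(learning_rate_init, t, power_t=0.5):
    """'invscaling' schedule: learning_rate_init / pow(t, power_t)."""
    return learning_rate_init / pow(t, power_t)

def auto_batch_size(n_samples):
    """batch_size='auto' rule: min(200, n_samples)."""
    return min(200, n_samples)

for name, f in ACTIVATIONS.items():
    print(name, f(-1.0), f(1.0))

# The effective learning rate shrinks as the time step t grows.
print([invscaling_rate(0.1, t) for t in (1, 4, 16)])

print(auto_batch_size(150), auto_batch_size(1000))
```

The loop over `ACTIVATIONS` shows the practical difference between the options: relu zeroes out negative inputs, logistic squeezes everything into (0, 1), and tanh into (-1, 1).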
For a detailed introduction to this method, see: https://blog.csdn.net/u011311291/article/details/78743393
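As a quick way to try the constructor described in this section, the sketch below fits an MLPClassifier on a small synthetic dataset. The data from `make_classification`, the train/test split and the variable names are assumptions made for this example. `max_iter` is raised and `learning_rate_init` lowered relative to the call shown earlier, as conservative choices for this toy data; the other arguments follow the ones listed above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic binary classification data (an assumption for this sketch).
X, y = make_classification(n_samples=500, n_features=10, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1
)

model = MLPClassifier(
    solver="sgd",                 # stochastic gradient descent
    activation="relu",
    alpha=1e-4,                   # L2 regularization strength
    hidden_layer_sizes=(50, 50),  # two hidden layers with 50 neurons each
    random_state=1,
    max_iter=300,                 # assumption: more epochs than the call above
    learning_rate_init=0.01,      # assumption: smaller step for this toy data
)
model.fit(X_train, y_train)

# Accuracy on data the model has not seen during training.
test_score = model.score(X_test, y_test)
print(test_score)

# One weight matrix per layer transition: input->50, 50->50, 50->output.
print([w.shape for w in model.coefs_])
```

Scoring on a held-out split rather than on the training data gives a more honest picture of how the network generalizes.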
4. A simple MLP binary classification code example
import pandas

data = pandas.read_csv('D:\\DATA\\pycase\\number2\\4.5\\Data.csv')

# 1. Data quality analysis (missing values, outliers, consistency):
#    basic description and null check
explore = data.describe()

# The dataset is large and its categorical features are hard to impute,
# so rows with missing values are dropped.
data = data.dropna()
data.shape

# 2. Data transformation
# Convert the discrete features to dummy variables.
# The column list is defined separately so it can be reused for prediction later.
dummyColumns = ['Gender', 'Home Ownership', 'Internet Connection', 'Marital Status',
                'Movie Selector', 'Prerec Format', 'TV Signal']

# Convert each selected column to the category type
for column in dummyColumns:
    data[column] = data[column].astype('category')

dummiesData = pandas.get_dummies(
    data,                  # the DataFrame to process
    columns=dummyColumns,  # columns to process; if omitted, all suitable columns are processed
    prefix_sep=" ",        # separator between prefix and value, default is an underscore
    drop_first=True        # drop the first level to avoid collinearity when modelling
)

# Using Gender as an example, check the result with unique().
# A column can be accessed in two ways: with "." or with [].
dummiesData['Gender Male'].unique()
data.Gender.unique()
data['Gender'].unique()

# Education levels, from highest to lowest:
# Post-Doc, Doctorate, Master's Degree, Bachelor's Degree, Associate's Degree,
# Some College, Trade School, High School, Grade School

# Map ordered discrete values to numbers
educationLevelDict = {
    'Post-Doc': 9,
    'Doctorate': 8,
    'Master\'s Degree': 7,  # the backslash escapes the apostrophe
    'Bachelor\'s Degree': 6,
    'Associate\'s Degree': 5,
    'Some College': 4,
    'Trade School': 3,
    'High School': 2,
    'Grade School': 1
}

# Add the numeric column
dummiesData['Education Level Map'] = dummiesData['Education Level'].map(educationLevelDict)

# The other ordered variables are handled the same way
freqMap = {'Never': 0, 'Rarely': 1, 'Monthly': 2, 'Weekly': 3, 'Daily': 4}

dummiesData['PPV Freq Map'] = dummiesData['PPV Freq'].map(freqMap)
dummiesData['Theater Freq Map'] = dummiesData['Theater Freq'].map(freqMap)
dummiesData['TV Movie Freq Map'] = dummiesData['TV Movie Freq'].map(freqMap)
dummiesData['Prerec Buying Freq Map'] = dummiesData['Prerec Buying Freq'].map(freqMap)
dummiesData['Prerec Renting Freq Map'] = dummiesData['Prerec Renting Freq'].map(freqMap)
dummiesData['Prerec Viewing Freq Map'] = dummiesData['Prerec Viewing Freq'].map(freqMap)

dummiesData.columns

# Select the feature columns
dummiesSelect = [
    'Age', 'Num Bathrooms', 'Num Bedrooms', 'Num Cars', 'Num Children', 'Num TVs',
    'Education Level Map', 'PPV Freq Map', 'Theater Freq Map', 'TV Movie Freq Map',
    'Prerec Buying Freq Map', 'Prerec Renting Freq Map', 'Prerec Viewing Freq Map',
    'Gender Male',
    'Internet Connection DSL', 'Internet Connection Dial-Up',
    'Internet Connection IDSN', 'Internet Connection No Internet Connection',
    'Internet Connection Other',
    'Marital Status Married', 'Marital Status Never Married',
    'Marital Status Other', 'Marital Status Separated',
    'Movie Selector Me', 'Movie Selector Other', 'Movie Selector Spouse/Partner',
    'Prerec Format DVD', 'Prerec Format Laserdisk', 'Prerec Format Other',
    'Prerec Format VHS', 'Prerec Format Video CD',
    'TV Signal Analog antennae', 'TV Signal Cable',
    'TV Signal Digital Satellite', 'TV Signal Don\'t watch TV'
]
inputData = dummiesData[dummiesSelect]

# Select the target column. Its name contains a space, so [] is required.
outputData = dummiesData['Home Ownership Rent']

# Import the neural network classifier
from sklearn.neural_network import MLPClassifier

# Try hidden layer sizes from 1 to 10 neurons
for i in range(1, 11):
    ANNModel = MLPClassifier(
        activation='relu',        # activation function, playing the role of the S-shaped function
        hidden_layer_sizes=(i,)   # one hidden layer with i neurons; larger networks cost
                                  # much more to compute, and bigger is not always better
    )
    ANNModel.fit(inputData, outputData)
    score = ANNModel.score(inputData, outputData)
    print(str(i) + "," + str(score))  # print the score for each hidden layer size

# Load new data for prediction
newData = pandas.read_csv('D:\\DATA\\pycase\\number2\\4.4\\newData.csv')
newData = newData.dropna()

# Convert the discrete columns to categories, reusing the categories of the training sample
for column in dummyColumns:
    newData[column] = pandas.Categorical(
        newData[column],
        categories=data[column].cat.categories
    )

# Add the mapped columns using the dictionaries built from the sample
newData['Education Level Map'] = newData['Education Level'].map(educationLevelDict)
newData['PPV Freq Map'] = newData['PPV Freq'].map(freqMap)
newData['Theater Freq Map'] = newData['Theater Freq'].map(freqMap)
newData['TV Movie Freq Map'] = newData['TV Movie Freq'].map(freqMap)
newData['Prerec Buying Freq Map'] = newData['Prerec Buying Freq'].map(freqMap)
newData['Prerec Renting Freq Map'] = newData['Prerec Renting Freq'].map(freqMap)
newData['Prerec Viewing Freq Map'] = newData['Prerec Viewing Freq'].map(freqMap)

dummiesNewData = pandas.get_dummies(
    newData,
    columns=dummyColumns,
    prefix=dummyColumns,  # column-name prefixes, useful when several columns share the same values
    prefix_sep=" ",
    drop_first=True
)
inputNewData = dummiesNewData[dummiesSelect]

# Predict on the new data
ANNModel.predict(inputNewData)
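The step that reuses the training categories for the new data matters because get_dummies only creates columns for the categories it knows about. The sketch below shows the effect on two tiny hypothetical DataFrames (`train` and `new` are made up for this example), using `pandas.CategoricalDtype` as one way to carry the categories over.

```python
import pandas

# Hypothetical data: the new batch happens to contain only one gender.
train = pandas.DataFrame({"Gender": ["Male", "Female", "Male"]})
new = pandas.DataFrame({"Gender": ["Male", "Male"]})

train["Gender"] = train["Gender"].astype("category")

# Without alignment, "Male" is the only (and therefore first) level in the
# new data, so drop_first=True removes it and no dummy column is left.
unaligned = pandas.get_dummies(
    new, columns=["Gender"], prefix_sep=" ", drop_first=True
)

# With alignment, the new data shares the training categories.
shared_dtype = pandas.CategoricalDtype(categories=train["Gender"].cat.categories)
new["Gender"] = new["Gender"].astype(shared_dtype)

train_dummies = pandas.get_dummies(
    train, columns=["Gender"], prefix_sep=" ", drop_first=True
)
new_dummies = pandas.get_dummies(
    new, columns=["Gender"], prefix_sep=" ", drop_first=True
)

print(list(unaligned.columns))
print(list(train_dummies.columns), list(new_dummies.columns))
```

With matching columns on both sides, selecting `dummiesSelect` from the new data no longer fails when a category is absent from the new batch.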