生活随笔

This article, collected and organized here, mainly introduces a logistic regression case study. The editor thought it was quite good and is sharing it as a reference.
Note: this is a classroom example from the Heima (黑马) course, reposted here only for convenient reference.

Logistic Regression Case Study
```python
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Column names for the Breast Cancer Wisconsin dataset (the raw file has no header row)
names = ['Sample code number', 'Clump Thickness', 'Uniformity of Cell Size', 'Uniformity of Cell Shape',
         'Marginal Adhesion', 'Single Epithelial Cell Size', 'Bare Nuclei', 'Bland Chromatin',
         'Normal Nucleoli', 'Mitoses', 'Class']

# Load the data directly from the UCI Machine Learning Repository
data = pd.read_csv(
    'https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data',
    names=names)
data.head()
```
|   | Sample code number | Clump Thickness | Uniformity of Cell Size | Uniformity of Cell Shape | Marginal Adhesion | Single Epithelial Cell Size | Bare Nuclei | Bland Chromatin | Normal Nucleoli | Mitoses | Class |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1000025 | 5 | 1 | 1 | 1 | 2 | 1 | 3 | 1 | 1 | 2 |
| 1 | 1002945 | 5 | 4 | 4 | 5 | 7 | 10 | 3 | 2 | 1 | 2 |
| 2 | 1015425 | 3 | 1 | 1 | 1 | 2 | 2 | 3 | 1 | 1 | 2 |
| 3 | 1016277 | 6 | 8 | 8 | 1 | 3 | 4 | 3 | 7 | 1 | 2 |
| 4 | 1017023 | 4 | 1 | 1 | 3 | 2 | 1 | 3 | 1 | 1 | 2 |
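Before cleaning anything it helps to check how pandas parsed the columns. A minimal optional sketch (using the `data` DataFrame loaded above):

```python
# The raw file marks missing values in 'Bare Nuclei' with '?', so pandas
# reads that column as object (strings) rather than integers.
print(data.shape)
print(data.dtypes)
print((data['Bare Nuclei'] == '?').sum())  # how many '?' placeholders there are
```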
```python
# The 'Bare Nuclei' column uses '?' for missing values; replace them with NaN and drop those rows
data = data.replace(to_replace='?', value=np.nan)
data = data.dropna()
data.describe()
```
|   | Sample code number | Clump Thickness | Uniformity of Cell Size | Uniformity of Cell Shape | Marginal Adhesion | Single Epithelial Cell Size | Bland Chromatin | Normal Nucleoli | Mitoses | Class |
|---|---|---|---|---|---|---|---|---|---|---|
| count | 6.830000e+02 | 683.000000 | 683.000000 | 683.000000 | 683.000000 | 683.000000 | 683.000000 | 683.000000 | 683.000000 | 683.000000 |
| mean | 1.076720e+06 | 4.442167 | 3.150805 | 3.215227 | 2.830161 | 3.234261 | 3.445095 | 2.869693 | 1.603221 | 2.699854 |
| std | 6.206440e+05 | 2.820761 | 3.065145 | 2.988581 | 2.864562 | 2.223085 | 2.449697 | 3.052666 | 1.732674 | 0.954592 |
| min | 6.337500e+04 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 2.000000 |
| 25% | 8.776170e+05 | 2.000000 | 1.000000 | 1.000000 | 1.000000 | 2.000000 | 2.000000 | 1.000000 | 1.000000 | 2.000000 |
| 50% | 1.171795e+06 | 4.000000 | 1.000000 | 1.000000 | 1.000000 | 2.000000 | 3.000000 | 1.000000 | 1.000000 | 2.000000 |
| 75% | 1.238705e+06 | 6.000000 | 5.000000 | 5.000000 | 4.000000 | 4.000000 | 5.000000 | 4.000000 | 1.000000 | 4.000000 |
| max | 1.345435e+07 | 10.000000 | 10.000000 | 10.000000 | 10.000000 | 10.000000 | 10.000000 | 10.000000 | 10.000000 | 4.000000 |
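`Bare Nuclei` is missing from the summary above because, even after replacing `'?'`, the column still holds strings, and `describe()` only summarizes numeric columns by default. If you want it included, a small optional step (a sketch, assuming the cleaned `data` from above) is to cast it to integers:

```python
# Cast the still-string 'Bare Nuclei' column to integers so it shows up
# in describe() and other numeric summaries
data['Bare Nuclei'] = data['Bare Nuclei'].astype(int)
data.describe()
```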
```python
data.head()
```

|   | Sample code number | Clump Thickness | Uniformity of Cell Size | Uniformity of Cell Shape | Marginal Adhesion | Single Epithelial Cell Size | Bare Nuclei | Bland Chromatin | Normal Nucleoli | Mitoses | Class |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1000025 | 5 | 1 | 1 | 1 | 2 | 1 | 3 | 1 | 1 | 2 |
| 1 | 1002945 | 5 | 4 | 4 | 5 | 7 | 10 | 3 | 2 | 1 | 2 |
| 2 | 1015425 | 3 | 1 | 1 | 1 | 2 | 2 | 3 | 1 | 1 | 2 |
| 3 | 1016277 | 6 | 8 | 8 | 1 | 3 | 4 | 3 | 7 | 1 | 2 |
| 4 | 1017023 | 4 | 1 | 1 | 3 | 2 | 1 | 3 | 1 | 1 | 2 |
```python
# Features: every column except the sample ID (first) and the Class label (last)
x = data.iloc[:, 1:-1]
x.head()
```

|   | Clump Thickness | Uniformity of Cell Size | Uniformity of Cell Shape | Marginal Adhesion | Single Epithelial Cell Size | Bare Nuclei | Bland Chromatin | Normal Nucleoli | Mitoses |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 5 | 1 | 1 | 1 | 2 | 1 | 3 | 1 | 1 |
| 1 | 5 | 4 | 4 | 5 | 7 | 10 | 3 | 2 | 1 |
| 2 | 3 | 1 | 1 | 1 | 2 | 2 | 3 | 1 | 1 |
| 3 | 6 | 8 | 8 | 1 | 3 | 4 | 3 | 7 | 1 |
| 4 | 4 | 1 | 1 | 3 | 2 | 1 | 3 | 1 | 1 |
```python
# Target: the Class column (2 = benign, 4 = malignant)
y = data['Class']
y.head()
```

```
0    2
1    2
2    2
3    2
4    2
Name: Class, dtype: int64
```
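Here the label is encoded as 2 for benign and 4 for malignant samples. As a quick optional check of the class balance (a sketch using the `y` Series defined above):

```python
# Count how many benign (2) and malignant (4) samples remain after cleaning
print(y.value_counts())
```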
```python
# Split into training and test sets (20% held out for testing)
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=2, test_size=0.2)

# Standardize the features: fit the scaler on the training data only,
# then apply the same transformation to the test data
transfer = StandardScaler()
x_train = transfer.fit_transform(x_train)
x_test = transfer.transform(x_test)

# Train a logistic regression classifier
estimator = LogisticRegression()
estimator.fit(x_train, y_train)
```
```
LogisticRegression()
```
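The fitted model can also be inspected directly. As an illustrative sketch (standard scikit-learn attributes, applied to the `estimator` and `x` defined above), the learned weight of each standardized feature and the intercept are:

```python
# One coefficient per feature (in the order of x's columns) plus an intercept
for name, coef in zip(x.columns, estimator.coef_[0]):
    print(f'{name}: {coef:.3f}')
print('intercept:', estimator.intercept_[0])
```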
```python
# Predict on the held-out test set
y_pre = estimator.predict(x_test)
print('Predicted values:\n', y_pre)

# Mean accuracy on the test set
score = estimator.score(x_test, y_test)
print('Accuracy:\n', score)
```
```
Predicted values:
 [4 4 2 4 2 2 2 2 2 4 2 2 4 2 2 2 4 2 2 2 2 2 4 4 4 2 2 4 2 4 4 4 4 2 4 2 4
 2 2 4 2 2 4 2 2 4 2 2 4 2 2 2 4 2 2 2 2 2 4 4 2 2 2 2 2 2 2 4 4 4 2 2 2 2
 2 2 2 2 4 4 4 4 4 2 4 4 4 2 4 2 2 4 4 4 2 4 2 4 2 2 4 2 2 2 2 4 4 2 2 2 2
 2 4 4 2 2 4 2 2 2 2 2 2 4 2 2 4 4 4 4 2 4 2 4 2 2 2]
Accuracy:
 0.9343065693430657
```
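Accuracy alone can look flattering when benign samples dominate, so a more detailed evaluation is often added. The block below is only a sketch of one possible follow-up (standard scikit-learn metrics applied to the `y_test` and `y_pre` values above), not part of the original case:

```python
from sklearn.metrics import classification_report, confusion_matrix

# Per-class precision/recall; in a cancer-screening setting, recall on the
# malignant class (label 4) is usually the number to watch.
print(confusion_matrix(y_test, y_pre))
print(classification_report(y_test, y_pre, labels=[2, 4],
                            target_names=['benign', 'malignant']))
```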
Summary

That is the complete logistic regression case study collected here by 生活随笔; hopefully it helps you solve the problems you have run into.