Python sklearn: A Detailed Guide to LabelEncoder, Covering Its Introduction (Encoding and Decoding), Usage, and Worked Examples
Contents
Introduction to LabelEncoder (encoding and decoding)
Methods
Using LabelEncoder
LabelEncoder examples
1. Basic example
2. Applying LabelEncoder when data is missing and the test data contains new values (not seen in the train data)
Introduction to LabelEncoder (encoding and decoding)
class LabelEncoder found at: sklearn.preprocessing._label

    class LabelEncoder(TransformerMixin, BaseEstimator)

Encode target labels with values between 0 and n_classes-1. This transformer should be used to encode target values, i.e. y, and not the input X. Read more in the scikit-learn User Guide.

.. versionadded:: 0.12

Attributes
classes_ : array of shape (n_classes,). Holds the label for each class.

See also
sklearn.preprocessing.OrdinalEncoder : Encode categorical features using an ordinal encoding scheme.
sklearn.preprocessing.OneHotEncoder : Encode categorical features as a one-hot numeric array.
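The docstring's distinction between target values y and input features X can be illustrated with a small sketch. The sample data below is made up for illustration; it only assumes that LabelEncoder takes a 1-D label sequence while OrdinalEncoder takes a 2-D feature array.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OrdinalEncoder

# Target labels: a 1-D sequence, so LabelEncoder applies directly.
y = ["paris", "tokyo", "paris", "amsterdam"]
le = LabelEncoder()
y_encoded = le.fit_transform(y)

# Input features: a 2-D array (samples x features), handled by OrdinalEncoder,
# which learns one sorted category list per column.
X = np.array([["CEO", "north"], ["CTO", "south"], ["CEO", "south"]], dtype=object)
oe = OrdinalEncoder()
X_encoded = oe.fit_transform(X)

print(y_encoded)
print(X_encoded)
```

LabelEncoder returns integer codes for the single label column, whereas OrdinalEncoder returns a float array with one encoded column per input feature.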
Methods
| fit(y) | Fit label encoder |
| fit_transform(y) | Fit label encoder and return encoded labels |
| get_params([deep]) | Get parameters for this estimator. |
| inverse_transform(y) | Transform labels back to original encoding. |
| set_params(**params) | Set the parameters of this estimator. |
| transform(y) | Transform labels to normalized encoding. |
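As a minimal sketch of how the methods in the table fit together (the label values are invented for illustration):

```python
from sklearn.preprocessing import LabelEncoder

labels = ["dog", "cat", "dog", "bird"]

# fit() learns the sorted set of classes; transform() maps labels to codes.
le = LabelEncoder()
le.fit(labels)
codes = le.transform(labels)

# fit_transform() does both steps in one call and gives the same codes.
codes_one_step = LabelEncoder().fit_transform(labels)

# inverse_transform() maps integer codes back to the original labels.
restored = le.inverse_transform(codes)

# LabelEncoder takes no constructor arguments, so get_params() is empty.
params = le.get_params()

print(list(le.classes_), list(codes), list(restored), params)
```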
Using LabelEncoder

    import pandas as pd
    from sklearn.preprocessing import LabelEncoder
    # dataframe_fillAnyNull and Dataframe2LabelEncoder come from the DataScienceNYY
    # package; their implementations are not shown in this article.
    from DataScienceNYY.DataAnalysis import dataframe_fillAnyNull, Dataframe2LabelEncoder

    # Build the data
    train_data_dict = {'Name': ['張三', '李四', '王五', '趙六', '張七', '李八', '王十', 'un'],
                       'Age': [22, 23, 24, 25, 22, 22, 22, None],
                       'District': ['北京', '上海', '廣東', '深圳', '山東', '河南', '浙江', ' '],
                       'Job': ['CEO', 'CTO', 'CFO', 'COO', 'CEO', 'CTO', 'CEO', '']}
    test_data_dict = {'Name': ['張三', '李四', '王十一', None],
                      'Age': [22, 23, 22, 'un'],
                      'District': ['北京', '上海', '廣東', ''],
                      'Job': ['CEO', 'CTO', 'UFO', ' ']}
    train_data_df = pd.DataFrame(train_data_dict)
    test_data_df = pd.DataFrame(test_data_dict)
    print(train_data_df, '\n', test_data_df)

    # Fill in the missing data
    for col in train_data_df.columns:
        train_data_df[col] = dataframe_fillAnyNull(train_data_df, col)
        test_data_df[col] = dataframe_fillAnyNull(test_data_df, col)
    print(train_data_df, '\n', test_data_df)

    # Apply LabelEncoder to the data
    train_data, test_data = Dataframe2LabelEncoder(train_data_df, test_data_df)
    print(train_data, '\n', test_data)
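The fill step above relies on a helper whose source is not given. The sketch below shows one way such a step could work, using a hypothetical function fill_any_null. The name, the "Unknown" fill value, and the rule that blank strings count as missing are all assumptions here, not the behaviour of dataframe_fillAnyNull.

```python
import pandas as pd


def fill_any_null(df, col, fill_value="Unknown"):
    """Hypothetical stand-in: return df[col] as text with gaps filled."""
    series = df[col]
    # Treat None/NaN as missing.
    missing = series.isna()
    # Also treat empty or whitespace-only strings as missing.
    as_text = series.astype(str).str.strip()
    missing = missing | (as_text == "")
    return as_text.mask(missing, fill_value)


df = pd.DataFrame({
    "Name": ["Zhang San", None, "Li Si"],
    "Job": ["CEO", "", " "],
})
for col in df.columns:
    df[col] = fill_any_null(df, col)

print(df)
```

Filling gaps with an explicit placeholder string gives LabelEncoder a uniform column of comparable values, which it needs in order to sort the classes.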
LabelEncoder examples
1. Basic example
LabelEncoder can be used to normalize labels.

    >>> from sklearn import preprocessing
    >>> le = preprocessing.LabelEncoder()
    >>> le.fit([1, 2, 2, 6])
    LabelEncoder()
    >>> le.classes_
    array([1, 2, 6])
    >>> le.transform([1, 1, 2, 6])
    array([0, 0, 1, 2]...)
    >>> le.inverse_transform([0, 0, 1, 2])
    array([1, 1, 2, 6])

It can also be used to transform non-numerical labels (as long as they are hashable and comparable) to numerical labels.

    >>> le = preprocessing.LabelEncoder()
    >>> le.fit(["paris", "paris", "tokyo", "amsterdam"])
    LabelEncoder()
    >>> list(le.classes_)
    ['amsterdam', 'paris', 'tokyo']
    >>> le.transform(["tokyo", "tokyo", "paris"])
    array([2, 2, 1]...)
    >>> list(le.inverse_transform([2, 2, 1]))
    ['tokyo', 'tokyo', 'paris']
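The city example can be unpacked a little further. The sketch below makes the label-to-code mapping explicit and shows what happens when transform() meets a label that was absent during fit(); the label "berlin" is an invented example.

```python
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
le.fit(["paris", "paris", "tokyo", "amsterdam"])

# Each class is encoded as its index in the sorted classes_ array.
mapping = {str(label): code for code, label in enumerate(le.classes_)}
print(mapping)

# A label that was not present during fit() cannot be encoded.
try:
    le.transform(["berlin"])
    unseen_failed = False
except ValueError as err:
    unseen_failed = True
    print("transform failed:", err)
```

This failure on unseen labels is why test data containing new values needs extra handling before it is transformed.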
2. Applying LabelEncoder when data is missing and the test data contains new values (not seen in the train data)
Reference article: Python sklearn: Using LabelEncoder, the necessary steps before applying LabelEncoder

    import numpy as np
    from sklearn.preprocessing import LabelEncoder

    # train_df and test_df are pandas DataFrames; col is the name of the column to encode.
    # Fit on the train data
    LE = LabelEncoder()
    LE.fit(train_df[col])

    # Map test values not seen in train to 'Unknown', then add 'Unknown' to LE.classes_
    test_df[col] = test_df[col].map(lambda s: 'Unknown' if s not in LE.classes_ else s)
    LE.classes_ = np.append(LE.classes_, 'Unknown')

    # Transform the train and test data separately
    train_df[col] = LE.transform(train_df[col])
    test_df[col] = LE.transform(test_df[col])
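The snippet above is a fragment, so here is a self-contained variant on toy data. It is a sketch rather than the article's exact method: instead of appending to classes_ after fitting, it fits on the train values plus the sentinel, which keeps classes_ in sorted order. The DataFrames, the column name "Job", and the sentinel name UNKNOWN are illustrative assumptions.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

UNKNOWN = "Unknown"  # sentinel label; the name is an arbitrary choice

train_df = pd.DataFrame({"Job": ["CEO", "CTO", "CFO", "COO", "CEO"]})
test_df = pd.DataFrame({"Job": ["CEO", "CTO", "UFO"]})

col = "Job"
le = LabelEncoder()
# Fit on the train values plus the sentinel so classes_ stays sorted.
le.fit(list(train_df[col]) + [UNKNOWN])

# Replace test values that were not seen during fit with the sentinel.
known = set(le.classes_.tolist())
test_df[col] = test_df[col].map(lambda s: s if s in known else UNKNOWN)

train_codes = le.transform(train_df[col])
test_codes = le.transform(test_df[col])
print(list(le.classes_))
print(list(train_codes), list(test_codes))
```

Fitting with the sentinel avoids editing a fitted attribute by hand, which may matter if a given scikit-learn version assumes classes_ is sorted when it looks up codes.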