[Data Competition] A Kaggle Trick: Using the Sigmoid Function for Regression Problems!
Authors: 塵沙櫻落, 杰少
Designing a Sigmoid-Based Regression Loss
Background
This is a very interesting loss design. Whenever your problem is a regression problem, it is worth trying. It is not guaranteed to work on every dataset, but on some problems it can bring a large improvement, and at worst it gives you one more model to use in a later stacking stage.
The scheme was proposed by the data scientist danzel, who describes why it works as follows:
I used a sigmoid-output and scaled its range afterwards (to look like the target). Training like this helps the model to converge faster and gives better results.
Design Idea
Suppose our regression problem is to minimize the squared loss, and the label of the $i$-th sample is $y_i$,

$$L = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2,$$

where $N$ is the number of samples and $\hat{y}_i$ is the model's prediction.
1. Baseline Loss
The output layer usually takes the form Dense(1, activation='linear'), i.e. the prediction $\hat{y}_i$ is the raw, unconstrained output of the network, trained directly with the squared loss above.
2. Sigmoid-based Loss
The output takes the form Dense(1, activation='sigmoid') * (max_val - min_val) + min_val, i.e.

$$\hat{y}_i = \sigma(z_i)\,(\text{max\_val} - \text{min\_val}) + \text{min\_val},$$

where $z_i$ is the network's raw output and max_val, min_val are the maximum and minimum of the target, so every prediction is constrained to the observed target range.
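As a concrete illustration of the two output heads, here is a minimal sketch (not the exact model used in the case study below; the feature count, the single hidden layer, and the min_val / max_val constants are placeholders chosen for the example). The only difference between the two variants is the last layer:

```python
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

n_features = 14                      # placeholder feature count
min_val, max_val = 0.0, 10.0         # placeholders; in practice use the training target's min/max

inputs = Input(shape=(n_features,))
hidden = Dense(64, activation='relu')(inputs)

# 1. Baseline head: unconstrained linear output
out_linear = Dense(1, activation='linear')(hidden)

# 2. Sigmoid head: output squashed to (0, 1), then rescaled to the target range
out_sigmoid = Dense(1, activation='sigmoid')(hidden) * (max_val - min_val) + min_val

baseline_model = Model(inputs, out_linear)
sigmoid_model  = Model(inputs, out_sigmoid)

baseline_model.compile(optimizer='adam', loss='mse')
sigmoid_model.compile(optimizer='adam', loss='mse')
```

Because the sigmoid head can never predict outside [min_val, max_val], the model does not waste capacity learning the target's scale, which is the intuition behind danzel's remark about faster convergence.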
Case Study
Is the claim above actually reliable? Let's run an experiment on Kaggle data and see for ourselves. Interested readers can download the data from the links at the end of this article.
1. Importing Packages
1.1 Import the packages used
```python
import numpy                     as np
import pandas                    as pd
import seaborn                   as sns
import tensorflow                as tf
import xgboost                   as xgb
from   tqdm                      import tqdm
from   lightgbm                  import LGBMRegressor
from   sklearn.metrics           import mean_squared_error
from   sklearn.model_selection   import KFold

# RMSE is used as the training loss of the Keras models below
def RMSE(y_true, y_pred):
    return tf.sqrt(tf.reduce_mean(tf.square(y_true - y_pred)))
```

1.2 Load the data
```python
train = pd.read_csv('./data/train.csv')
test  = pd.read_csv('./data/test.csv')
sub   = pd.read_csv('./data/sample_submission.csv')
```

2. Data Preprocessing
2.1 Concatenate the data
```python
train_test = pd.concat([train, test], axis=0, ignore_index=True)
train_test.head()
```

| id | cont1 | cont2 | cont3 | cont4 | cont5 | cont6 | cont7 | cont8 | cont9 | cont10 | cont11 | cont12 | cont13 | cont14 | target |
|----|-------|-------|-------|-------|-------|-------|-------|-------|-------|--------|--------|--------|--------|--------|--------|
| 1 | 0.670390 | 0.811300 | 0.643968 | 0.291791 | 0.284117 | 0.855953 | 0.890700 | 0.285542 | 0.558245 | 0.779418 | 0.921832 | 0.866772 | 0.878733 | 0.305411 | 7.243043 |
| 3 | 0.388053 | 0.621104 | 0.686102 | 0.501149 | 0.643790 | 0.449805 | 0.510824 | 0.580748 | 0.418335 | 0.432632 | 0.439872 | 0.434971 | 0.369957 | 0.369484 | 8.203331 |
| 4 | 0.834950 | 0.227436 | 0.301584 | 0.293408 | 0.606839 | 0.829175 | 0.506143 | 0.558771 | 0.587603 | 0.823312 | 0.567007 | 0.677708 | 0.882938 | 0.303047 | 7.776091 |
| 5 | 0.820708 | 0.160155 | 0.546887 | 0.726104 | 0.282444 | 0.785108 | 0.752758 | 0.823267 | 0.574466 | 0.580843 | 0.769594 | 0.818143 | 0.914281 | 0.279528 | 6.957716 |
| 8 | 0.935278 | 0.421235 | 0.303801 | 0.880214 | 0.665610 | 0.830131 | 0.487113 | 0.604157 | 0.874658 | 0.863427 | 0.983575 | 0.900464 | 0.935918 | 0.435772 | 7.951046 |
2.2 GaussRank preprocessing for neural networks
If you want to know the details, see the earlier post on RankGaussian.
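The GaussRankScaler used in the next subsection comes from that earlier post and is not re-defined here. For readers who only need the gist, the following is a minimal sketch of the idea rather than the author's implementation (the class name SimpleGaussRankScaler, the epsilon parameter, and the omission of the interp_kind option are all assumptions): each feature's values are replaced by their ranks, the ranks are rescaled into (-1, 1), and the result is pushed through the inverse error function so the transformed feature is approximately Gaussian.

```python
import numpy as np
from scipy.special import erfinv
from sklearn.base import BaseEstimator, TransformerMixin

class SimpleGaussRankScaler(BaseEstimator, TransformerMixin):
    """Minimal Gauss-rank transform sketch: rank -> (-1, 1) -> erfinv (hypothetical)."""

    def __init__(self, epsilon=1e-6):
        self.epsilon = epsilon        # keeps scaled ranks strictly inside (-1, 1)

    def fit(self, X, y=None):
        return self                   # stateless: everything happens in transform

    def transform(self, X):
        X   = np.asarray(X, dtype=np.float64)
        out = np.zeros_like(X)
        for j in range(X.shape[1]):
            ranks  = np.argsort(np.argsort(X[:, j]))          # ranks 0 .. n-1
            scaled = ranks / ranks.max() * 2.0 - 1.0          # map to [-1, 1]
            scaled = np.clip(scaled, -1 + self.epsilon, 1 - self.epsilon)
            out[:, j] = erfinv(scaled)                        # approximately standard normal
        return out
```

Neural networks tend to train more smoothly when inputs are roughly Gaussian, which is why this transform is applied before the MLP below.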
2.3 Applying RankGaussian
```python
feature_names = ['cont1', 'cont2', 'cont3', 'cont4', 'cont5', 'cont6', 'cont7',
                 'cont8', 'cont9', 'cont10', 'cont11', 'cont12', 'cont13', 'cont14']

scaler_linear = GaussRankScaler(interp_kind='linear')
for c in feature_names:
    train_test[c + '_linear_grank'] = scaler_linear.fit_transform(train_test[c].values.reshape(-1, 1))

gaussian_linear_feature_names = [c + '_linear_grank' for c in feature_names]
```

3. Building the NN Model
```python
import os
import tensorflow as tf
# import tensorflow_addons as tfa
import tensorflow.keras.backend as K
from tensorflow.keras import regularizers
from tensorflow.keras.layers import *
from tensorflow.keras.models import *
from tensorflow.keras.optimizers import *
from tensorflow.keras.callbacks import *
from tensorflow.keras.layers import Input
from sklearn.model_selection import KFold, StratifiedKFold
```

3.1 Train/validation split
Randomly split the data into a training set and a validation set.
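The split code itself did not survive in this copy of the post. A minimal sketch consistent with the variable names used later and with the 240,000 / 60,000 sample counts visible in the training log (the 80/20 ratio and the random_state value are assumptions) might look like this:

```python
from sklearn.model_selection import train_test_split

# Rows that have a target come from train.csv; the appended test rows have NaN targets.
train_part = train_test[train_test['target'].notnull()]

X_tr_dnn_linear_gaussian, X_val_dnn_linear_gaussian, y_tr, y_val = train_test_split(
    train_part[gaussian_linear_feature_names],   # GaussRank-transformed features
    train_part['target'],
    test_size=0.2,                               # 240,000 training / 60,000 validation rows
    random_state=2021,                           # hypothetical seed
)
```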
3.2 MLP model (sigmoid): 0.7108
Regression with a sigmoid output.
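The code for this variant did not survive in this copy of the article. A minimal sketch of what it looks like, mirroring the linear model in section 3.3 and the scaled-sigmoid head from the design section (estimating min_val and max_val from the training labels is an assumption), is:

```python
import tensorflow as tf
from tensorflow.keras.layers import Dense, Dropout

# Hypothetical reconstruction of the sigmoid-output MLP; only the output head
# differs from the linear model in section 3.3.
min_val = y_tr.min()     # assumed: target range estimated from the training labels
max_val = y_tr.max()

class MLP_Sigmoid_Model(tf.keras.Model):
    def __init__(self):
        super(MLP_Sigmoid_Model, self).__init__()
        self.dense1    = Dense(1000, activation='relu')
        self.drop1     = Dropout(0.25)
        self.dense2    = Dense(500, activation='relu')
        self.drop2     = Dropout(0.25)
        self.dense_out = Dense(1, activation='sigmoid')

    def call(self, inputs):
        x1 = self.dense1(inputs)
        x1 = self.drop1(x1)
        x2 = self.dense2(x1)
        x2 = self.drop2(x2)
        # scaled sigmoid: every prediction falls inside [min_val, max_val]
        return self.dense_out(x2) * (max_val - min_val) + min_val
```

Training then proceeds exactly as for the linear model below (same RMSE loss, callbacks, and fit arguments); with this head the validation score improves from 0.7137 to the 0.7108 reported in the heading.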
3.3 MLP model (linear): 0.7137
```python
class MLP_Model(tf.keras.Model):
    def __init__(self):
        super(MLP_Model, self).__init__()
        self.dense1    = Dense(1000, activation='relu')
        self.drop1     = Dropout(0.25)
        self.dense2    = Dense(500, activation='relu')
        self.drop2     = Dropout(0.25)
        self.dense_out = Dense(1)

    def call(self, inputs):
        x1      = self.dense1(inputs)
        x1      = self.drop1(x1)
        x2      = self.dense2(x1)
        x2      = self.drop2(x2)
        outputs = self.dense_out(x2)
        return outputs

model = MLP_Model()
adam  = tf.optimizers.Adam(lr=1e-3)
model.compile(optimizer=adam, loss=RMSE)

K.clear_session()
model_weights  = f'./models/model_gauss_mlp_mlp.h5'
checkpoint     = ModelCheckpoint(model_weights, monitor='loss', verbose=0,
                                 save_best_only=True, mode='min', save_weights_only=True)
plateau        = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=10,
                                   verbose=1, min_delta=1e-4, mode='min')
early_stopping = EarlyStopping(monitor="val_loss", patience=25)

history = model.fit(X_tr_dnn_linear_gaussian.values, y_tr.values,
                    validation_data=(X_val_dnn_linear_gaussian.values, y_val.values),
                    batch_size=1024, epochs=100,
                    callbacks=[plateau, checkpoint, early_stopping],
                    verbose=2)
```

Training log (abridged; AutoGraph warnings and the ReduceLROnPlateau messages are omitted):

```
Train on 240000 samples, validate on 60000 samples
Epoch 1/100
240000/240000 - 1s - loss: 1.3292 - val_loss: 0.7767
Epoch 2/100
240000/240000 - 0s - loss: 0.8163 - val_loss: 0.7251
...
Epoch 99/100
240000/240000 - 0s - loss: 0.7463 - val_loss: 0.7139
Epoch 100/100
240000/240000 - 0s - loss: 0.7461 - val_loss: 0.7137
```

References
https://www.kaggle.com/c/tabular-playground-series-jan-2021/data
https://www.kaggle.com/c/tabular-playground-series-jan-2021/discussion/216037