OOF in Machine Learning
The answer in [1] reads:
OOF simply stands for "Out-of-fold" and refers to a step in the learning process when using k-fold validation in which the predictions from each set of folds are grouped together into one group of 1000 predictions. These predictions are now "out-of-the-folds" and thus error can be calculated on these to get a good measure of how good your model is.
In terms of learning more about it, there's really not a ton more to it than that, and it certainly isn't its own technique to learning or anything. If you have a follow up question that is small, please leave a comment and I will try and update my answer to include this.
EDIT: While ambling around the inter-webs I stumbled upon this [2] relatively similar question from Cross-Validated (with a slightly more detailed answer), perhaps it will add some intuition if you are still confused.
The answer in [2] reads:
When training on each fold (90%) of the data, you will then predict on the remaining 10%. With this 10% you will compute an error metric (RMSE, for example). This leaves you with: 10 values for RMSE, and 10 sets of corresponding predictions. There are 2 things to do with these results:
1. Inspect the mean and standard deviation of your 10 RMSE values. k-fold takes random partitions of your data, and the error on each fold should not vary too greatly. If it does, your model (and its features, hyper-parameters etc.) cannot be expected to yield stable predictions on a test set.
2. Aggregate your 10 sets of predictions into 1 set of predictions. For example, if your training set contains 1,000 data points, you will have 10 sets of 100 predictions (10*100 = 1000). When you stack these into 1 vector, you are now left with 1000 predictions: 1 for every observation in your original training set. These are called out-of-folds predictions. With these, you can compute the RMSE for your whole training set in one go, as rmse = compute_rmse(oof_predictions, y_train). This is likely the cleanest way to evaluate the final predictor.
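The two steps can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the data is synthetic, a one-variable least-squares line stands in for whatever model is actually being validated, and fit_line and oof_predict are hypothetical helper names. compute_rmse is the name used in the quoted answer, but its implementation here is an assumption.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (a stand-in for any model)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def compute_rmse(pred, true):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))


def oof_predict(x, y, k=10, seed=0):
    """Return (oof_predictions, per_fold_rmse) from k-fold cross-validation."""
    n = len(x)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)          # random partition of the rows
    folds = [idx[i::k] for i in range(k)]     # k disjoint held-out sets
    oof = [None] * n                          # one slot per training row
    fold_rmse = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Train on the other k-1 folds only.
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        # Predict the rows this model has never seen.
        for i in held_out:
            oof[i] = a * x[i] + b
        fold_rmse.append(
            compute_rmse([oof[i] for i in held_out], [y[i] for i in held_out])
        )
    return oof, fold_rmse


# Synthetic training set of 1,000 rows (assumed data, for illustration only).
rng = random.Random(42)
x_train = [rng.uniform(0, 10) for _ in range(1000)]
y_train = [3.0 * x + 1.0 + rng.gauss(0, 1) for x in x_train]

oof_predictions, fold_rmse = oof_predict(x_train, y_train, k=10)

# Step 1: spread of the 10 per-fold RMSE values.
print("fold RMSE mean:", statistics.mean(fold_rmse))
print("fold RMSE std: ", statistics.stdev(fold_rmse))

# Step 2: one RMSE over all 1,000 out-of-fold predictions.
rmse = compute_rmse(oof_predictions, y_train)
print("OOF RMSE:", rmse)
```

Each row's prediction comes from the one model that did not train on it, so the stacked vector has exactly one entry per training row. In practice a library routine such as scikit-learn's cross_val_predict collects out-of-fold predictions in the same way; the hand-written loop is here only to make the bookkeeping visible.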
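In one sentence: suppose you run 10-fold validation on a training set of 1,000 rows.
10-fold CV gives 10 models. Each model is trained on 900 rows of the training set and predicts the remaining 100 rows. The predictions that the 10 models make on their own held-out 100 rows are what is called the OOF predictions.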
[1] https://stackoverflow.com/questions/52396191/what-is-oof-approach-in-machine-learning
[2] https://stats.stackexchange.com/questions/161491/how-to-evaluate-the-final-model-after-k-fold-cross-validation