One-Step Predictions with LSTM: Forecasting Hotel Revenues
Note: This is an update to my previous article, Forecasting Average Daily Rate Trends for Hotels Using LSTM. I have since recognised a couple of technical errors in the original analysis, and decided to write a new article to address these and expand my prior analysis.
Background
The purpose of using an LSTM model in this instance is to forecast ADR (average daily rate) for a hotel.
ADR is calculated as follows:
ADR = Revenue ÷ Rooms Sold

In this example, the average ADR across customers is calculated per week and formulated into a time series. The LSTM model is then used to forecast this metric on a week-by-week basis.
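For instance, a hotel that takes in $50,000 of room revenue in a week in which 500 room-nights are sold has a weekly ADR of $100 (illustrative figures only).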
The original study by Antonio, Almeida and Nunes (2016) can be found here.
Using pandas, the average ADR is calculated per week. Here is a plot of the weekly ADR trend.
Source: Jupyter Notebook Output

Note that the Jupyter Notebook for this example is available at the end of this article.
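For reference, here is a minimal sketch of how such a weekly ADR series could be built with pandas. The booking-level DataFrame below is synthetic stand-in data, and the column names are illustrative rather than taken from the original notebook:

import numpy as np
import pandas as pd

# Synthetic booking-level data standing in for the hotel dataset:
# one row per booking, with an arrival date and an ADR value.
rng = np.random.default_rng(0)
dates = pd.date_range('2016-01-01', periods=700, freq='D')
bookings = pd.DataFrame({
    'arrival_date': rng.choice(dates, size=5000),
    'ADR': rng.gamma(shape=9.0, scale=10.0, size=5000),
})

# Average ADR per calendar week: this weekly series is what the LSTM forecasts.
weekly_adr = bookings.set_index('arrival_date')['ADR'].resample('W').mean()
print(weekly_adr.head())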
Data Preparation
1. Normalizing data with MinMaxScaler
As with any neural network, the data needs to be scaled for proper interpretation by the network, a process known as normalization. MinMaxScaler is used for this purpose.
However, this comes with a caveat. Scaling must be done after the data has been split into training, validation and test sets, with each set being scaled separately. A common mistake when first using an LSTM (I made this mistake myself) is to normalize the data before splitting it.
This is erroneous because the normalization technique would use data from the validation and test sets as a reference point when scaling the data as a whole. This inadvertently influences the values of the training data, essentially resulting in data leakage from the validation and test sets.
In this regard, 100 data points are split into training and validation sets, with the last 15 data points in the series held out as test data for comparison with the LSTM predictions.
train_size = int(len(df) * 0.8)
val_size = len(df) - train_size
train, val = df[0:train_size,:], df[train_size:len(df),:]
A dataset matrix is formed:
def create_dataset(df, previous=1):
    # Build supervised learning pairs: each input is a window of
    # `previous` consecutive values, each target is the next value.
    dataX, dataY = [], []
    for i in range(len(df)-previous-1):
        a = df[i:(i+previous), 0]
        dataX.append(a)
        dataY.append(df[i + previous, 0])
    return np.array(dataX), np.array(dataY)
At this point, the training data can be scaled as follows:
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler(feature_range=(0, 1))
train = scaler.fit_transform(train)
train
Here is a sample of the output:
array([[0.35915778],
       [0.42256282],
       [0.53159902],
       ...
       [0.0236608 ],
       [0.11987636],
       [0.48651694]])
Similarly, the validation data is scaled in the same way:
val = scaler.fit_transform(val)
val
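As a design note: an equally common pattern, and one consistent with the leakage concerns above, is to fit the scaler on the training set only and then reuse it to transform the validation and test sets, so that all splits share a single scale. A minimal sketch of that alternative (not the notebook's code):

from sklearn.preprocessing import MinMaxScaler

# Fit the scaling parameters on the training data only, then apply the
# same fitted transformation to the validation set.
scaler = MinMaxScaler(feature_range=(0, 1))
train = scaler.fit_transform(train)
val = scaler.transform(val)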
2. Define lookback period
A “l(fā)ookback period” defines how many previous timesteps are used in order to predict the subsequent timestep. In this regard, we are using a one-step prediction model.
“回溯期”定義了使用多少個先前的時間步長來預測隨后的時間步長。 在這方面,我們使用了一個單步預測模型。
The lookback period is set to 5 in this instance. This means that we are using the time steps at t-4, t-3, t-2, t-1, and t to predict the value at time t+1.
# Lookback period
lookback = 5
X_train, Y_train = create_dataset(train, lookback)
X_val, Y_val = create_dataset(val, lookback)
Note that the selection of the lookback period is a somewhat arbitrary process. In this instance, a lookback window of 5 demonstrated the best predictive performance on the test set. However, another option could be to use the number of lags indicated by PACF to set the size of the lookback window, as described at Data Science Stack Exchange.
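A hedged sketch of that PACF-based approach, assuming tseries holds the weekly ADR series used later in this article:

import numpy as np
from statsmodels.tsa.stattools import pacf

# Partial autocorrelations of the weekly ADR series; lags whose PACF value
# falls outside the approximate 95% band are candidate lookback lengths.
pacf_vals = pacf(tseries, nlags=20)
band = 1.96 / np.sqrt(len(tseries))
print([lag for lag, v in enumerate(pacf_vals) if lag > 0 and abs(v) > band])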
Let’s take a look at the normalized window for X_train.
array([[0.35915778, 0.42256282, 0.53159902, 0.6084246 , 0.63902841],
       [0.42256282, 0.53159902, 0.6084246 , 0.63902841, 0.70858066],
       [0.53159902, 0.6084246 , 0.63902841, 0.70858066, 0.75574219],
       ...
These are the first three entries. We can see that the five time steps immediately prior to the one we are trying to predict move along the series in a stepwise motion: each successive window is shifted forward by one step.
For instance, the first entry shows 0.63902841 at time t. In the second entry, this same value has moved back to position t-1.
Let’s give an example applicable to this situation. A hotel that wishes to predict the ADR value for week 26, for instance, will use this model to make the prediction in the prior week, using data for weeks 21, 22, 23, 24, and 25.
Now, the input is reshaped into a [samples, time steps, features] format.
# reshape input to be [samples, time steps, features]
X_train = np.reshape(X_train, (X_train.shape[0], 1, X_train.shape[1]))
X_val = np.reshape(X_val, (X_val.shape[0], 1, X_val.shape[1]))
In this case, the shape of the input is [74, 1, 5].

74 samples are present in the training data, the model is operating on a time step of 1, and 5 features are fed to the model at each step, i.e. the five lagged values of the time series defined by the lookback window.
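A quick sanity check on the reshaped tensors (the expected values below assume the 80/20 split of 100 weekly points with a lookback of 5):

# 74 training windows and 14 validation windows of 5 lagged values each.
print(X_train.shape)  # (74, 1, 5) -> [samples, time steps, features]
print(X_val.shape)    # (14, 1, 5)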
LSTM Modelling
An LSTM model is defined as follows:
# Generate LSTM network
import tensorflow as tf
from tensorflow.keras.layers import LSTM, Dense

model = tf.keras.Sequential()
model.add(LSTM(4, input_shape=(1, lookback)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
history = model.fit(X_train, Y_train, validation_split=0.2, epochs=100, batch_size=1, verbose=2)
An LSTM model is created with 4 units. Mean squared error is used as the loss function, given that we are dealing with a regression problem. Additionally, the adam optimizer is used, with training run for 100 epochs and a validation split of 20%.
Here is a visual overview of the training and validation loss:
# list all data in history
import matplotlib.pyplot as plt

print(history.history.keys())
# summarize history for loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'val'], loc='upper left')
plt.show()
We can see that, after an initial increase, the validation loss starts to decrease after about 10 epochs.
Now, the predictions are converted back to the original scale:
# Generate predictions (assumed step, needed before the inverse transform below)
trainpred = model.predict(X_train)
valpred = model.predict(X_val)

# Convert predictions back to normal values
trainpred = scaler.inverse_transform(trainpred)
Y_train = scaler.inverse_transform([Y_train])
valpred = scaler.inverse_transform(valpred)
Y_val = scaler.inverse_transform([Y_val])
predictions = valpred
The root mean squared error is calculated on the training and validation set:
# calculate RMSE
import math
from sklearn.metrics import mean_squared_error

trainScore = math.sqrt(mean_squared_error(Y_train[0], trainpred[:,0]))
print('Train Score: %.2f RMSE' % (trainScore))
valScore = math.sqrt(mean_squared_error(Y_val[0], valpred[:,0]))
print('Validation Score: %.2f RMSE' % (valScore))
The obtained RMSE values are as follows:
Train error: 3.88 RMSE
Validation error: 8.78 RMSE
With a mean ADR value of 69.99 across the validation set, the validation error is quite small in comparison (roughly 12% of the mean value), indicating that the model has done a good job at forecasting ADR values.
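As a quick check using the figures quoted above:

# Validation RMSE relative to the mean ADR of the validation set.
mean_val_adr = 69.99
val_rmse = 8.78
print('RMSE / mean ADR: %.1f%%' % (100 * val_rmse / mean_val_adr))  # ~12.5%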
Here is a plot of the forecasted versus actual ADR values across the training and validation set.
# Plot all predictions
# trainpredPlot and valpredPlot are assumed to hold the scaled predictions
# padded with NaNs so that they align with the full series when plotted.
inversetransform, = plt.plot(scaler.inverse_transform(df))
trainpred, = plt.plot(scaler.inverse_transform(trainpredPlot))
valpred, = plt.plot(scaler.inverse_transform(valpredPlot))
plt.xlabel('Number of weeks')
plt.ylabel('ADR')
plt.title("Predicted vs. Actual Weekly ADR")
plt.show()

Source: Jupyter Notebook Output
We can see that the LSTM model generally captures the directional oscillations of the time series. However, during periods of extreme spikes in ADR, e.g. around week 60, the model performs less well.
However, in order to fully determine whether the model has predictive power, it will now be used to predict the last 15 time steps in the series, i.e. the test data.
Xnew = np.array([tseries.iloc[95:100], tseries.iloc[96:101], tseries.iloc[97:102],
                 tseries.iloc[98:103], tseries.iloc[99:104], tseries.iloc[100:105],
                 tseries.iloc[101:106], tseries.iloc[102:107], tseries.iloc[103:108],
                 tseries.iloc[104:109], tseries.iloc[105:110], tseries.iloc[106:111],
                 tseries.iloc[107:112], tseries.iloc[108:113], tseries.iloc[109:114]])

In this example, Xnew uses the previous five time steps to predict the value at time t+1. For instance, weeks 95 to 100 are used to predict the ADR value for week 101, then weeks 96 to 101 are used to predict week 102, and so on.
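As a hedged sketch (these lines are not verbatim from the notebook), the rolling windows can then be scaled with the fitted scaler, reshaped to [samples, time steps, features], and passed through the trained model, with the outputs mapped back to the original scale:

# Scale the 15 windows of 5 lagged values, reshape for the LSTM and predict.
Xnew_scaled = scaler.transform(Xnew.reshape(-1, 1)).reshape(15, 1, lookback)
testpred = model.predict(Xnew_scaled)
testpred = scaler.inverse_transform(testpred)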
Source: Jupyter Notebook Output

The above graph illustrates the LSTM predictions versus the actual ADR values in the test set, i.e. the last 15 points in the series.
The obtained RMSE and MAE (mean absolute error) values are as follows:
MAE: -27.65
RMSE: 31.91
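For reference, a minimal sketch of how these metrics could be computed, assuming Y_test holds the actual ADR values for the final 15 weeks and testpred the corresponding predictions (note that scikit-learn's mean_absolute_error is non-negative by definition):

import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

mae = mean_absolute_error(Y_test, testpred[:, 0])
rmse = np.sqrt(mean_squared_error(Y_test, testpred[:, 0]))
print('MAE: %.2f' % mae)
print('RMSE: %.2f' % rmse)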
The RMSE for the test set is significantly higher than that for the validation set, which is to be expected since we are working with unseen data.
However, with a mean ADR value of 160 across the test set, the RMSE is approximately 20% of the mean value, indicating that the LSTM still has reasonably strong predictive power in determining the value of the next time step.
Ideally, one would like to use a significantly larger data sample to validate whether the LSTM would retain predictive power across new data. Additionally, as illustrated in this Reddit thread, LSTMs can be prone to overfitting depending on the size of the data sample.
In this regard, a larger data sample is needed to validate whether this model would work in a real-world scenario. However, the preliminary results in this case look promising.
Conclusion
In this example, you have seen:
- How to properly format data to work with an LSTM model
- How to build a one-step LSTM predictive model
- How to interpret RMSE and MAE values to determine model accuracy
Many thanks for reading, and any feedback or questions are greatly appreciated. You can find the Jupyter Notebook for this example here.
I also highly recommend this tutorial by Machine Learning Mastery, which was used as a guideline for designing the LSTM model in this example.
Disclaimer: This article is written on an “as is” basis and without warranty. It was written with the intention of providing an overview of data science concepts, and should not be interpreted as professional advice in any way.
Originally published at: https://towardsdatascience.com/one-step-predictions-with-lstm-forecasting-hotel-revenues-c9ef0d3ef2df