Study Notes on Keras: Frequently Asked Questions

Published: 2025/3/15

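Contents:
Frequently asked questions
How should I cite Keras?
How can I run Keras on a GPU?
How can I run Keras on multiple GPUs?
- Data parallelism
- Device parallelism
What do "batch", "epoch" and "sample" mean?
How can I save a Keras model?
Why is the training loss much higher than the testing loss?
How can I obtain the output of an intermediate layer?
How can I use Keras with datasets that don't fit in memory?
How can I interrupt training when the validation loss stops decreasing?
How is the validation split computed?
Is the training data shuffled during training?
How can I use stateful RNNs?
How can I "freeze" layers?
How can I remove a layer from a Sequential model?
How can I use pre-trained models in Keras?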

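This series follows the official Keras documentation.
For an introduction to Keras, see the previous post, "This is Keras".
See also "Study Notes on Keras: Some Basic Concepts".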

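Frequently Asked Questions

Keras FAQ: Frequently Asked Questions

- How should I cite Keras?
- How can I run Keras on a GPU?
- How can I run Keras on multiple GPUs?
- What do "batch", "epoch" and "sample" mean?
- How can I save a Keras model?
- Why is the training loss much higher than the testing loss?
- How can I obtain the output of an intermediate layer?
- How can I use Keras with datasets that don't fit in memory?
- How can I interrupt training when the validation loss stops decreasing?
- How is the validation split computed?
- Is the training data shuffled during training?
- How can I record the training/validation loss and accuracy after each epoch?
- How can I use stateful RNNs?
- How can I "freeze" layers?
- How can I remove a layer from a Sequential model?
- How can I use pre-trained models in Keras?
- How can I use HDF5 inputs with Keras?
- Where is the Keras configuration file stored?
- How can I obtain reproducible results while developing with Keras?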

How should I cite Keras?

If Keras has been useful in your research, please cite it in your publications. Here is an example BibTeX entry:

    @misc{chollet2015keras,
      author = {Chollet, Fran\c{c}ois and others},
      title = {Keras},
      year = {2015},
      publisher = {GitHub},
      journal = {GitHub repository},
      howpublished = {\url{https://github.com/fchollet/keras}}
    }

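How can I run Keras on a GPU?

With the TensorFlow backend, the code automatically runs on the GPU whenever one is available. With the Theano backend, use one of the following methods.

Method 1: use Theano flags

Run the Python script with the following command:

    THEANO_FLAGS=device=gpu,floatX=float32 python my_keras_script.py

Method 2: set up the .theanorc file

See the Theano configuration documentation for instructions.

Method 3: manually set theano.config.device and theano.config.floatX at the beginning of the code

    import theano
    theano.config.device = 'gpu'
    theano.config.floatX = 'float32'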

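How can I run Keras on multiple GPUs?

We recommend the TensorFlow backend when multiple GPUs are available.

There are two ways to run a single model on multiple GPUs: data parallelism and device parallelism.

In most cases, what you need is most likely data parallelism.

Data parallelism

Data parallelism replicates the target model once on each device and uses each replica to process a different part of the input data. Keras provides a built-in utility, keras.utils.multi_gpu_model, which can produce a data-parallel version of any model and supports up to 8 GPUs. See the multi_gpu_model documentation under utils. Here is an example:

    from keras.utils import multi_gpu_model

    # `model`, `x` and `y` are assumed to be defined already.
    # Replicates `model` on 8 GPUs.
    # This assumes that your machine has 8 available GPUs.
    parallel_model = multi_gpu_model(model, gpus=8)
    parallel_model.compile(loss='categorical_crossentropy',
                           optimizer='rmsprop')

    # This `fit` call will be distributed on 8 GPUs.
    # Since the batch size is 256, each GPU will process 32 samples.
    parallel_model.fit(x, y, epochs=20, batch_size=256)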
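The per-GPU split mentioned in the example comments (a batch of 256 becoming 32 samples on each of 8 GPUs) can be pictured with a small framework-free sketch. This is only an illustration of the arithmetic, not the code Keras runs internally; the helper name shard_batch is made up for this example.

```python
def shard_batch(batch, num_devices):
    """Split one batch into equal, contiguous sub-batches, one per device."""
    if len(batch) % num_devices != 0:
        # Simplifying assumption of this sketch: the batch divides evenly.
        raise ValueError("batch size must be divisible by num_devices")
    shard_size = len(batch) // num_devices
    return [batch[i * shard_size:(i + 1) * shard_size]
            for i in range(num_devices)]


# A stand-in batch of 256 sample indices, split across 8 devices.
batch = list(range(256))
shards = shard_batch(batch, num_devices=8)
print(len(shards), len(shards[0]))
```

Each replica then processes its own shard, and the results are combined into one output for the whole batch.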

Device parallelism

Device parallelism runs different parts of the same model on different devices. It works best for models with a parallel architecture, for example a model with two branches.

This can be achieved with TensorFlow device scopes. Here is an example:

    import tensorflow as tf
    import keras

    # Model where a shared LSTM is used to encode two different sequences in parallel
    input_a = keras.Input(shape=(140, 256))
    input_b = keras.Input(shape=(140, 256))

    shared_lstm = keras.layers.LSTM(64)

    # Process the first sequence on one GPU
    with tf.device('/gpu:0'):
        encoded_a = shared_lstm(input_a)
    # Process the next sequence on another GPU
    with tf.device('/gpu:1'):
        encoded_b = shared_lstm(input_b)

    # Concatenate results on CPU
    with tf.device('/cpu:0'):
        merged_vector = keras.layers.concatenate([encoded_a, encoded_b],
                                                 axis=-1)

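What do "batch", "epoch" and "sample" mean?

Below are brief explanations of some concepts you will often meet when using Keras.

- Sample: one element of a dataset, for example one image in an image dataset or one audio clip in a speech dataset.
- Batch: a group of samples that are processed together. A larger batch gives a better estimate of the gradient over the whole dataset, but each batch still produces only one update of the network parameters, so with large batches the parameters are updated less often. At evaluation or prediction time, use the largest batch size your resources allow, since this makes computation more efficient.
- Epoch: one full pass over the training set. If each batch corresponds to one update of the network, an epoch corresponds to one round of such updates. In Keras, when a validation set is specified, the model is evaluated on it at the end of every epoch. You can also use callbacks to run operations before or after each epoch, such as adjusting the learning rate or printing information about the model. See the Callbacks section for details.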
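The relationship between samples, batches and epochs comes down to simple arithmetic, sketched below. The function names and the numbers (1000 samples, batch size 32, 20 epochs) are illustrative assumptions, not values from the text.

```python
import math


def batches_per_epoch(num_samples, batch_size):
    """Number of batches (parameter updates) in one pass over the data."""
    # The last batch may be smaller than batch_size, hence the ceiling.
    return math.ceil(num_samples / batch_size)


def total_updates(num_samples, batch_size, epochs):
    """Total number of parameter updates over the whole training run."""
    return batches_per_epoch(num_samples, batch_size) * epochs


print(batches_per_epoch(1000, 32))
print(total_updates(1000, 32, epochs=20))
```

Doubling the batch size roughly halves the number of updates per epoch, which is the trade-off described in the "Batch" entry.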

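How can I save a Keras model?

We do not recommend using pickle or cPickle to save a Keras model.

You can use model.save(filepath) to save a Keras model and its weights into a single HDF5 file, which will contain:

- the architecture of the model, so that it can be re-created
- the weights of the model
- the training configuration (loss function, optimizer, etc.)
- the state of the optimizer, so that training can resume where it left off

Use keras.models.load_model(filepath) to re-instantiate your model. If the file contains a training configuration, load_model also compiles the model.

Example:

    from keras.models import load_model

    model.save('my_model.h5')  # creates a HDF5 file 'my_model.h5'
    del model  # deletes the existing model

    # returns a compiled model
    # identical to the previous one
    model = load_model('my_model.h5')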

If you only need to save the architecture of a model, without its weights or training configuration, you can use:

    # save as JSON
    json_string = model.to_json()

    # save as YAML
    yaml_string = model.to_yaml()

This serializes the model to a JSON or YAML string. The output is human-readable, and you can even edit it by hand if needed.

You can then rebuild the model from the saved JSON or YAML:

    from keras.models import model_from_json, model_from_yaml

    # model reconstruction from JSON:
    model = model_from_json(json_string)

    # model reconstruction from YAML
    model = model_from_yaml(yaml_string)

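If you need to save the weights of a model, you can write them to an HDF5 file with the code below. Make sure that HDF5 and its Python library h5py are installed first.

    model.save_weights('my_model_weights.h5')

To load the weights into a model with exactly the same architecture that you have instantiated in code, use:

    model.load_weights('my_model_weights.h5')

If you need to load weights into a different architecture that shares some layers with the original, for example for fine-tuning or transfer learning, you can load weights by layer name:

    model.load_weights('my_model_weights.h5', by_name=True)

For example:

    from keras.models import Sequential
    from keras.layers import Dense

    """
    Assume the original model looks like this:
        model = Sequential()
        model.add(Dense(2, input_dim=3, name="dense_1"))
        model.add(Dense(3, name="dense_2"))
        ...
        model.save_weights(fname)
    """

    # new model
    model = Sequential()
    model.add(Dense(2, input_dim=3, name="dense_1"))  # will be loaded
    model.add(Dense(10, name="new_dense"))  # will not be loaded

    # load weights from first model; will only affect the first layer, dense_1.
    model.load_weights(fname, by_name=True)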

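Why is the training loss much higher than the testing loss?

A Keras model has two modes: training and testing. Regularization mechanisms such as Dropout and L1/L2 weight regularization are turned off in testing mode.

In addition, the training loss is the average of the losses over each batch of training data. Because the model keeps changing during training, the loss over the first batches of an epoch is generally higher than over the last batches. The testing loss for an epoch, on the other hand, is computed with the model as it is at the end of the epoch, which results in a lower loss.

[Tip] You can define a callback that records the training loss and testing loss for each epoch and then plot them. A large gap between the two curves suggests that your model may be overfitting. This problem is, of course, not specific to Keras.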
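The averaging effect described above can be shown with a few numbers. This is a minimal sketch with made-up per-batch losses; the function name epoch_training_loss is an assumption used only for this illustration.

```python
def epoch_training_loss(batch_losses):
    """Mean of the per-batch losses, i.e. the training loss reported for an epoch."""
    return sum(batch_losses) / len(batch_losses)


# Hypothetical losses for the four batches of one epoch: the model
# improves while the epoch is still running.
batch_losses = [1.0, 0.75, 0.5, 0.25]

reported_training_loss = epoch_training_loss(batch_losses)
# The loss of the model as it stands at the end of the epoch is closer
# to the last batch loss than to the epoch average.
end_of_epoch_loss = batch_losses[-1]

print(reported_training_loss, end_of_epoch_loss)
```

Even with no regularization involved, the reported training loss (0.625 here) is higher than the loss of the final model state (0.25 here), because early, worse batches are included in the average.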

How can I obtain the output of an intermediate layer?

One simple way is to create a new Model whose output is the layer output you are interested in:

    from keras.models import Model

    model = ...  # create the original model

    layer_name = 'my_layer'
    intermediate_layer_model = Model(inputs=model.input,
                                     outputs=model.get_layer(layer_name).output)
    intermediate_output = intermediate_layer_model.predict(data)

Alternatively, you can build a Keras function that returns the output of a given layer:

    from keras import backend as K

    # with a Sequential model
    get_3rd_layer_output = K.function([model.layers[0].input],
                                      [model.layers[3].output])
    layer_output = get_3rd_layer_output([X])[0]

You can also write a Theano or TensorFlow function directly to do the same thing.

Note that if your model behaves differently in training and testing mode, for example because it contains Dropout or BatchNormalization layers, you need to pass the learning phase flag to the function:

    get_3rd_layer_output = K.function([model.layers[0].input, K.learning_phase()],
                                      [model.layers[3].output])

    # output in test mode = 0
    layer_output = get_3rd_layer_output([X, 0])[0]

    # output in train mode = 1
    layer_output = get_3rd_layer_output([X, 1])[0]

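How can I use Keras with datasets that don't fit in memory?

You can use model.train_on_batch(X, y) and model.test_on_batch(X, y). See the models documentation.

Alternatively, you can write a generator function that yields one batch of samples at a time and train with model.fit_generator(data_generator, steps_per_epoch, epochs).

This approach is demonstrated in the CIFAR10 example in the examples folder of the Keras source package, which can also be browsed on GitHub.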
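A batch generator of the kind described above might look like the following sketch. It is framework-free so it can run on its own; fake_load_chunk is a hypothetical stand-in for reading one file from disk, and the chunk and batch sizes are arbitrary.

```python
def batch_generator(load_chunk, num_chunks, batch_size):
    """Yield (inputs, targets) batches forever, holding one chunk in memory at a time."""
    while True:  # loops indefinitely; the caller decides how many batches make an epoch
        for chunk_id in range(num_chunks):
            xs, ys = load_chunk(chunk_id)
            for start in range(0, len(xs), batch_size):
                yield xs[start:start + batch_size], ys[start:start + batch_size]


def fake_load_chunk(chunk_id):
    """Hypothetical loader: pretend each chunk on disk holds 5 samples."""
    xs = [[chunk_id, i] for i in range(5)]
    ys = [chunk_id * 10 + i for i in range(5)]
    return xs, ys


gen = batch_generator(fake_load_chunk, num_chunks=2, batch_size=2)
# 2 chunks x 3 batches each (sizes 2, 2, 1) = 6 batches per epoch.
one_epoch = [next(gen) for _ in range(6)]
print(sum(len(x) for x, _ in one_epoch))
```

In this setup the number of batches per pass over the data is 6, which is the value one would hand to the training call as the number of steps per epoch.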

How can I interrupt training when the validation loss stops decreasing?

You can use the EarlyStopping callback to stop training early:

    from keras.callbacks import EarlyStopping

    early_stopping = EarlyStopping(monitor='val_loss', patience=2)
    model.fit(X, y, validation_split=0.2, callbacks=[early_stopping])

See the callbacks documentation.
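The effect of the patience argument can be illustrated without Keras. The sketch below is a simplified model of patience-based stopping, not the library's implementation; the function name and the loss values are assumptions made for the example.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the 0-based epoch at which training would stop, or None if it never stops."""
    best = float("inf")
    wait = 0  # number of consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Hypothetical validation losses: the best value is reached at epoch 2,
# then two epochs pass without beating it.
val_losses = [0.9, 0.7, 0.6, 0.65, 0.62, 0.61]
print(early_stopping_epoch(val_losses, patience=2))
```

With patience=2, training stops after the second consecutive epoch that fails to improve on the best validation loss seen so far.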

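How is the validation split computed?

If you set the validation_split argument of model.fit, the data is divided into a training set and a validation set. For example, with a value of 0.1, the last 10% of the data is used as the validation set, and likewise for other values. Note that the data is not shuffled before the validation split is taken, so the validation set is literally the last x% of the samples you passed in.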
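The "last x% of the input" behaviour amounts to a single slice. The sketch below shows the idea on a plain list; split_validation is a name chosen for this example, and the rounding rule (truncating the split index) is an assumption of the sketch.

```python
def split_validation(samples, validation_split):
    """Split samples into (training, validation), taking validation from the end."""
    split_at = int(len(samples) * (1.0 - validation_split))
    return samples[:split_at], samples[split_at:]


samples = list(range(10))  # stand-in for 10 ordered samples
train, val = split_validation(samples, validation_split=0.1)
print(train, val)
```

Because no shuffling happens first, data that is sorted (for example by class) should be shuffled by hand before training, otherwise the validation set may contain only the samples that happen to come last.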

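Is the training data shuffled during training?

Yes. If the shuffle argument of model.fit is True, which is the default, the training data is randomly shuffled, and it is reshuffled at every epoch.

The validation data is never shuffled.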
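Per-epoch shuffling can be pictured as drawing a fresh permutation of the sample indices at the start of each epoch. The following is a small illustrative sketch, with a hypothetical helper name and a fixed seed so that it is repeatable.

```python
import random


def epoch_orders(num_samples, epochs, shuffle=True, seed=0):
    """Return the order in which sample indices are visited in each epoch."""
    rng = random.Random(seed)
    orders = []
    for _ in range(epochs):
        indices = list(range(num_samples))
        if shuffle:
            rng.shuffle(indices)  # a new permutation for every epoch
        orders.append(indices)
    return orders


for order in epoch_orders(num_samples=5, epochs=3):
    print(order)
```

Every epoch still visits each sample exactly once; only the order changes. With shuffle=False the order is the same in every epoch.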

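How can I record the training/validation loss and accuracy after each epoch?

model.fit returns a History object when it finishes. Its history attribute contains the loss values and any other metrics recorded during training.

    hist = model.fit(X, y, validation_split=0.2)
    print(hist.history)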
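The history attribute is a dictionary that maps each metric name to a list with one value per epoch, so it can be inspected with ordinary Python. The values below are made up for illustration, and best_epoch is a hypothetical helper.

```python
# Hypothetical contents of hist.history after 4 epochs (made-up values).
history = {
    "loss": [0.9, 0.6, 0.4, 0.3],
    "val_loss": [0.8, 0.7, 0.65, 0.7],
}


def best_epoch(history, key="val_loss"):
    """Return the 0-based index of the epoch with the lowest value for `key`."""
    values = history[key]
    return min(range(len(values)), key=values.__getitem__)


print(best_epoch(history))
```

In this made-up run the training loss keeps falling while the validation loss is lowest at epoch 2, which is the kind of pattern worth plotting when checking for overfitting.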

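How can I use stateful RNNs?

Making an RNN stateful means that the states computed for the samples of one batch are reused as the initial states for the samples of the next batch.

When using stateful RNNs, it is assumed that:

- all batches have the same number of samples
- if X1 and X2 are successive batches, then for every i, X2[i] is the follow-up sequence to X1[i]

To use a stateful RNN, you need to:

- explicitly specify the batch size, by passing the batch_input_shape argument to the first layer of the model. batch_input_shape is a tuple of integers; for example, (32, 10, 16) describes input with 32 samples per batch, 10 timesteps per sequence and 16 features per timestep.
- set stateful=True in your RNN layer(s)

To reset the accumulated states:

- use model.reset_states() to reset the states of all layers in the model
- use layer.reset_states() to reset the states of a specific layer

Example:

    import numpy as np
    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    # X is our input data, of shape (32, 21, 16)
    # we will feed it to our model in sequences of length 10

    model = Sequential()
    model.add(LSTM(32, input_shape=(10, 16), batch_size=32, stateful=True))
    model.add(Dense(16, activation='softmax'))

    model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

    # we train the network to predict the 11th timestep given the first 10:
    model.train_on_batch(X[:, :10, :], np.reshape(X[:, 10, :], (32, 16)))

    # the state of the network has changed. We can feed the follow-up sequences:
    model.train_on_batch(X[:, 10:20, :], np.reshape(X[:, 20, :], (32, 16)))

    # let's reset the states of the LSTM layer:
    model.reset_states()

    # another way to do it in this case:
    model.layers[0].reset_states()

Note that predict, fit, train_on_batch, predict_classes and similar methods all update the states of the stateful layers in the model. This lets you do stateful prediction as well as stateful training.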
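The batch layout assumed by stateful RNNs (X2[i] continues X1[i]) is easy to get wrong, so here is a small sketch of one way to cut long sequences into consecutive batches. It uses plain lists and a hypothetical helper name, and assumes all sequences are cut to the same number of windows.

```python
def consecutive_windows(sequences, window):
    """Cut each long sequence into consecutive windows.

    Batch k holds window k of every sequence, so batches[k + 1][i]
    is the continuation of batches[k][i].
    """
    num_windows = min(len(seq) for seq in sequences) // window
    return [
        [seq[k * window:(k + 1) * window] for seq in sequences]
        for k in range(num_windows)
    ]


# Two long sequences, each cut into windows of 3 timesteps.
sequences = [list(range(0, 6)), list(range(100, 106))]
batches = consecutive_windows(sequences, window=3)
for batch in batches:
    print(batch)
```

Row i of every batch comes from the same underlying sequence, which is what allows the state carried over from one batch to the next to be meaningful.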

How can I "freeze" layers?

To "freeze" a layer means to exclude it from training, so that its weights are never updated. This is often needed when fine-tuning a model, and also when using a fixed embedding layer for text input.

You can pass the trainable argument to a layer constructor to specify whether the layer is trainable:

    frozen_layer = Dense(32, trainable=False)

You can also set the trainable attribute of a layer object to True or False after the model has been built. For the change to take effect, you need to call compile on the model afterwards. For example:

    from keras.models import Model
    from keras.layers import Input, Dense

    x = Input(shape=(32,))
    layer = Dense(32)
    layer.trainable = False
    y = layer(x)

    frozen_model = Model(x, y)
    # in the model below, the weights of `layer` will not be updated during training
    frozen_model.compile(optimizer='rmsprop', loss='mse')

    layer.trainable = True
    trainable_model = Model(x, y)
    # with this model the weights of the layer will be updated during training
    # (which will also affect the above model since it uses the same layer instance)
    trainable_model.compile(optimizer='rmsprop', loss='mse')

    frozen_model.fit(data, labels)  # this does NOT update the weights of `layer`
    trainable_model.fit(data, labels)  # this updates the weights of `layer`
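What freezing means for the optimizer can be sketched with a single gradient-descent step on scalar "weights". This is a conceptual illustration only, with made-up names and values; it does not reflect how Keras stores or updates weights internally.

```python
def sgd_step(params, grads, trainable, lr=0.5):
    """Apply one SGD update, leaving the parameters of frozen layers untouched."""
    return {
        name: (value - lr * grads[name]) if trainable[name] else value
        for name, value in params.items()
    }


# Hypothetical two-layer model with one scalar weight per layer.
params = {"embedding": 1.0, "dense": 2.0}
grads = {"embedding": 0.5, "dense": 0.5}
trainable = {"embedding": False, "dense": True}

updated = sgd_step(params, grads, trainable)
print(updated)
```

The frozen "embedding" weight keeps its value even though a gradient exists for it, while the "dense" weight moves against its gradient.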

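How can I remove a layer from a Sequential model?

You can remove the last layer of a Sequential model by calling .pop(). Calling it n times removes the last n layers.

    from keras.models import Sequential
    from keras.layers import Dense

    model = Sequential()
    model.add(Dense(32, activation='relu', input_dim=784))
    model.add(Dense(32, activation='relu'))

    print(len(model.layers))  # "2"

    model.pop()
    print(len(model.layers))  # "1"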

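How can I use pre-trained models in Keras?

Code and pre-trained weights are available for the following image classification models:

- VGG16
- VGG19
- ResNet50
- Inception v3

These models can be loaded from keras.applications:

    from keras.applications.vgg16 import VGG16
    from keras.applications.vgg19 import VGG19
    from keras.applications.resnet50 import ResNet50
    from keras.applications.inception_v3 import InceptionV3

    model = VGG16(weights='imagenet', include_top=True)

For usage examples, see the documentation of the Applications module.

For examples of using these pre-trained models for feature extraction or fine-tuning, see the related post on the Keras blog.

The VGG model is also the basis of several Keras example scripts, such as:

- Style transfer
- Feature visualization
- Deep dream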

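How can I use HDF5 inputs with Keras?

You can use the HDF5Matrix class from keras.utils to read HDF5 input. See the HDF5Matrix documentation for details.

You can also use an HDF5 dataset directly, for example:

    import h5py

    with h5py.File('input/file.hdf5', 'r') as f:
        X_data = f['X_data']
        model.predict(X_data)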

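Where is the Keras configuration file stored?

By default, all Keras data is stored in:

    $HOME/.keras/

Windows users should replace $HOME with %USERPROFILE%.

If Keras cannot create this directory (for example because of permission issues), it falls back to /tmp/.keras/.

The Keras configuration file is a JSON file stored at $HOME/.keras/keras.json. The default configuration looks like this:

    {
        "image_data_format": "channels_last",
        "epsilon": 1e-07,
        "floatx": "float32",
        "backend": "tensorflow"
    }

It contains the following fields:

- the default image data format, channels_last or channels_first
- the epsilon value used to prevent division by zero
- the default float data type
- the default backend

Likewise, cached dataset files, such as those downloaded with get_file(), are stored by default in $HOME/.keras/datasets/.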
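To check which settings are in effect on a given machine, the configuration file can be read with the standard json module. The sketch below assumes the default location and default values given above; the helper names are made up, and the fallback to defaults for a missing file is a choice of this sketch rather than a description of Keras itself.

```python
import json
import os

# Default values as listed in the text above.
DEFAULT_CONFIG = {
    "image_data_format": "channels_last",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow",
}


def keras_config_path(home=None):
    """Return the expected path of keras.json under the given home directory."""
    home = home or os.path.expanduser("~")
    return os.path.join(home, ".keras", "keras.json")


def load_keras_config(path):
    """Read keras.json, filling in defaults for a missing file or missing fields."""
    config = dict(DEFAULT_CONFIG)
    if os.path.exists(path):
        with open(path) as f:
            config.update(json.load(f))
    return config


config = load_keras_config(keras_config_path())
print(config["backend"], config["image_data_format"])
```

Editing the "backend" field in this file is the usual way to switch between backends without touching any code.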

How can I obtain reproducible results while developing with Keras?

During model development it is sometimes useful to obtain reproducible results from run to run. For example, this helps determine whether a change in performance is due to the model itself or to a change in the data. The code below shows how to obtain reproducible results. It targets the TensorFlow backend on Python 3.

    import numpy as np
    import tensorflow as tf
    import random as rn

    # The below is necessary in Python 3.2.3 onwards to
    # have reproducible behavior for certain hash-based operations.
    # See these references for further details:
    # https://docs.python.org/3.4/using/cmdline.html#envvar-PYTHONHASHSEED
    # https://github.com/fchollet/keras/issues/2280#issuecomment-306959926

    import os
    os.environ['PYTHONHASHSEED'] = '0'

    # The below is necessary for starting Numpy generated random numbers
    # in a well-defined initial state.

    np.random.seed(42)

    # The below is necessary for starting core Python generated random numbers
    # in a well-defined state.

    rn.seed(12345)

    # Force TensorFlow to use single thread.
    # Multiple threads are a potential source of
    # non-reproducible results.
    # For further details, see: https://stackoverflow.com/questions/42022950/which-seeds-have-to-be-set-where-to-realize-100-reproducibility-of-training-res

    session_conf = tf.ConfigProto(intra_op_parallelism_threads=1,
                                  inter_op_parallelism_threads=1)

    from keras import backend as K

    # The below tf.set_random_seed() will make random number generation
    # in the TensorFlow backend have a well-defined initial state.
    # For further details, see: https://www.tensorflow.org/api_docs/python/tf/set_random_seed

    tf.set_random_seed(1234)

    sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
    K.set_session(sess)

    # Rest of code follows ...
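The reason seeding helps can be seen with Python's own random module alone. In this minimal sketch, sample_weights is a hypothetical stand-in for any random initialization step; it is not part of Keras.

```python
import random


def sample_weights(seed, n=3):
    """Draw n pseudo-random 'initial weights' from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


first_run = sample_weights(seed=42)
second_run = sample_weights(seed=42)
print(first_run == second_run)
```

Two runs that start from the same seed draw exactly the same numbers, which is the property the longer setup above tries to establish for every source of randomness involved in training.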
