The DeepFM Algorithm
1: Background and Characteristics
To learn low-order and high-order feature interactions at the same time, the Wide & Deep model was proposed earlier. It combines a linear model (the Wide part) with a deep model (the Deep part). The two parts need different inputs, and the input of the Wide part still depends on manual feature engineering. Models of this kind generally suffer from two problems:
1. they are biased toward either low-order or high-order feature interactions and cannot capture both well at the same time;
2. they still rely on expert knowledge and manual feature engineering.
DeepFM was proposed to address exactly these two problems, and it brings several improvements: it needs no pre-training, it learns low-order and high-order feature interactions simultaneously, and it shares the feature embedding between its FM and Deep parts, so manual feature engineering is no longer required.
2: Model Structure
The main idea of DeepFM is as follows: an FM component models the low-order (first- and second-order) feature interactions, a Deep (DNN) component models the high-order interactions, both components share the same embedding of the sparse input features, and their outputs are added together to produce the final prediction.
Two main groups of parameters need to be trained: the embedding vectors (which serve as the FM latent vectors and, at the same time, as the input of the DNN) and the network weights (the FM first-order weights together with the weight matrices and biases of the DNN layers).
The overall architecture is shown in the figure below:
As the architecture shows, DeepFM consists of an FM component and a DNN component, so the model's final output is composed of these two parts:
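As in the DeepFM paper, the prediction sums the two component outputs and applies a sigmoid:

$$\hat{y} = \mathrm{sigmoid}\left(y_{FM} + y_{DNN}\right), \qquad \hat{y} \in (0, 1)$$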
Below, the architecture is split apart so that these two components can be looked at one by one.
2.1: FM Component
The output of the FM component is the following:
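With first-order weights $w$, latent vectors $V_{j}$ of dimension $k$, and $d$ input features, the FM component computes, as in the original paper:

$$y_{FM} = \langle w, x \rangle + \sum_{j_1=1}^{d} \sum_{j_2=j_1+1}^{d} \langle V_{j_1}, V_{j_2} \rangle \, x_{j_1} x_{j_2}$$

The first term is the linear (first-order) part and the second term captures all pairwise (second-order) interactions.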
這里需要注意兩點(diǎn):
FM Component summary: it covers the first-order (linear) terms and the second-order (pairwise) interactions, and its latent vectors double as the embeddings shared with the Deep component.
2.2: Deep Component
The role of the DNN here is to build high-order combinatorial features. The black lines in the network are fully connected layers, whose parameters have to be learned by the neural network. One important property: the input of the DNN is also the embedding vectors. This is exactly what "weight sharing" refers to, as sketched right below.
這里假設(shè)α(0)=(e1,e2,...em)表示 embedding層的輸出,那么α(0)作為下一層 DNN隱藏層的輸入,其前饋過程如下:
3: Summary
Advantages of DeepFM:
1. it does not need any pre-training;
2. it learns both low-order and high-order feature interactions;
3. it introduces a sharing strategy for the feature embedding, so no manual feature engineering is required.
The most important of these is the shared embedding: because the FM latent vectors and the DNN input are the same vectors, the whole model can be trained end-to-end from the raw sparse features, without hand-crafted cross features.
4: Core Code
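The "second order term" block in the code below relies on the standard FM reformulation, which rewrites the pairwise sum as the difference between the square of a sum and the sum of squares, bringing the cost down to $O(kn)$:

$$\sum_{i=1}^{n}\sum_{j=i+1}^{n} \langle v_i, v_j \rangle\, x_i x_j = \frac{1}{2} \sum_{f=1}^{k} \left[ \left( \sum_{i=1}^{n} v_{i,f}\, x_i \right)^{2} - \sum_{i=1}^{n} v_{i,f}^{2}\, x_i^{2} \right]$$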
# ---------- first order term ----------
self.y_first_order = tf.nn.embedding_lookup(self.weights["feature_bias"],
                                            self.feat_index)                         # None * F * 1
self.y_first_order = tf.reduce_sum(tf.multiply(self.y_first_order, feat_value), 2)   # None * F
self.y_first_order = tf.nn.dropout(self.y_first_order, self.dropout_keep_fm[0])      # None * F

# ---------- second order term ----------
# sum_square part
self.summed_features_emb = tf.reduce_sum(self.embeddings, 1)                          # None * K
self.summed_features_emb_square = tf.square(self.summed_features_emb)                 # None * K

# square_sum part
self.squared_features_emb = tf.square(self.embeddings)
self.squared_sum_features_emb = tf.reduce_sum(self.squared_features_emb, 1)           # None * K

# second order: 0.5 * ((sum of embeddings)^2 - sum of squared embeddings)
self.y_second_order = 0.5 * tf.subtract(self.summed_features_emb_square,
                                        self.squared_sum_features_emb)                # None * K
self.y_second_order = tf.nn.dropout(self.y_second_order, self.dropout_keep_fm[1])     # None * K

# ---------- Deep component ----------
self.y_deep = tf.reshape(self.embeddings,
                         shape=[-1, self.field_size * self.embedding_size])           # None * (F*K)
self.y_deep = tf.nn.dropout(self.y_deep, self.dropout_keep_deep[0])
for i in range(0, len(self.deep_layers)):
    self.y_deep = tf.add(tf.matmul(self.y_deep, self.weights["layer_%d" % i]),
                         self.weights["bias_%d" % i])                                 # None * layer[i]
    if self.batch_norm:
        self.y_deep = self.batch_norm_layer(self.y_deep, train_phase=self.train_phase,
                                            scope_bn="bn_%d" % i)
    self.y_deep = self.deep_layers_activation(self.y_deep)
    self.y_deep = tf.nn.dropout(self.y_deep, self.dropout_keep_deep[1 + i])           # dropout at each Deep layer

# ---------- DeepFM: concatenate the parts and project to one output ----------
if self.use_fm and self.use_deep:
    concat_input = tf.concat([self.y_first_order, self.y_second_order, self.y_deep], axis=1)
elif self.use_fm:
    concat_input = tf.concat([self.y_first_order, self.y_second_order], axis=1)
elif self.use_deep:
    concat_input = self.y_deep
self.out = tf.add(tf.matmul(concat_input, self.weights["concat_projection"]),
                  self.weights["concat_bias"], name="output")
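The excerpt above assumes that self.embeddings and feat_value have already been built from the sparse input. In common TensorFlow DeepFM implementations this preamble looks roughly like the sketch below (the exact names are an assumption, not part of the article's excerpt; None denotes the batch dimension):

# Sketch of the assumed preamble (not shown in the excerpt above)
# self.feat_index: None * F  (index of the active feature in each field)
# self.feat_value: None * F  (its raw value)
self.embeddings = tf.nn.embedding_lookup(
    self.weights["feature_embeddings"], self.feat_index)          # None * F * K
feat_value = tf.reshape(self.feat_value,
                        shape=[-1, self.field_size, 1])           # None * F * 1
self.embeddings = tf.multiply(self.embeddings, feat_value)        # None * F * K

With this in place, the first-order, second-order, and Deep blocks all read from the same self.embeddings tensor, which is the weight sharing discussed in Section 2.2.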