[Paper Translation] Learning Generalizable and Identity-Discriminative Representations for Face Anti-Spoofing
Abstract
Face anti-spoofing (a.k.a. presentation attack detection) has drawn growing attention due to the high security demand in face authentication systems. Existing CNN-based approaches usually recognize spoofing faces well when the training and testing spoofing samples display similar patterns, but their performance drops drastically when testing on spoofing faces from unseen scenes. In this paper, we try to boost the generalizability and applicability of these methods by designing a CNN model with two major novelties. First, we propose a simple yet effective Total Pairwise Confusion (TPC) loss for CNN training, which enhances the generalizability of the learned Presentation Attack (PA) representations. Second, we incorporate a Fast Domain Adaptation (FDA) component into the CNN model to alleviate negative effects brought by domain changes. Besides, our proposed model, which is named Generalizable Face Authentication CNN (GFA-CNN), works in a multi-task manner, performing face anti-spoofing and face recognition simultaneously. Experimental results show that GFA-CNN outperforms previous face anti-spoofing approaches and also well preserves the identity information of input face images.
1. Introduction
Despite the recent noticeable advances, the security of face recognition systems is still vulnerable to Presentation Attacks (PA) with printed photos or replayed videos. To counteract PA, face anti-spoofing [25, 19] has been developed and serves as a pre-step prior to face recognition.
Earlier face anti-spoofing approaches mainly adopt handcrafted features, like LBP [8], HoG [16] and SURF [5], to find the differences between live and spoofing faces. In [27], a CNN was used for face anti-spoofing for the first time, with remarkable performance achieved in intra-database tests. Following their work, a number of CNN-based methods have been proposed, almost all treating face anti-spoofing as a binary (live vs. spoofing) classification problem. However, given the enormous solution space of CNNs, these methods tend to suffer from overfitting and poor generalizability to new PA patterns and environments. In this work, we attempt to enable an anti-spoofing system to be deployed in various environments, i.e. with good generalizability.
For CNN-based methods, an important clue to differentiate live vs. spoofing faces is the spoof pattern, including color distortion, moiré pattern, shape deformation, spoofing artifacts (e.g., reflection), etc. During CNN model training, strong patterns make more contributions, and the resultant model is more discriminative for them. However, if these patterns are absent in the testing data, the performance severely drops. CNN-based methods tend to overfit to some strong spoof patterns and thus suffer poor generalizability [19]. Apart from overfitting, domain shift [18] is also an important reason for the poor generalizability of face anti-spoofing methods. A domain here refers to a certain environment where an image is acquired, consisting of various factors such as illumination, background, facial appearance, camera type, etc. Considering the huge diversity of real-world environments, it is very common that different samples come from different domains. For example, the domains of two paper attacks may be quite different even for the same face if it is reproduced with different pieces of paper (e.g. glossy vs. rough paper). Such domain variance may lead to distribution dissimilarity of different samples in the feature space and cause models to fail on new domains.
Figure 1: Our CNN framework works in a multi-task manner, solving face recognition and face anti-spoofing in one shot. It leverages the Total Pairwise Confusion (TPC) loss and Fast Domain Adaptation (FDA) to enhance the generalizability of the learned Presentation Attack (PA) representations and to improve face anti-spoofing performance in different scenes.
Based on the above observations, we propose a new Total Pairwise Confusion (TPC) loss to balance the contributions of all involved spoof patterns, and also employ a Fast Domain Adaptation (FDA) model [11] to narrow the distribution discrepancy of samples from different domains in the feature space. We then obtain a Generalizable Face Authentication CNN model, GFA-CNN for short. Different from prior methods that take face anti-spoofing as a pre-step of face authentication, our GFA-CNN works in a multi-task manner, performing face anti-spoofing and face recognition simultaneously, as shown in Fig. 1. Since the CNN layers of the two tasks share the same parameters, our model works with high efficiency.
Extensive experiments on five popular benchmarks for face anti-spoofing demonstrate the superiority of our method over the state of the art. Our code and trained models will be made available upon acceptance. Our contributions are summarized as follows:
- We propose a Total Pairwise Confusion (TPC) loss to effectively relieve the overfitting of CNN-based face anti-spoofing models to dataset-specific spoof patterns, which improves the generalizability of face anti-spoofing methods.
- We incorporate the Fast Domain Adaptation (FDA) model to learn more robust Presentation Attack (PA) representations, which reduces domain shift in the feature space.
- We develop a multi-task CNN model for face authentication. Our GFA-CNN performs face anti-spoofing and face recognition jointly.
Figure 2: Architecture of the proposed GFA-CNN. The whole network contains two branches. The face anti-spoofing branch (top) takes the domain-adapted image transferred by FDA as input and is optimized with the TPC loss and the Anti-loss; the face recognition branch (bottom) takes the cropped face image as input and is trained by minimizing the Recg-loss. The structure settings are shown on top of each block, where "ID number" denotes the number of classes involved in training. The two branches share parameters during training.
2. Related Work
Most previous approaches for face anti-spoofing exploit texture differences between live and spoofing faces with pre-defined features such as LBP [8], HoG [16], and SURF [5], which are subsequently fed to a supervised classifier (e.g., SVM, LDA) for binary classification. However, such handcrafted features are very sensitive to different illumination conditions, camera devices, specific identities, etc. Though noticeable performance is achieved under the intra-dataset protocol, samples from a different environment may fail the model. In order to obtain features with better generalizability, some approaches leverage temporal information, e.g. making use of the spontaneous motions of live faces, such as eye-blinking [20] and lip motion [15]. Though these methods are effective against photo attacks, they become vulnerable when attackers simulate these motions through a paper with the eye/mouth regions cut out.
Recently, deep learning based methods [27, 17] have been proposed to address face anti-spoofing. They use CNNs to learn highly discriminative representations by taking face anti-spoofing as a binary classification problem. However, most of them easily suffer from overfitting: currently available public face anti-spoofing datasets are too limited to cover the various potential spoofing types. A very recent work by Liu et al. [19] leverages the depth map and rPPG signal as auxiliary supervision to train the CNN, instead of treating face anti-spoofing as a simple binary classification problem, in order to avoid overfitting. Another critical issue for face anti-spoofing is domain shift. To bridge the gap between training and testing domains, [17] generalizes a CNN to unknown conditions by minimizing the feature distribution dissimilarity across domains, i.e. minimizing the Maximum Mean Discrepancy distance among representations.
Figure 3: Visualization comparison of the learned feature distributions w/ and w/o Ltpc. Without Ltpc, the feature distributions are diverse and subject-specific (left), while with Ltpc they become compact and homogeneous (right). l is the classification hyperplane. Best viewed in color.
To the best of our knowledge, almost all previous works take face anti-spoofing as a pre-step prior to face recognition and address it as a binary classification problem. Compared with previous literature, we solve face anti-spoofing and face recognition in one shot. The work most related to ours is [23], which proposed a two-tier framework to ensure the authenticity of the user to the recognition system, namely, monitoring whether the user passes the biometric system as a live or a spoofing one. It performs authentication based on fingerprint, palm vein print, face, etc., with two separate tiers: the anti-spoofing tier is powered by CNN-learned representations, while the recognition tier is based on pre-defined handcrafted features such as ORB points.
Different from [23], we build our GFA-CNN in a multi-task manner: our framework recognizes the identity of a given face and meanwhile judges whether the face is live or spoofed. It is worth mentioning that for face recognition, our method achieves a single-model accuracy of up to 97.1% on the LFW database [12], which is even comparable to the state of the art. (Translator's note: a 97.1% accuracy on LFW is hardly something to boast about; the claim of being comparable to the state of the art is a bit of a stretch.)
3. Generalizable Face Authentication CNN
3.1. Multi-Task Network Architecture
The proposed Generalizable Face Authentication CNN (GFA-CNN) is able to jointly address face recognition and face anti-spoofing in a mutually boosting way. The network has two branches: the face anti-spoofing branch and the face recognition branch. Each branch consists of 5 blocks of CNN layers and 3 fully connected (FC) layers, and each block contains 3 CNN layers. The parameters are shared between these two branches. The face anti-spoofing branch is trained by minimizing the TPC loss and the face anti-spoofing loss (Anti-loss), while the face recognition branch is trained by optimizing the face recognition loss (Recg-loss). The anti-spoofing branch takes as input raw face images with background, while the recognition branch takes cropped faces as input. Before being fed to the face anti-spoofing branch, the training images are transferred to a target domain specified by a given target-domain image. In the testing phase, each query image is transferred to the target domain and then propagated forward through the network.
The CNN blocks are structured the same as the convolution part of VGG16. Before training, the CNN blocks are first trained on the VGG-face dataset to obtain fundamental weights for face recognition. The FC layers of the face anti-spoofing and face recognition branches have the same structure except for the output dimension of the last FC layer. The face anti-spoofing branch uses 2 dimensions for the last FC layer, while the dimension of the last FC layer in the face recognition branch depends on the number of subjects involved in training. The overall objective function is
$$L = L_{anti} + \lambda_1 L_{recg} + \lambda_2 L_{tpc} \quad (1)$$
where L_anti and L_recg are the cross-entropy losses for face anti-spoofing and face recognition respectively, L_tpc is the Total Pairwise Confusion (TPC) loss, and λ1 and λ2 are the weighting parameters among the different losses.
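As a rough sketch of how the objective in Eqn. (1) might be assembled (the λ values follow Sec. 4.1; the function and tensor names, and the use of the TF2 API, are assumptions of this translation, not the authors' code):

```python
import tensorflow as tf

LAMBDA_1, LAMBDA_2 = 0.1, 2.5e-5  # weighting parameters from Sec. 4.1

def total_loss(anti_logits, anti_labels, recg_logits, recg_labels, l_tpc):
    # Cross-entropy over the 2-dim output of the anti-spoofing branch.
    l_anti = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=anti_labels, logits=anti_logits))
    # Cross-entropy over the identity classes of the recognition branch.
    l_recg = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=recg_labels, logits=recg_logits))
    # Eqn. (1): L = L_anti + λ1·L_recg + λ2·L_tpc
    return l_anti + LAMBDA_1 * l_recg + LAMBDA_2 * l_tpc
```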
3.2. Total Pairwise Confusion Loss
In order to learn Presentation Attack (PA) representations that are adaptable to varying environment conditions, we propose a novel Total Pairwise Confusion (TPC) loss. Our inspiration comes from the pairwise confusion (PC) loss [10], which tackles the overfitting issue in fine-grained visual classification by intentionally introducing confusion into the feature activations. We modify their confusion implementation to make it applicable to the face anti-spoofing task. Our TPC loss is defined as
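(Translator's note: the equation itself did not survive extraction; from the description below it presumably takes the form)

$$L_{tpc} = \frac{1}{M}\sum_{(x_i,\, x_j)} \lVert \psi(x_i) - \psi(x_j) \rVert_2^2$$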
where x_i and x_j are two randomly selected images (a sample pair), M is the total number of sample pairs involved in training, and ψ(x) denotes the representation from the second fully connected layer of the face anti-spoofing branch (see Fig. 2). (Translator's note: for the pairwise confusion loss of [10], see https://blog.csdn.net/Jadelyw/article/details/82988498. Intuitively it resembles a Siamese network or FaceNet's triplet loss, with the difference that features are taken from the penultimate FC layer and compared by Euclidean distance.)
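A minimal sketch of how such a pairwise confusion term could be computed over a mini-batch. Pairing consecutive samples is an assumption; the paper only says pairs are selected randomly:

```python
import tensorflow as tf

def tpc_loss(features):
    """Total Pairwise Confusion over a batch of PA representations.

    features: [2*M, D] tensor of ψ(x) taken from the 2nd FC layer of the
    anti-spoofing branch. Samples are paired as (0,1), (2,3), ..., which
    assumes the batch has already been randomly shuffled.
    """
    f_i, f_j = features[0::2], features[1::2]  # M random pairs
    # Mean squared Euclidean distance between paired representations.
    return tf.reduce_mean(tf.reduce_sum(tf.square(f_i - f_j), axis=1))
```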
Our L_tpc differs from the original PC loss in two aspects: 1) The TPC loss minimizes the distribution distance of random sample pairs drawn from the whole training set, rather than of sample pairs from two different categories, to force the CNN to learn slightly less discriminative features. 2) We minimize the Euclidean distance in the feature space, while the original PC loss minimizes the distance in the probability space (the output of softmax) to make the samples in the same pair have similar conditional probability distributions.
Our modifications are based on the following considerations: 1) With face anti-spoofing taken as a binary classification issue, confusion across categories would not excessively affect the discriminability of the PA feature in differentiating live vs. spoofing samples. 2) Face samples related to the same subject usually cluster in the feature space, and imposing confusion on all samples can compact and homogenize the whole feature distribution (see Fig. 3), thus benefiting generalization performance. 3) For a binary classification problem of such simple structure, regularizing the model within the feature space is more useful than imposing regularization within the output probability space.
Figure 4: The contribution-balancing process for SSFs. Darker color in the FC layer indicates a higher contribution to the classification, lighter color a lower one. Each grid cell represents an SSF. The trade-off game between L_tpc and L_anti balances the contributions of the SSFs to the final decision.
Our L_tpc can effectively improve the generalizability of PA representations, which can be understood as follows. Suppose there are K components in the PA representations, each corresponding to one spoof pattern, called a Spoof-pattern Specific Feature (SSF) in this work. As shown in Fig. 4, different SSFs contribute differently to the final decision. Define the features of a live and a spoofing sample as F^l = (f_1^l, f_2^l, ..., f_K^l) and F^s = (f_1^s, f_2^s, ..., f_K^s), respectively, where f_i^l is the i-th SSF of the live sample and f_i^s is the i-th SSF of the spoofing sample, and the SSFs are ranked by their importance to the live vs. spoofing classification. On one hand, L_anti aims to enlarge the distance between F^l and F^s for better discrimination. On the other hand, L_tpc attempts to narrow the difference between F^l and F^s. As f_1^{l/s} contributes the most to the differentiation of live and spoofing samples, it will be impaired the most by L_tpc, whereas the contributions of less important SSFs, such as f_{K-1}^{l/s} and f_K^{l/s}, will be enhanced by L_anti to offset the impaired discriminative ability. In this trade-off game, the contributions of all SSFs tend to be equalized, meaning more spoof patterns are involved in the decision rather than just a couple of strong spoof patterns specific to the training set. This effectively alleviates overfitting risks: if some spoof patterns disappear at test time, a fair decision can still be reached from the other patterns, ensuring the CNN does not overfit to a few specific features.
3.3. Fast Domain Adaptation
Besides the proposed TPC loss that balances the contribution of each spoof pattern, we also apply FDA to reduce domain shift in the feature space and further improve the generalizability of our framework.
Generally, an image contains two components: content and appearance [21]. The appearance information (e.g., colors, localized structures) makes up the style of images from a certain domain and is mostly represented by features in the bottom layers of a CNN [13]. For face anti-spoofing, the domain variance among face samples may introduce distribution dissimilarity in the feature space and hurt anti-spoofing performance. Here, we employ FDA to alleviate the negative effects brought by domain changes. The FDA consists of an image transformation network f(·), which generates a synthetic image y from a given image x: y = f(x), and a loss network φ(·), which computes a content reconstruction loss L_content and a domain reconstruction loss L_domain. (Translator's note: for [21], see https://blog.csdn.net/z0n1l2/article/details/81677178 and https://blog.csdn.net/sunyao_123/article/details/81294724.)
Figure 5: Example results by FDA. The upper-left and bottom-right images in the middle column are the target-domain images to be transferred to. Images in odd rows are from MSU-MFSD; images in even rows are from Replay-Attack.
Let φ_j(·) be the j-th layer of the loss network φ(·), with output shape C_j × H_j × W_j. The content reconstruction loss penalizes the output image y when it deviates in content from the input x. We thus minimize the Euclidean distance between the feature representations of y and x:
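(Translator's note: the equations in this subsection were lost in extraction; they are reconstructed from the description, following the perceptual-loss formulation of [13].)

$$L_{content} = \frac{1}{C_j H_j W_j} \lVert \phi_j(y) - \phi_j(x) \rVert_2^2$$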
The domain reconstruction loss enables the output image y to have the same domain as the target-domain image y_d. We then minimize the squared Frobenius norm of the difference between the Gram matrices of y and y_d:
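$$L_{domain} = \lVert G_j(y) - G_j(y_d) \rVert_F^2$$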
The Gram matrix is computed by reshaping φ_j into a matrix κ, with G_j = κκ^T / (C_j H_j W_j). The optimal image y* is then generated by solving the following objective function:
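$$y^{*} = f_{P^{*}}(x), \qquad P^{*} = \arg\min_{P}\ \lambda_c\, L_{content}\big(f_P(x),\, x\big) + \lambda_s\, L_{domain}\big(f_P(x),\, y_d\big) \quad (5)$$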
where P denotes the parameters of the network f(·), x is the content image, y = f(x), y_d is the target-domain image, and λ_c and λ_s are scalars. By solving Eqn. (5), x is transferred to y*, preserving the content of x with the domain of y_d.
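A minimal sketch of the two reconstruction losses at one layer j, assuming eagerly evaluated feature maps of shape [H, W, C]; the function names and tensor layout are assumptions:

```python
import tensorflow as tf

def gram_matrix(phi):
    """Gram matrix G_j of a feature map phi with shape [H, W, C]."""
    h, w, c = phi.shape
    kappa = tf.reshape(phi, [h * w, c])  # flatten spatial dimensions
    return tf.matmul(kappa, kappa, transpose_a=True) / float(h * w * c)

def fda_losses(phi_y, phi_x, phi_yd):
    """Content loss (content reconstruction) and domain loss (Gram-based)
    at one layer j, for output y, input x and target-domain image y_d."""
    h, w, c = phi_y.shape
    l_content = tf.reduce_sum(tf.square(phi_y - phi_x)) / float(h * w * c)
    l_domain = tf.reduce_sum(tf.square(gram_matrix(phi_y) - gram_matrix(phi_yd)))
    return l_content, l_domain
```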
Fig. 5 shows some of our domain-transferred samples. The target-domain image is sampled from the training data. A detailed analysis of the feature divergence between domains w/ and w/o FDA is provided in Sec. 4.2.
4. Experiments
4.1. Experimental Setup
Datasets. We evaluate GFA-CNN on five face anti-spoofing benchmarks: CASIA-FASD [28], Replay-Attack [8], MSU-MFSD [26], Oulu-NPU [7] and SiW [19]. CASIA-FASD and MSU-MFSD are small datasets, containing 50 and 35 subjects, respectively. Oulu-NPU and SiW are high-resolution databases published very recently. Oulu-NPU contains 4 testing protocols: Protocol 1 evaluates environment condition variations; Protocol 2 examines the influence of different spoofing mediums; Protocol 3 estimates the effects of different input cameras; Protocol 4 considers all the challenges above. We conduct intra-database tests on MSU-MFSD and Oulu-NPU, respectively. Cross-database tests are performed between CASIA-FASD and Replay-Attack and between MSU-MFSD and Replay-Attack, respectively. Face recognition performance is evaluated on SiW, which contains 165 subjects with large variations in pose, illumination, expression (PIE), and distance from subject to camera. LFW, the most widely used benchmark for face recognition, is also used to evaluate face recognition performance.
Implementation Details. The proposed GFA-CNN is implemented with TensorFlow [1]. We use the Adam optimizer with a learning rate starting at 0.0003 and decayed by half every 2,000 steps. The batch size is set to 32. λ1 and λ2 in Eqn. (1) are set to 0.1 and 2.5e-5, respectively. All experiments are performed according to the protocols provided with the datasets. The CNN layers are pre-trained on the VGG-face dataset [22]. For data balance, we triple the live samples in the training sets of CASIA-FASD, MSU-MFSD and Replay-Attack with horizontal and vertical flipping, while doubling the live samples in the training set of SiW by flipping horizontally only.
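The described schedule maps naturally onto a staircase exponential decay. A sketch assuming the TF2/Keras API (which the paper predates):

```python
import tensorflow as tf

# Halve the learning rate every 2,000 steps, starting from 3e-4 (Sec. 4.1).
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=3e-4, decay_steps=2000, decay_rate=0.5,
    staircase=True)
optimizer = tf.keras.optimizers.Adam(learning_rate=schedule)
```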
Evaluation Metrics. We have two evaluation protocols, intra-test and cross-test, which test samples from and not from the domain of the training set, respectively. We report our results with the following metrics. Intra-test evaluation: Equal Error Rate (EER), Attack Presentation Classification Error Rate (APCER), Bona Fide Presentation Classification Error Rate (BPCER), and ACER = (APCER + BPCER)/2. Cross-test evaluation: Half Total Error Rate (HTER).
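For reference, a trivial sketch of the aggregate metric:

```python
def acer(apcer: float, bpcer: float) -> float:
    """Average Classification Error Rate: mean of the attack (APCER)
    and bona fide (BPCER) presentation classification error rates."""
    return (apcer + bpcer) / 2.0
```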
Table 1: Ablation study (HTER %). "+" means the corresponding component is used, while "-" indicates removing the component. The numbers in bold are the best results.
4.2. Ablation Study
(Translator's note: an ablation study is an experiment designed to test whether the structures proposed in a model are actually effective. If you propose some component, you compare the network with and without it to see whether it helps the final result. In short, it is a model simplification test: by Occam's razor, if a simple method and a complex one achieve the same effect, the simple one is better and more reliable.)
Figure 6: Feature divergence comparison between MSU-MFSD and Replay-Attack. The numbers on the x-axis correspond to the CNN layers of VGG16.
We perform an ablation analysis to reveal the roles of the TPC loss and FDA in our framework. We retrain the proposed network by adding/ablating TPC and FDA. As shown in Tab. 1, if TPC is removed, the HTER of the intra-test on MFSD drops by 2.9% (w/ FDA) and 4.1% (w/o FDA), respectively. Since Replay-Attack is usually free of severe overfitting, it is reasonable to see that the improvement there is not significant: 0.3% (w/ FDA) and 0.6% (w/o FDA) on HTER.
For the cross-test, if TPC is ablated, the HTER dramatically worsens by over 10% for MFSD → Replay and over 8% for Replay → MFSD, regardless of whether FDA is used. The best cross-test results are achieved by using both TPC and FDA, indicating FDA can further improve the generalizability of the proposed method.
To evaluate the feature divergence between domains w/ and w/o FDA, we calculate it via the symmetric KL divergence. Similar to [21], we denote the mean value of a channel from the feature embedding of the CNN as F. Given a Gaussian distribution for F, with mean μ and variance σ², the symmetric KL divergence of this channel between domains A and B is
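(Translator's note: the two equations below were lost in extraction; they are reconstructed following [21].)

$$D(F_A \Vert F_B) = KL(F_A \Vert F_B) + KL(F_B \Vert F_A), \qquad KL(F_A \Vert F_B) = \log\frac{\sigma_B}{\sigma_A} + \frac{\sigma_A^2 + (\mu_A - \mu_B)^2}{2\sigma_B^2} - \frac{1}{2}$$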
Denote by D(F_i^A ∥ F_i^B) the symmetric KL divergence of the i-th channel. The average feature divergence of a layer is then defined as
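$$D(L_A \Vert L_B) = \frac{1}{C} \sum_{i=1}^{C} D(F_i^A \Vert F_i^B) \quad (8)$$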
where C is the number of channels in this layer. This metric measures the distance between the feature distributions of domains A and B. We calculate the feature divergence of each layer in a CNN model for comparison. In particular, we randomly select 5,000 face samples from MSU-MFSD and from Replay-Attack, respectively. Each dataset is considered as one domain. These samples are then fed to a pre-trained VGG16 [24] model to calculate the KL divergence at each layer following Eqn. (8). The comparison results are shown in Fig. 6. As can be seen, with FDA the feature divergence between MSU-MFSD and Replay-Attack is significantly reduced.
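A small sketch of this divergence measure under the Gaussian assumption above; the array shapes are assumptions:

```python
import numpy as np

def sym_kl_gaussian(mu_a, var_a, mu_b, var_b, eps=1e-8):
    """Symmetric KL divergence between two 1-D Gaussians (one channel)."""
    def kl(mu1, v1, mu2, v2):
        return 0.5 * np.log(v2 / v1) + (v1 + (mu1 - mu2) ** 2) / (2 * v2) - 0.5
    va, vb = var_a + eps, var_b + eps
    return kl(mu_a, va, mu_b, vb) + kl(mu_b, vb, mu_a, va)

def layer_divergence(feats_a, feats_b):
    """Average per-channel divergence of one layer (Eqn. 8).

    feats_a, feats_b: [N, C] arrays of channel means over N samples
    drawn from domain A and domain B, respectively.
    """
    mu_a, var_a = feats_a.mean(0), feats_a.var(0)
    mu_b, var_b = feats_b.mean(0), feats_b.var(0)
    return np.mean([sym_kl_gaussian(mu_a[i], var_a[i], mu_b[i], var_b[i])
                    for i in range(feats_a.shape[1])])
```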
4.3. Face Anti-spoofing Evaluation
Intra-Test. We perform intra-tests on MSU-MFSD and Oulu-NPU. Tab. 2 shows the comparison of our method with other state-of-the-art methods on MSU-MFSD. For Oulu-NPU, we refer to the face anti-spoofing competition results in [2] and use the best two entries for each protocol for comparison. All results are reported in Tab. 3.
As shown in Tab. 2, GFA-CNN achieves an EER of 7.5%, ranking 3rd among all the compared methods. This result is satisfactory considering that GFA-CNN is not designed blindly to pursue high performance in the intra-test setting. In our experiments, we find the proposed TPC loss may slightly decrease intra-test performance, mainly because the TPC loss impairs the contributions of the few strongest SSFs w.r.t. the training datasets. The weakening of these dataset-specific features may in turn affect intra-test performance (however, it may improve performance in the cross-test). According to Tab. 3, our method achieves the lowest ACER in 3 out of 4 protocols. For the most challenging Protocol 4, we achieve an ACER of 8.9%, which is 1.1% lower than the best performer.
Cross-Test. To demonstrate the strong generalizability of GFA-CNN, we perform cross-tests on CASIA-FASD, Replay-Attack, and MSU-MFSD, comparing with other state-of-the-art methods. We adopt the most widely used cross-test settings: CASIA-FASD vs. Replay-Attack and MSU-MFSD vs. Replay-Attack, and report comparison results in Tab. 4. As can be seen, GFA-CNN achieves the lowest HTERs in the cross-tests CASIA → Replay, MFSD → Replay and Replay → MFSD. Especially for Replay → MFSD, GFA-CNN reduces the cross-test HTER by 8.3% compared with the best state-of-the-art method.
However, we also observe that GFA-CNN has a relatively worse HTER compared with the best method on Replay-Attack → CASIA-FASD. This is probably due to the "quality degradation" by FDA when the resolution of a source-domain image to be transferred is much higher than that of the target-domain image. During the cross-test on Replay-Attack → CASIA-FASD, the target-domain image is selected from Replay-Attack with a low resolution of 320 × 240, whereas CASIA-FASD contains quite a number of images with a high resolution of 720 × 1280. Such a "resolution gap" leads to the "quality degradation" of FDA, as shown in the rightmost image in Fig. 7.
Table 4: Cross-test results (HTER %) on CASIA-FASD, Replay-Attack, and MSU-MFSD. "-" indicates the corresponding result is unavailable. The numbers in bold are the best results.
Figure 7: Results transferred by FDA at different resolutions. The top-left image is the target-domain image. For the other images of each block, the left one is the original image and the right one is the transferred image. The green number at the top left of each image indicates its resolution.
4.4. Face Recognition Evaluation
We further evaluate the face recognition performance of our GFA-CNN on SiW and LFW. Since our method is not targeted specifically at face recognition, we only adopt VGG-16 as the baseline. On LFW, we follow the provided protocol to perform testing. On SiW, we use 90 subjects for training and the other 75 subjects for testing, which is its default data split. This dataset also provides a frontal legacy face image for each subject. In the testing phase, we select the legacy image of each subject in the testing set as the gallery face, and use all images in the testing set (both live and spoofing) as the probe faces.
The ROC curves of face verification are shown in Fig. 8. As can be observed, GFA-CNN achieves results competitive with VGG16 on LFW: 97.1% vs. 97.6%, respectively. However, when testing on SiW, the accuracy drop of GFA-CNN is much smaller than that of VGG16: the accuracy of GFA-CNN reduces by 4.5%, while VGG16 drops by 14%. The degraded performance is mainly due to face reproduction by spoofing mediums, in which some of the finer facial details may be lost. Nevertheless, GFA-CNN still achieves satisfactory performance compared with VGG16. This is mainly because the face anti-spoofing and face recognition tasks mutually enhance each other, making the representations learned for face recognition less sensitive to spoof patterns.
4.5. Discussions on Multi-task Setting
In this subsection, we investigate how multi-task learning affects model performance for face anti-spoofing. We retrain our model without the face recognition branch, keep the hyper-parameters unchanged, and evaluate with the same protocol as for GFA-CNN. From the experiments, we observe that multi-task training slightly decreases the intra-test performance of face anti-spoofing (dropping 2.5% and 0.3% on MSU-MFSD and Replay-Attack, respectively). This is reasonable, since the single model learns to perform two different tasks. However, two advantages are achieved compared with single-task training. Firstly, the training process becomes more stable, with the Anti-loss decreasing gradually rather than dropping sharply after some steps as in single-task training, suggesting the multi-task setting helps overcome overfitting. Secondly, as shown in Fig. 8, multi-task training helps learn face representations that are less sensitive to spoof patterns for face recognition. This mainly benefits from sharing parameters in the convolutional layers, giving more generic fused features.
5. Conclusion
This paper presents a novel CNN model to jointly address face recognition and face anti-spoofing in a mutually boosting way. In order to learn more generalizable Presentation Attack (PA) representations for face anti-spoofing, we propose a novel Total Pairwise Confusion (TPC) loss to balance the contribution of each spoof pattern, preventing the PA representations from overfitting to dataset-specific spoof patterns. Fast Domain Adaptation (FDA) is also incorporated into our framework to reduce the distribution dissimilarity of face samples from different domains, further enhancing the robustness of the PA representations. Extensive experiments on both face anti-spoofing and face recognition datasets show that our GFA-CNN achieves not only superior performance for face anti-spoofing in cross-tests, but also high accuracy for face recognition.
References
[1] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, et al. TensorFlow: a system for large-scale machine learning. In OSDI, volume 16, pages 265–283, 2016.
[2] Z. Boulkenafet, J. Komulainen, Z. Akhtar, A. Benlamoudi, D. Samai, S. E. Bekhouche, A. Ouafi, F. Dornaika, A. Taleb-Ahmed, L. Qin, et al. A competition on generalized software-based face presentation attack detection in mobile scenarios. In IJCB, pages 688–696, 2017.
[3] Z. Boulkenafet, J. Komulainen, and A. Hadid. Face anti-spoofing based on color texture analysis. In ICIP, pages 2636–2640, 2015.
[4] Z. Boulkenafet, J. Komulainen, and A. Hadid. Face spoofing detection using colour texture analysis. T-IFS, 11(8):1818–1830, 2016.
[5] Z. Boulkenafet, J. Komulainen, and A. Hadid. Face anti-spoofing using speeded-up robust features and Fisher vector encoding. IEEE Signal Processing Letters, 24(2):141–145, 2017.
[6] Z. Boulkenafet, J. Komulainen, and A. Hadid. On the generalization of color texture-based face anti-spoofing. Image and Vision Computing, 77:1–9, 2018.
[7] Z. Boulkenafet, J. Komulainen, L. Li, X. Feng, and A. Hadid. Oulu-NPU: A mobile face presentation attack database with real-world variations. In FG, pages 612–618, 2017.
[8] I. Chingovska, A. Anjos, and S. Marcel. On the effectiveness of local binary patterns in face anti-spoofing. In BIOSIG, 2012.
[9] T. de Freitas Pereira, A. Anjos, J. M. De Martino, and S. Marcel. Can face anti-spoofing countermeasures work in a real world scenario? In ICB, pages 1–8, 2013.
[10] A. Dubey, O. Gupta, P. Guo, R. Raskar, R. Farrell, and N. Naik. Pairwise confusion for fine-grained visual classification. In ECCV, pages 70–86, 2018.
[11] L. Engstrom. Fast style transfer. https://github.com/lengstrom/fast-style-transfer/, 2016.
[12] G. B. Huang, M. Mattar, T. Berg, and E. Learned-Miller. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. In ECCVW, 2008.
[13] J. Johnson, A. Alahi, and L. Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. In ECCV, pages 694–711, 2016.
[14] A. Jourabloo, Y. Liu, and X. Liu. Face de-spoofing: Anti-spoofing via noise modeling. arXiv preprint arXiv:1807.09968, 2018.
[15] K. Kollreider, H. Fronthaler, M. I. Faraj, and J. Bigun. Real-time face detection and motion analysis with application in liveness assessment. T-IFS, 2(3):548–558, 2007.
[16] J. Komulainen, A. Hadid, and M. Pietikainen. Context based face anti-spoofing. In BTAS, pages 1–8, 2013.
[17] H. Li, P. He, S. Wang, A. Rocha, X. Jiang, and A. C. Kot. Learning generalized deep feature representation for face anti-spoofing. T-IFS, 13(10):2639–2652, 2018.
[18] H. Li, W. Li, H. Cao, S. Wang, F. Huang, and A. C. Kot. Unsupervised domain adaptation for face anti-spoofing. T-IFS, 13(7):1794–1809, 2018.
[19] Y. Liu, A. Jourabloo, and X. Liu. Learning deep models for face anti-spoofing: Binary or auxiliary supervision. In CVPR, pages 389–398, 2018.
[20] G. Pan, L. Sun, Z. Wu, and S. Lao. Eyeblink-based anti-spoofing in face recognition from a generic webcamera. 2007.
[21] X. Pan, P. Luo, J. Shi, and X. Tang. Two at once: Enhancing learning and generalization capacities via IBN-Net. arXiv preprint arXiv:1807.09441, 2018.
[22] O. M. Parkhi, A. Vedaldi, A. Zisserman, et al. Deep face recognition. In BMVC, volume 1, page 6, 2015.
[23] M. Sajjad, S. Khan, T. Hussain, K. Muhammad, A. K. Sangaiah, A. Castiglione, C. Esposito, and S. W. Baik. CNN-based anti-spoofing two-tier multi-factor authentication system. Pattern Recognition Letters, 2018.
[24] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
[25] X. Tan, Y. Li, J. Liu, and L. Jiang. Face liveness detection from a single image with sparse low rank bilinear discriminative model. In ECCV, pages 504–517, 2010.
[26] D. Wen, H. Han, and A. K. Jain. Face spoof detection with image distortion analysis. T-IFS, 10(4):746–761, 2015.
[27] J. Yang, Z. Lei, and S. Z. Li. Learn convolutional neural network for face anti-spoofing. arXiv preprint arXiv:1408.5601, 2014.
[28] Z. Zhang, J. Yan, S. Liu, Z. Lei, D. Yi, and S. Z. Li. A face antispoofing database with diverse attacks. In ICB, pages 26–31, 2012.