ERK'2022, Portorož, 381-384
Face Morphing Attack Detection Using
Privacy-Aware Training Data
Marija Ivanovska¹, Andrej Kronovšek², Peter Peer², Vitomir Štruc¹, Borut Batagelj²
¹ Faculty of Electrical Engineering, University of Ljubljana, Tržaška cesta 25, SI-1000 Ljubljana, Slovenia
² Faculty of Computer and Information Science, University of Ljubljana, Večna pot 113, SI-1000 Ljubljana, Slovenia
E-mail: marija.ivanovska@fe.uni-lj.si
Abstract
Images of morphed faces pose a serious threat to face
recognition–based security systems, as they can be used
to illegally verify the identity of multiple people with a
single morphed image. Modern detection algorithms learn
to identify such morphing attacks using authentic images
of real individuals. This approach raises various pri-
vacy concerns and limits the amount of publicly avail-
able training data. In this paper, we explore the efficacy
of detection algorithms that are trained only on faces of
non–existing people and their respective morphs. To this
end, two dedicated algorithms are trained with synthetic
data and then evaluated on three real-world datasets:
FRLL-Morphs, FERET-Morphs and FRGC-Morphs. Our
results show that synthetic facial images can be successfully
employed for training the detection algorithms, and that the
trained models generalize well to real-world scenarios.
1 Introduction
Nowadays, the vast majority of applications for person
identity verification rely on Face Recognition Systems
(FRSs), which match a human face to an entry from a
database of faces. Modern FRSs have proven to be highly
accurate when genuine faces are presented to the system
[7]. They are however prone to various attacks, whose
aim is to gain illegal access by false authentication [8].
Lately, face morphs have become a growing concern
for the reliability of face verification systems. A face
morph is a composite image generated from two (or more)
facial images of distinct subjects. Recent advances in
generative deep models have enabled an almost effortless
generation of realistic and high-quality morphed facial
images. Such images can be utilized to verify all identi-
ties that have been used in the morph-generation process.
A successful detection of face morphing attacks is there-
fore critical for the prevention of illegal activities [4].
Supported in part by the ARRS project J2-1734 (B), and the ARRS
research programmes P2-0250 (B) and P2-0214 (B).
[Figure 1 image omitted; labels: StyleGAN, morphing, privacy-aware training data, bona fide, morphing attack, MAD, real-world testing data]
Figure 1: To avoid privacy-related concerns in the development
of morphing attack detectors (MADs), we explore the idea of
using synthetic training faces of non-existing people. The trained
MADs are then tested on real-world datasets.
Various Morphing Attack Detection (MAD) algorithms
have been proposed over the years to automatically
distinguish real from morphed faces. However, regardless
of the detection technique, the training of these models
requires a massive database of genuine face images.
Training protocols therefore raise various privacy-related
concerns and limit the amount of publicly available training
data that can be used to learn MAD models. In this
paper, we address the privacy issues related to MADs
by exploring the idea of using synthetic training data,
as illustrated in Figure 1. For this purpose, we use the
SMDD [3] dataset, where StyleGAN2 [10], a state-of-
the-art Generative Adversarial Network (GAN), was uti-
lized to generate bona fide face images of non-existing
people. These images were then used for the generation
of the face morphs. With this dataset, we train two power-
ful binary classifiers, Xception and HRNet, and evaluate
their detection performance on three real-world datasets –
FRLL-Morphs, FERET-Morphs and FRGC-Morphs. The
results of the evaluation show that well-performing MAD
models can be learned from synthetic data alone, and that
the models generalize well across three diverse real-world
morph datasets.
2 Related work
Existing morphing-attack-detection models can in gen-
eral be categorized as single–image (S-MAD) or differ-
ential (D-MAD) MADs, depending on whether the face
morph is examined independently or is compared to a ref-
erence sample. While D-MADs can be very accurate in
closed–set problems, S-MADs aim to detect attacks with-
out any prior knowledge about human identities. In this
section, we only review S-MADs, since they are more
closely related to our work.
Regardless of the face morphing technique used, the
generated morphs usually contain image irregularities such
as artifacts, noise, pixel discontinuities, distortions, spectrum
discrepancies, inconsistent illumination, etc. In the
past, shallow algorithms that extract photo-response
non-uniformity (PRNU) noise [20] or perform reflection
analysis [21] have been successfully employed
for the detection of morphing attacks. Other hand-crafted
MAD methods have used texture-based descriptors,
such as LBP [13], LPQ [14] or SURF [11]. Although
these methods achieved promising results, they
were shown to have limited generalization capabilities.
Moreover, as the face morphing techniques improved over
time, the performance of shallow methods became less
competitive, as they struggled to detect modern, deep-
learning generated or heavily post–processed face morphs.
More recent MAD models take advantage of the de-
velopment of data–driven, deep-learning algorithms. Rag-
havendra et al. [17] were amongst the first to propose
transfer learning. In their work, attacks are detected with
a simple, fully–connected binary classifier, fed with fused
VGG19 and AlexNet features, pretrained on ImageNet.
Wandzik et al. [23], on the other hand, achieve highly
accurate results with features from general–purpose face
recognition systems (FRSs) combined with an SVM.
Ramachandra et al. [18] utilize Inception in a similar manner,
while Damer et al. [4] argue that pixel-wise supervision,
where each pixel is classified as bona fide or as a
morphing attack, is superior when used in addition to the
binary, image-level objective. Recently, MixFaceNet [1]
by Boutros et al. achieved state–of–the–art results in dif-
ferent face–related detection tasks, including face morph-
ing detection [3]. This model represents a highly efficient
architecture that captures different levels of face attack
cues by using differently sized convolutional kernels.
3 Methods
We consider two different classification models, Xcep-
tion and HRNet, to detect face morphing attacks in this
study and train them using synthetic data only. The two
models represent the entries from the University of Ljubl-
jana to the recent Face Morphing Attack Detection Com-
petition based on Privacy-aware Synthetic Training Data
(SYN-MAD), held in conjunction with the 2022 Interna-
tional Joint Conference on Biometrics (IJCB 2022) [8],
which achieved the best and third best overall performance
among all submitted entries.
Xception [2] is a convolutional neural network (CNN)
that updates and simplifies the architecture of the Incep-
tionV3 model [22] by replacing the Inception modules
with depth–wise separable convolutions. We use Xcep-
tion as a feature extractor, while the binary classification
is performed by a fully connected two–layer network.
The output layer consists of 2 neurons, followed by a softmax
activation function. Similar to previous research, we
use cross-entropy as the learning objective.
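To make the effect of depth-wise separable convolutions concrete, the following minimal sketch compares the number of weights in a standard convolution with that of its separable counterpart. The layer size (256 input and output channels, 3 × 3 kernels) is a hypothetical example and is not taken from the Xception configuration used in this work; biases are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def separable_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depth-wise k x k convolution followed by a 1 x 1 point-wise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes the channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer: 256 -> 256 channels with 3 x 3 kernels.
    print("standard: ", standard_conv_params(256, 256, 3))
    print("separable:", separable_conv_params(256, 256, 3))
```

For this hypothetical layer the separable variant needs roughly a ninth of the weights, which illustrates why such layers make the architecture lighter.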
[Figure 2 image omitted; panels: SMDD, FRLL, FERET, FRGC; labels: bona fide, morphing attack]
Figure 2: Examples of bona fide and morphing attack images
from the SMDD [3] training dataset and the testing datasets
FERET-Morphs, FRLL-Morphs and FRGC-Morphs [19].
Table 1: Number of bona fide images (BF), number of morphing
attacks (MA) generated by the morphing methods OpenCV (OCV),
FaceMorpher (FM), StyleGAN (SG), AMSL and Webmorph (WM),
and the image size of samples in each dataset.
Dataset   Image size    BF     OCV    FM     SG     AMSL   WM
FRLL-M    1350 × 1350   204    1221   1222   1222   2175   1221
FERET-M   512 × 768     1413   529    529    529    /      /
FRGC-M    227 × 277     3167   964    964    964    /      /
HRNet [24] is also a CNN that, unlike other networks,
maintains high-resolution representations of the
input sample throughout the whole feedforward process.
Such an architecture contributes to more descriptive image
representations, which has been shown to improve the results
of different classification tasks. In our experiments, we
replace the classification head of HRNet with a two–layer
classification module, to perform binary detection of bona
fide images and morphing attacks. The output layer con-
sists of only one neuron, followed by a sigmoid activation
function. In the training phase, the parameters are opti-
mized using the binary cross–entropy loss.
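The two classification heads described above differ only in how the binary decision is parameterized. The sketch below, written with the standard library only, computes the cross-entropy of a two-neuron softmax output and the binary cross-entropy of a single sigmoid output. The convention that class 1 denotes a morphing attack is an assumption made for illustration, as are the logit values.

```python
import math


def softmax_cross_entropy(logits, target):
    """Cross-entropy for a two-neuron softmax output; target is a class index."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def sigmoid_bce(logit, target):
    """Binary cross-entropy for a single sigmoid output; target is 0 or 1."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))


if __name__ == "__main__":
    # Assumed convention: class 0 = bona fide, class 1 = morphing attack.
    two_neuron_logits = [0.3, 1.5]
    single_logit = two_neuron_logits[1] - two_neuron_logits[0]
    print(softmax_cross_entropy(two_neuron_logits, 1))
    print(sigmoid_bce(single_logit, 1))
```

The two losses coincide when the single logit equals the difference of the two softmax logits, so the choice between the heads is mainly a matter of implementation convenience.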
4 Experiments
4.1 Datasets
We use one synthetic and three publicly available real–
world datasets in this work. The training is done exclu-
sively with the synthetic data, while the evaluation is per-
formed on three commonly used face morphing datasets.
Training data. For training, we use the SMDD dataset [3],
provided by the organizers of the SYN-MAD competition [8].
The dataset consists of 25,000 bona fide and
15,000 morphed images of size 256 × 256 pixels. Bona
fide instances represent carefully selected images from a
set of randomly generated StyleGAN2 [10, 9] faces. A
separate, non-overlapping StyleGAN2 image set was
used for the generation of face morphing attacks. Face
morphs were created using the landmark-based morphing
technique from OpenCV¹. A few selected samples
from the SMDD dataset are presented in Figure 2.
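Landmark-based morphing generally combines two ingredients: the facial landmarks of both subjects are interpolated to obtain a common shape, and the pixel values of the two shape-aligned images are blended. The sketch below shows only these two ingredients on toy data and deliberately omits the triangle-wise warping step. It is an illustration under these assumptions, not the procedure used to build SMDD; the function names and the blending factor of 0.5 are hypothetical.

```python
def interpolate_landmarks(pts_a, pts_b, alpha=0.5):
    """Linearly interpolate two lists of (x, y) landmarks."""
    return [((1 - alpha) * xa + alpha * xb, (1 - alpha) * ya + alpha * yb)
            for (xa, ya), (xb, yb) in zip(pts_a, pts_b)]


def cross_dissolve(img_a, img_b, alpha=0.5):
    """Blend two equally sized, already aligned grayscale images (lists of rows)."""
    return [[(1 - alpha) * pa + alpha * pb for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(img_a, img_b)]


if __name__ == "__main__":
    # Toy landmarks and 1 x 2 "images" standing in for two aligned faces.
    print(interpolate_landmarks([(0, 0), (10, 20)], [(4, 8), (20, 40)]))
    print(cross_dissolve([[0, 100]], [[200, 100]]))
```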
Testing data. The trained MAD models are tested on
three diverse morphing datasets proposed by Sarkar et al.
[19], i.e. FRLL-Morphs, FERET-Morphs and FRGC-Morphs.
All face morphs were created by combining
bona fide images from their respective face datasets, i.e.
FRLL [5, 12], FERET [16] and FRGC [15]. To generate
landmark-based morphs, the authors used OpenCV
and FaceMorpher², while deep-learning-based morphs
were generated with StyleGAN2. In addition to these methods,
AMSL [12] and Webmorph³ were also used, but only
for the images from the FRLL dataset. Information about
image sizes and the number of samples per morphing
method is given in Table 1. Selected samples from all
three datasets are presented in Figure 2.
¹ https://learnopencv.com/face-morph-using-opencv-cpp-python/
² https://github.com/alyssaq/face_morpher
³ https://webmorph.org/
Figure 3: ROC curves generated on FRLL-Morphs, FERET-Morphs and FRGC-Morphs for the tested models. Note that HRNet
achieves very competitive results on FERET-Morphs and FRGC-Morphs, but performs the weakest on FRLL-Morphs. Xception,
on the other hand, consistently outperforms the baseline method MixFaceNet on all considered datasets.
4.2 Experimental setup
In our experiments, we first preprocess images from all
datasets by cropping out the facial areas. Bounding boxes
for SMDD are provided by the authors of the dataset.
For the other three databases, we use RetinaFace [6] to
localize the facial region of interest. Prior to their use,
cropped images are resized to 299 × 299 pixels for Xception
and 256 × 256 pixels for HRNet. Additionally, the
training data is augmented with horizontal flips to increase
the amount of data available and reduce the risk of overfitting.
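The cropping and flipping steps can be sketched on plain nested lists as follows. This is a minimal illustration under stated assumptions: the bounding-box format (left, top, right, bottom) and the function names are hypothetical, and the resizing step is omitted because it would require an image library.

```python
def crop(image, box):
    """Crop a row-major image (list of rows) to box = (left, top, right, bottom)."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def horizontal_flip(image):
    """Mirror an image left to right."""
    return [row[::-1] for row in image]


def augment(images):
    """Return the original images followed by their mirrored copies."""
    return images + [horizontal_flip(img) for img in images]


if __name__ == "__main__":
    toy = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
    face = crop(toy, (1, 0, 3, 2))  # hypothetical bounding box
    print(face)
    print(augment([face]))
```

Flipping doubles the number of training samples without changing the bona fide or attack label, since a mirrored face remains a plausible face.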
The CNNs were optimized using the Adam optimizer,
with a learning rate of 0.0001. The models were trained
from scratch for 30 full epochs, with a batch size of 16.
After each training epoch, the classification accuracy of
the networks was calculated on a small holdout set of
each test dataset. The best performing parameters on each
of the three datasets were saved as the final model for that
particular dataset. The code was implemented in Python
3.8 with PyTorch 1.9 and CUDA 11.6. Experiments were
run on a single GeForce GTX 1080 Ti. The computa-
tional complexity of Xception is 11 GFLOPs, while HR-
Net has 34 GFLOPs.
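The checkpoint selection described above can be sketched as follows. The hyperparameters in the dictionary are those stated in the text, whereas the function name, the data layout and the accuracy values are hypothetical and serve only to illustrate how the best epoch is chosen separately for each dataset.

```python
# Hyperparameters as stated in the text.
TRAINING_CONFIG = {
    "optimizer": "Adam",
    "learning_rate": 1e-4,
    "epochs": 30,
    "batch_size": 16,
}


def select_best_epochs(holdout_accuracy):
    """Map each dataset to the 1-based epoch with the highest holdout accuracy.

    holdout_accuracy: {dataset name: [accuracy after epoch 1, epoch 2, ...]}
    Ties are resolved in favour of the earliest epoch.
    """
    best = {}
    for dataset, accs in holdout_accuracy.items():
        best_idx = max(range(len(accs)), key=lambda i: accs[i])
        best[dataset] = best_idx + 1
    return best


if __name__ == "__main__":
    # Made-up accuracies for three epochs, for illustration only.
    history = {
        "FRLL-M": [0.80, 0.90, 0.85],
        "FERET-M": [0.70, 0.70, 0.60],
    }
    print(select_best_epochs(history))
```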
5 Results
In Figure 3 and Table 2 we present the results obtained
with our two models, Xception and HRNet, and the baseline
MixFaceNet-MAD from [8]. The weights of MixFaceNet-MAD,
optimized on the SMDD dataset, were provided
by the authors of the model. In our experiments,
the best overall results were achieved by Xception, whose
Equal Error Rates (EERs) are 3.26%, 8.25% and 9.75% on
FRLL-Morphs, FERET-Morphs and FRGC-Morphs,
respectively (Table 2). The runner-up, HRNet, achieves a
similar performance on FERET-Morphs and FRGC-Morphs.
However, among the tested models, HRNet is the least
successful on FRLL-Morphs, where it achieves an EER
of 13.73%. On this dataset, MixFaceNet yields a slightly
better performance than HRNet with an EER of 12.18%,
but is outperformed by both Xception and HRNet on
the other two databases, i.e. FERET-Morphs and FRGC-Morphs.
The complete ROC curves of the experiments
are visualized in Figure 3 and show a similar picture as
the discussed numerical results.
Table 2: Detection results for MAD methods MixFaceNet
(MFN), Xception (XN) and HRNet (HRN) on the real-world
datasets FRLL-Morphs (FRLL-M), FERET-Morphs (FERET-M)
and FRGC-Morphs (FRGC-M). Best scores per dataset are
marked blue, while runner-up results are marked orange. All
three models were trained on the synthetic SMDD dataset [3].
                                           BPCER (%) @ APCER =
MAD        Test data   AUC (%)   EER (%)   0.10%    1.00%    10.00%   20.00%
MFN [3]    FRLL-M      95.43     12.18     100.0    100.0    15.20    5.88
MFN [3]    FERET-M     94.27     10.65     100.0    100.0    11.75    6.51
MFN [3]    FRGC-M      91.42     16.36     100.0    64.86    25.89    14.02
XN [2]     FRLL-M      99.17     3.26      85.29    28.92    0.49     0.0
XN [2]     FERET-M     96.84     8.25      79.62    43.31    7.29     4.03
XN [2]     FRGC-M      96.63     9.75      58.19    35.11    9.44     4.23
HRN [24]   FRLL-M      92.79     13.73     100.00   42.84    18.65    11.12
HRN [24]   FERET-M     97.05     8.49      91.43    54.00    7.44     2.27
HRN [24]   FRGC-M      95.77     10.89     82.26    55.50    12.14    4.46
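The reported metrics can be computed from detection scores as sketched below. The definitions follow common usage in presentation attack detection and are stated here as assumptions, since the text does not spell them out: APCER is the proportion of attacks accepted as bona fide, BPCER is the proportion of bona fide samples rejected as attacks, and a higher score means "more likely an attack". The EER is approximated by scanning the observed scores as thresholds. All scores in the example are made up.

```python
def error_rates(bona_fide, attacks, threshold):
    """Return (APCER, BPCER) when scores >= threshold are flagged as attacks."""
    apcer = sum(s < threshold for s in attacks) / len(attacks)
    bpcer = sum(s >= threshold for s in bona_fide) / len(bona_fide)
    return apcer, bpcer


def equal_error_rate(bona_fide, attacks):
    """Approximate EER at the threshold where APCER and BPCER are closest."""
    candidates = sorted(set(bona_fide) | set(attacks))
    apcer, bpcer = min((error_rates(bona_fide, attacks, t) for t in candidates),
                       key=lambda r: abs(r[0] - r[1]))
    return (apcer + bpcer) / 2


def bpcer_at_apcer(bona_fide, attacks, max_apcer):
    """Lowest BPCER among thresholds whose APCER does not exceed max_apcer."""
    candidates = sorted(set(bona_fide) | set(attacks))
    rates = [error_rates(bona_fide, attacks, t) for t in candidates]
    return min(b for a, b in rates if a <= max_apcer)


if __name__ == "__main__":
    bf_scores = [0.1, 0.2, 0.3, 0.6]      # made-up bona fide scores
    attack_scores = [0.4, 0.7, 0.8, 0.9]  # made-up attack scores
    print(equal_error_rate(bf_scores, attack_scores))
    print(bpcer_at_apcer(bf_scores, attack_scores, 0.0))
```

Fixing the APCER at a low value, as in the last four columns of Table 2, shows how many genuine users would be rejected when the detector is tuned to let almost no attacks through.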
To better understand the differences between the evaluated
models, we additionally assess their performance
on only one face morphing technique at a time. As can be
seen from Figure 4, Xception shows the greatest generalization
capabilities when it comes to the detection of different
face morphing methods. HRNet provides very competitive
results on face morphs generated by OpenCV,
FaceMorpher and StyleGAN. Nevertheless, AMSL and
Webmorph attacks are too challenging for this model.
We hypothesize that this might be due to the structure of
the training data. Synthetic face morphs from SMDD are
generated using only one morphing method, i.e. OpenCV.
With such training data, models are in general at higher
risk of overfitting to one specific type of morph. Since
HRNet has far more trainable parameters than Xception
and MixFaceNet, it is also more prone to overfitting when
trained on smaller datasets like SMDD.
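The per-technique assessment can be sketched by grouping the attack scores by morphing method and computing one AUC per group against the same bona fide scores. The AUC is computed here through its rank interpretation, i.e. the probability that a random attack receives a higher score than a random bona fide sample. Function names, the score convention (higher means "attack") and all values are assumptions for illustration.

```python
from collections import defaultdict


def auc(bona_fide, attacks):
    """AUC as the fraction of (attack, bona fide) pairs ranked correctly; ties count 0.5."""
    wins = sum((a > b) + 0.5 * (a == b) for a in attacks for b in bona_fide)
    return wins / (len(attacks) * len(bona_fide))


def auc_per_morph_type(bona_fide, attack_scores):
    """attack_scores: iterable of (morph_type, score) pairs."""
    grouped = defaultdict(list)
    for morph_type, score in attack_scores:
        grouped[morph_type].append(score)
    return {m: auc(bona_fide, scores) for m, scores in grouped.items()}


if __name__ == "__main__":
    bf_scores = [0.1, 0.3]  # made-up bona fide scores
    attack_scores = [("OpenCV", 0.9), ("OpenCV", 0.8),
                     ("AMSL", 0.2), ("AMSL", 0.4)]
    print(auc_per_morph_type(bf_scores, attack_scores))
```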
Figure 4: ROC curves generated for the tested MAD models for different types of face morphs. Note that the performance of the
detectors differs considerably depending on the morphing procedure used.
6 Conclusion
In this paper, we tackle the privacy issues associated with
the datasets used for the development of face morphing
detection algorithms. To address related privacy concerns,
we explore the idea of using a training database with
faces of non-existing people, generated by StyleGAN.
Using this data, we train two different MAD models
and evaluate their performance on three commonly used
real-world datasets. Our experiments show that in general,
MAD models can be successfully trained on synthetic
data and generalize well to real-world scenarios.
References
[1] F. Boutros, N. Damer, M. Fang, F. Kirchbuchner, and A. Kuijper.
Mixfacenets: Extremely efficient face recognition networks. In
IEEE IJCB, pages 1–8, 2021.
[2] F. Chollet. Xception: Deep Learning with Depthwise Separable
Convolutions. In IEEE CVPR, pages 1800–1807, 2017.
[3] N. Damer, C. A. F. López, M. Fang, N. Spiller, M. V. Pham, and
F. Boutros. Privacy-Friendly Synthetic Data for the Development
of Face Morphing Attack Detectors. IEEE CVPRW, pages 1606–
1617, 2022.
[4] N. Damer, N. Spiller, M. Fang, F. Boutros, F. Kirchbuchner, and
A. Kuijper. PW-MAD: Pixel-Wise Supervision for Generalized
Face Morphing Attack Detection. In Advances in Visual Comput-
ing, pages 291–304. Springer International Publishing, 2021.
[5] L. DeBruine and B. Jones. Face Research Lab London Set, 2017.
[6] J. Deng, J. Guo, E. Ververas, I. Kotsia, and S. Zafeiriou. Reti-
naFace: Single-Shot Multi-Level Face Localisation in the Wild.
In IEEE CVPR, pages 5202–5211, 2020.
[7] K. Grm, V. Štruc, A. Artiges, M. Caron, and H. K. Ekenel.
Strengths and weaknesses of deep learning models for face recog-
nition against image degradations. IET Biometrics, 7(1):81–89,
2018.
[8] M. Huber, F. Boutros, A. Thi Luu, K. Raja, R. Ramachandra,
N. Damer, P. C. Neto, T. Goncalves, A. F. Sequeira, J. S. Car-
doso, T. Joao, M. Lourenc, S. Serra, E. Cermeno, M. Ivanovska,
B. Batagelj, A. Kronovsek, P. Peer, and V. Struc. SYN-MAD
2022: Competition on Face Morphing Attack Detection Based on
Privacy-aware Synthetic Training Data. In IEEE IJCB, 2022.
[9] T. Karras, M. Aittala, J. Hellsten, S. Laine, J. Lehtinen, and
T. Aila. Training Generative Adversarial Networks with Limited
Data. In NIPS, 2020.
[10] T. Karras, S. Laine, M. Aittala, J. Hellsten, J. Lehtinen, and
T. Aila. Analyzing and Improving the Image Quality of Style-
GAN. In IEEE/CVF CVPR, pages 8110–8119, 2020.
[11] A. Makrushin, C. Kraetzer, J. Dittmann, C. Seibold, A. Hilsmann,
and P. Eisert. Dempster-Shafer Theory for Fusing Face Morphing
Detectors. In EUSIPCO, pages 1–5, 2019.
[12] T. Neubert, A. Makrushin, M. Hildebrandt, C. Kraetzer, and
J. Dittmann. Extended StirTrace benchmarking of biometric
and forensic qualities of morphed face images. IET Biometrics,
7(4):325–332, 2018.
[13] T. Ojala, M. Pietikäinen, and D. Harwood. A comparative study
of texture measures with classification based on featured distribu-
tions. Pattern Recognition, 29(1):51–59, 1996.
[14] V. Ojansivu and J. Heikkilä. Blur Insensitive Texture Classifica-
tion Using Local Phase Quantization. In A. Elmoataz, O. Lezoray,
F. Nouboud, and D. Mammass, editors, Image and Signal Pro-
cessing, pages 236–243. Springer Berlin Heidelberg, 2008.
[15] P. Phillips, P. Flynn, T. Scruggs, K. Bowyer, J. Chang, K. Hoff-
man, J. Marques, J. Min, and W. Worek. Overview of the face
recognition grand challenge. In IEEE CVPR, volume 1, pages
947–954, 2005.
[16] P. J. Phillips, H. Wechsler, J. Huang, and P. J. Rauss. The
FERET database and evaluation procedure for face-recognition
algorithms. Image and Vision Computing, 16(5):295–306, 1998.
[17] R. Raghavendra, K. B. Raja, S. Venkatesh, and C. Busch. Trans-
ferable Deep-CNN Features for Detecting Digital and Print-
Scanned Morphed Face Images. In IEEE CVPRW, pages 1822–
1830, 2017.
[18] R. Ramachandra, S. Venkatesh, K. Raja, and C. Busch. Detect-
ing Face Morphing Attacks with Collaborative Representation of
Steerable Features. In CVIP, pages 255–265, 2020.
[19] E. Sarkar, P. Korshunov, L. Colbois, and S. Marcel. Vulnerability
Analysis of Face Morphing Attacks from Landmarks and Gener-
ative Adversarial Networks. 2020.
[20] U. Scherhag, L. Debiasi, C. Rathgeb, C. Busch, and A. Uhl.
Detection of Face Morphing Attacks Based on PRNU Analysis.
IEEE TBBIS, 1(4):302–317, 2019.
[21] C. Seibold, A. Hilsmann, and P. Eisert. Reflection Analysis for
Face Morphing Attack Detection. In EUSIPCO, pages 1022–
1026, 2018.
[22] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna. Re-
thinking the Inception Architecture for Computer Vision. CoRR,
2015.
[23] L. Wandzik, G. Kaeding, and R. V. Garcia. Morphing Detection
Using a General-Purpose Face Recognition System. In EUSIPCO,
pages 1012–1016, 2018.
[24] J. Wang, K. Sun, T. Cheng, B. Jiang, C. Deng, Y. Zhao,
D. Liu, Y. Mu, M. Tan, X. Wang, W. Liu, and B. Xiao. Deep
High-Resolution Representation Learning for Visual Recognition.
TPAMI, 2019.