Ivanovska, Marija; Kronovšek, Andrej; Peer, Peter; Štruc, Vitomir; Batagelj, Borut
In: Proceedings of ERK 2022, pp. 1-4, 2022.
Images of morphed faces pose a serious threat to face recognition-based security systems, as they can be used to illegally verify the identity of multiple people with a single morphed image. Modern detection algorithms learn to identify such morphing attacks using authentic images of real individuals. This approach raises various privacy concerns and limits the amount of publicly available training data. In this paper, we explore the efficacy of detection algorithms that are trained only on faces of non-existing people and their respective morphs. To this end, two dedicated algorithms are trained with synthetic data and then evaluated on three real-world datasets, i.e., FRLL-Morphs, FERET-Morphs and FRGC-Morphs. Our results show that synthetic facial images can be successfully employed for the training process of the detection algorithms and that the resulting detectors generalize well to real-world scenarios.
Križaj, Janez; Dobrišek, Simon; Štruc, Vitomir
In: Sensors, iss. 6, no. 2388, pp. 1-26, 2022.
Most commercially successful face recognition systems combine information from multiple sensors (2D and 3D, visible light and infrared, etc.) to achieve reliable recognition in various environments. When only a single sensor is available, the robustness as well as efficacy of the recognition process suffer. In this paper, we focus on face recognition using images captured by a single 3D sensor and propose a method based on the use of region covariance matrices and Gaussian mixture models (GMMs). All steps of the proposed framework are automated, and no metadata, such as pre-annotated eye, nose, or mouth positions, is required, while only a very simple clustering-based face detection is performed. The framework computes a set of region covariance descriptors from local regions of different face image representations and then uses the unscented transform to derive low-dimensional feature vectors, which are finally modeled by GMMs. In the last step, a support vector machine classification scheme is used to make a decision about the identity of the input 3D facial image. The proposed framework has several desirable characteristics, such as an inherent mechanism for data fusion/integration (through the region covariance matrices), the ability to explore facial images at different levels of locality, and the ability to integrate domain-specific prior knowledge into the modeling procedure. Several normalization techniques are incorporated into the proposed framework to further improve performance. Extensive experiments are performed on three prominent databases (FRGC v2, CASIA, and UMB-DB), yielding competitive results.
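As an illustration of the region covariance descriptor mentioned in the abstract above, here is a minimal sketch. It assumes each pixel in a region is described by a small feature vector, and the (x, y, depth) feature map is a hypothetical choice for illustration, not necessarily the one used in the paper. The unscented transform, GMM and SVM stages are not shown.

```python
from typing import List, Sequence


def region_covariance(features: Sequence[Sequence[float]]) -> List[List[float]]:
    """Sample covariance matrix of the per-pixel feature vectors of one region."""
    n = len(features)
    if n < 2:
        raise ValueError("need at least two feature vectors")
    d = len(features[0])
    mean = [sum(f[j] for f in features) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for f in features:
        diff = [f[j] - mean[j] for j in range(d)]
        for a in range(d):
            for b in range(d):
                cov[a][b] += diff[a] * diff[b]
    # Unbiased estimate: divide by n - 1.
    return [[c / (n - 1) for c in row] for row in cov]


def pixel_features(depth, x0, y0, x1, y1):
    """Hypothetical feature map: (x, y, depth) for every pixel with x0 <= x < x1, y0 <= y < y1."""
    return [
        (float(x), float(y), float(depth[y][x]))
        for y in range(y0, y1)
        for x in range(x0, x1)
    ]


# Toy 3x3 depth image; a real range image would be far larger.
depth = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0],
]
C = region_covariance(pixel_features(depth, 0, 0, 3, 3))
```

The descriptor size depends only on the number of features per pixel, not on the region size, which is one reason such descriptors are convenient for combining several image representations.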
Rot, Peter; Peer, Peter; Štruc, Vitomir
Detecting Soft-Biometric Privacy Enhancement Incollection
In: Rathgeb, Christian; Tolosana, Ruben; Vera-Rodriguez, Ruben; Busch, Christoph (Ed.): Handbook of Digital Face Manipulation and Detection, 2022.
Ivanovska, Marija; Štruc, Vitomir
In: Proceedings of ERK 2021, pp. 1–4, 2021.
Deepfakes or manipulated face images, where a donor's face is swapped with the face of a target person, have gained enormous popularity among the general public recently. With the advancements in artificial intelligence and generative modeling, such images can nowadays be easily generated and used to spread misinformation and harm individuals, businesses or society. As the tools for generating deepfakes are rapidly improving, it is critical for deepfake detection models to be able to recognize advanced, sophisticated data manipulations, including those that have not been seen during training. In this paper, we explore the use of one-class learning models as an alternative to discriminative methods for the detection of deepfakes. We conduct a comparative study with three popular deepfake datasets and investigate the performance of selected (discriminative and one-class) detection models in matched- and cross-dataset experiments. Our results show that discriminative models significantly outperform one-class models when training and testing data come from the same dataset, but degrade considerably when the characteristics of the testing data deviate from the training setting. In such cases, one-class models tend to generalize much better.
Grm, Klemen; Štruc, Vitomir
Frequency Band Encoding for Face Super-Resolution Inproceedings
In: Proceedings of ERK 2021, pp. 1-4, 2021.
In this paper, we present a novel method for face super-resolution based on an encoder-decoder architecture. Unlike previous approaches, which focused primarily on directly reconstructing the high-resolution face appearance from low-resolution images, our method relies on a multi-stage approach where we learn a face representation in different frequency bands, followed by decoding the representation into a high-resolution image. Using quantitative experiments, we demonstrate that this approach results in better face image reconstruction and also aids downstream semantic tasks such as face recognition and face verification.
Batagelj, Borut; Peer, Peter; Štruc, Vitomir; Dobrišek, Simon
In: Applied Sciences, vol. 11, no. 5, pp. 1-24, 2021, ISSN: 2076-3417.
The new Coronavirus disease (COVID-19) has seriously affected the world. By the end of November 2020, the global number of new coronavirus cases had already exceeded 60 million and the number of deaths 1,410,378 according to information from the World Health Organization (WHO). To limit the spread of the disease, mandatory face-mask rules are now becoming common in public settings around the world. Additionally, many public service providers require customers to wear face-masks in accordance with predefined rules (e.g., covering both mouth and nose) when using public services. These developments inspired research into automatic (computer-vision-based) techniques for face-mask detection that can help monitor public behavior and contribute towards constraining the COVID-19 pandemic. Although existing research in this area resulted in efficient techniques for face-mask detection, these usually operate under the assumption that modern face detectors provide perfect detection performance (even for masked faces) and that the main goal of the techniques is to detect the presence of face-masks only. In this study, we revisit these common assumptions and explore the following research questions: (i) How well do existing face detectors perform with masked-face images? (ii) Is it possible to detect a proper (regulation-compliant) placement of facial masks? and (iii) How useful are existing face-mask detection techniques for monitoring applications during the COVID-19 pandemic? To answer these and related questions we conduct a comprehensive experimental evaluation of several recent face detectors for their performance with masked-face images. Furthermore, we investigate the usefulness of multiple off-the-shelf deep-learning models for recognizing correct face-mask placement. 
Finally, we design a complete pipeline for recognizing whether face-masks are worn correctly or not and compare the performance of the pipeline with standard face-mask detection models from the literature. To facilitate the study, we compile a large dataset of facial images from the publicly available MAFA and Wider Face datasets and annotate it with compliant and non-compliant labels. The annotated dataset, called Face-Mask-Label Dataset (FMLD), is made publicly available to the research community.
Grm, Klemen; Scheirer, Walter J.; Štruc, Vitomir
In: IEEE Transactions on Image Processing, 2020.
In this paper we address the problem of hallucinating high-resolution facial images from low-resolution inputs at high magnification factors. We approach this task with convolutional neural networks (CNNs) and propose a novel (deep) face hallucination model that incorporates identity priors into the learning procedure. The model consists of two main parts: i) a cascaded super-resolution network that upscales the low-resolution facial images, and ii) an ensemble of face recognition models that act as identity priors for the super-resolution network during training. Unlike most competing super-resolution techniques, which rely on a single model for upscaling (even with large magnification factors), our network uses a cascade of multiple SR models that progressively upscale the low-resolution images using steps of 2×. This characteristic allows us to apply supervision signals (target appearances) at different resolutions and incorporate identity constraints at multiple scales. The proposed C-SRIP model (Cascaded Super Resolution with Identity Priors) is able to upscale (tiny) low-resolution images captured in unconstrained conditions and produce visually convincing results for diverse low-resolution inputs. We rigorously evaluate the proposed model on the Labeled Faces in the Wild (LFW), Helen and CelebA datasets and report superior performance compared to the existing state-of-the-art.
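The cascade in steps of 2× described in the abstract above can be pictured with the following rough sketch. It is not the C-SRIP model: the learned SR stages are replaced by plain nearest-neighbour pixel replication as a stated placeholder, and the names `num_stages`, `upscale_2x` and `cascade` are hypothetical. It only shows how an 8× magnification decomposes into three 2× stages whose intermediate outputs are available for supervision at each resolution.

```python
def num_stages(factor: int) -> int:
    """Number of 2x stages needed for a power-of-two magnification factor."""
    if factor < 1 or factor & (factor - 1):
        raise ValueError("factor must be a power of two")
    return factor.bit_length() - 1


def upscale_2x(img):
    """Placeholder 2x stage: nearest-neighbour replication (stands in for a learned SR model)."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(2)]  # duplicate each pixel horizontally
        out.append(wide)
        out.append(list(wide))                     # duplicate the row vertically
    return out


def cascade(img, factor):
    """Apply 2x stages repeatedly and keep every intermediate output."""
    outputs = []
    for _ in range(num_stages(factor)):
        img = upscale_2x(img)
        outputs.append(img)
    return outputs


lr = [[0, 1], [2, 3]]             # toy 2x2 "image"
stages = cascade(lr, 8)           # 8x magnification via three 2x stages
sizes = [(len(s), len(s[0])) for s in stages]
```

Keeping the intermediate outputs is what would make it possible to attach a loss at every scale, as the abstract describes for supervision signals and identity constraints.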
Grm, Klemen; Pernuš, Martin; Cluzel, Leo; Scheirer, Walter J.; Dobrišek, Simon; Štruc, Vitomir
In: IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019.
Contemporary face hallucination (FH) models exhibit considerable ability to reconstruct high-resolution (HR) details from low-resolution (LR) face images. This ability is commonly learned from examples of corresponding HR-LR image pairs, created by artificially down-sampling the HR ground truth data. This down-sampling (or degradation) procedure not only defines the characteristics of the LR training data, but also determines the type of image degradations the learned FH models are eventually able to handle. If the image characteristics encountered with real-world LR images differ from the ones seen during training, FH models are still expected to perform well, but in practice may not produce the desired results. In this paper we study this problem and explore the bias introduced into FH models by the characteristics of the training data. We systematically analyze the generalization capabilities of several FH models in various scenarios where the degradation function does not match the training setup and conduct experiments with synthetically degraded as well as real-life low-quality images. We make several interesting findings that provide insight into existing problems with FH models and point to future research directions.
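To make the notion of artificially created HR-LR training pairs concrete, the sketch below builds an LR image from an HR image with one specific degradation function, block averaging. The choice of block averaging and the name `downsample_box` are assumptions for illustration only; the abstract does not state which degradation functions the paper studies.

```python
def downsample_box(hr, factor):
    """Down-sample a 2D image by averaging non-overlapping factor x factor blocks."""
    h, w = len(hr), len(hr[0])
    if factor < 1 or h % factor or w % factor:
        raise ValueError("image size must be divisible by the factor")
    lr = []
    for y in range(0, h, factor):
        row = []
        for x in range(0, w, factor):
            block = [hr[y + dy][x + dx] for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        lr.append(row)
    return lr


hr = [
    [0, 2, 4, 6],
    [2, 4, 6, 8],
    [4, 6, 8, 10],
    [6, 8, 10, 12],
]
lr = downsample_box(hr, 2)   # (hr, lr) forms one synthetic training pair
```

A model trained only on pairs produced this way has seen a single degradation type, which is the kind of training-data bias the abstract refers to.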
Meden, Blaž; Peer, Peter; Štruc, Vitomir
In: 2018 IEEE International Work Conference on Bioinspired Intelligence (IWOBI), pp. 1–7, IEEE 2018.
Privacy is a highly debated topic in the modern technological era. With the advent of massive video and image data (which in a lot of cases contains personal information on the recorded subjects), there is an imminent need for efficient privacy protection mechanisms. To this end, we develop in this work a novel Face Deidentification Network (FaDeNet) that is able to alter the input faces in such a way that automated recognition systems fail to recognize the subjects in the images, while this is still possible for human observers. FaDeNet is based on an encoder-decoder architecture that is trained to auto-encode the input image, while (at the same time) minimizing the recognition performance of a secondary network that is used as a so-called identity critic in FaDeNet. We present experiments on the Radboud Faces Dataset and observe encouraging results.
Grm, Klemen; Štruc, Vitomir
Deep face recognition for surveillance applications Journal Article
In: IEEE Intelligent Systems, vol. 33, no. 3, pp. 46–50, 2018.
Automated person recognition from surveillance quality footage is an open research problem with many potential application areas. In this paper, we aim at addressing this problem by presenting a face recognition approach tailored towards surveillance applications. The presented approach is based on domain-adapted convolutional neural networks and ranked second in the International Challenge on Biometric Recognition in the Wild (ICB-RW) 2016. We evaluate the performance of the presented approach on part of the Quis-Campi dataset and compare it against several existing face recognition techniques and one (state-of-the-art) commercial system. We find that the domain-adapted convolutional network outperforms all other assessed techniques, but is still inferior to human performance.
Meden, Blaž; Emeršič, Žiga; Štruc, Vitomir; Peer, Peter
In: Entropy, vol. 20, no. 1, pp. 60, 2018.
Image and video data are today being shared between government entities and other relevant stakeholders on a regular basis and require careful handling of the personal information contained therein. A popular approach to ensure privacy protection in such data is the use of deidentification techniques, which aim at concealing the identity of individuals in the imagery while still preserving certain aspects of the data after deidentification. In this work, we propose a novel approach towards face deidentification, called k-Same-Net, which combines recent Generative Neural Networks (GNNs) with the well-known k-Anonymity mechanism and provides formal guarantees regarding privacy protection on a closed set of identities. Our GNN is able to generate synthetic surrogate face images for deidentification by seamlessly combining features of identities used to train the GNN model. Furthermore, it allows us to control the image-generation process with a small set of appearance-related parameters that can be used to alter specific aspects (e.g., facial expressions, age, gender) of the synthesized surrogate images. We demonstrate the feasibility of k-Same-Net in comprehensive experiments on the XM2VTS and CK+ datasets. We evaluate the efficacy of the proposed approach through reidentification experiments with recent recognition models and compare our results with competing deidentification techniques from the literature. We also present facial expression recognition experiments to demonstrate the utility-preservation capabilities of k-Same-Net. Our experimental results suggest that k-Same-Net is a viable option for facial deidentification that exhibits several desirable characteristics when compared to existing solutions in this area.
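The k-anonymity idea behind the abstract above can be sketched as follows: identities in a closed set are partitioned into groups of at least k members, and every member of a group is replaced by the same surrogate, so a surrogate cannot be traced back to fewer than k identities. This is an assumption-laden illustration, not k-Same-Net itself. The surrogate here is a simple feature average rather than a face synthesized by a generative network, the grouping is by sorted identifier rather than by similarity, and the names `k_same_groups` and `surrogates` are hypothetical.

```python
def k_same_groups(ids, k):
    """Partition a list of identities into groups with at least k members each."""
    if k < 1 or len(ids) < k:
        raise ValueError("need at least k identities")
    groups = [ids[i:i + k] for i in range(0, len(ids), k)]
    # Fold a short trailing group into the previous one so no group is smaller than k.
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    return groups


def surrogates(features, k):
    """Map every identity to the shared surrogate (feature mean) of its group."""
    mapping = {}
    for group in k_same_groups(sorted(features), k):
        dim = len(features[group[0]])
        mean = tuple(sum(features[i][j] for i in group) / len(group) for j in range(dim))
        for identity in group:
            mapping[identity] = mean
    return mapping


# Toy closed set of five identities with 2D feature vectors.
features = {
    "a": (0.0, 0.0),
    "b": (2.0, 0.0),
    "c": (4.0, 4.0),
    "d": (6.0, 4.0),
    "e": (8.0, 8.0),
}
result = surrogates(features, 2)
```

With k = 2 the five identities fall into the groups {a, b} and {c, d, e}, so each surrogate is shared by at least two identities.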
Grm, Klemen; Dobrišek, Simon; Štruc, Vitomir
In: Proceedings of the Twenty-sixth International Electrotechnical and Computer Science Conference ERK 2017, 2017.
With recent advancements in deep learning and convolutional neural networks (CNNs), face recognition has seen significant performance improvements over the last few years. However, low-resolution images still remain challenging, with CNNs performing relatively poorly compared to humans. One possibility to improve performance in these settings, often advocated in the literature, is the use of super-resolution (SR). In this paper, we explore the usefulness of SR algorithms for cross-resolution face recognition in experiments on the Labeled Faces in the Wild (LFW) and SCface datasets using four recent deep CNN models. We conduct experiments with synthetically down-sampled images as well as real-life low-resolution imagery captured by surveillance cameras. Our experiments show that image super-resolution can improve face recognition performance considerably on very low-resolution images (of size 24 × 24 or 32 × 32 pixels) when images are artificially down-sampled, but has a lesser (or sometimes even a detrimental) effect with real-life images, leaving significant room for further research in this area.
Meden, Blaž; Malli, Refik Can; Fabijan, Sebastjan; Ekenel, Hazim Kemal; Štruc, Vitomir; Peer, Peter
Face deidentification with generative deep neural networks Journal Article
In: IET Signal Processing, vol. 11, no. 9, pp. 1046–1054, 2017.
Face deidentification is an active topic amongst privacy and security researchers. Early deidentification methods relying on image blurring or pixelisation have been replaced in recent years with techniques based on formal anonymity models that provide privacy guarantees and retain certain characteristics of the data even after deidentification. The latter aspect is important, as it allows the deidentified data to be used in applications for which identity information is irrelevant. In this work, the authors present a novel face deidentification pipeline, which ensures anonymity by synthesising artificial surrogate faces using generative neural networks (GNNs). The generated faces are used to deidentify subjects in images or videos, while preserving non-identity-related aspects of the data and consequently enabling data utilisation. Since generative networks are highly adaptive and can utilise diverse parameters (pertaining to the appearance of the generated output in terms of facial expressions, gender, race, etc.), they represent a natural choice for the problem of face deidentification. To demonstrate the feasibility of the authors’ approach, they perform experiments using automated recognition tools and human annotators. Their results show that the recognition performance on deidentified images is close to chance, suggesting that the deidentification process based on GNNs is effective.
Meden, Blaž; Emeršič, Žiga; Štruc, Vitomir; Peer, Peter
k-Same-Net: Neural-Network-Based Face Deidentification Inproceedings
In: 2017 International Conference and Workshop on Bioinspired Intelligence (IWOBI), pp. 1–7, IEEE 2017.
An increasing amount of video and image data is being shared between government entities and other relevant stakeholders and requires careful handling of personal information. A popular approach for privacy protection in such data is the use of deidentification techniques, which aim at concealing the identity of individuals in the imagery while still preserving certain aspects of the data after deidentification. In this work, we propose a novel approach towards face deidentification, called k-Same-Net, which combines recent generative neural networks (GNNs) with the well-known k-anonymity mechanism and provides formal guarantees regarding privacy protection on a closed set of identities. Our GNN is able to generate synthetic surrogate face images for deidentification by seamlessly combining features of identities used to train the GNN model. Furthermore, it allows us to guide the image-generation process with a small set of appearance-related parameters that can be used to alter specific aspects (e.g., facial expressions, age, gender) of the synthesized surrogate images. We demonstrate the feasibility of k-Same-Net in comparative experiments with competing techniques on the XM2VTS dataset and discuss the main characteristics of our approach.