Grm, Klemen; Štruc, Vitomir
Frequency Band Encoding for Face Super-Resolution
In: Proceedings of ERK 2021, pp. 1-4, 2021.
In this paper, we present a novel method for face super-resolution based on an encoder-decoder architecture. Unlike previous approaches, which focused primarily on directly reconstructing the high-resolution face appearance from low-resolution images, our method relies on a multi-stage approach in which we first learn a face representation in different frequency bands and then decode that representation into a high-resolution image. Quantitative experiments demonstrate that this approach yields better face image reconstruction and also aids downstream semantic tasks such as face recognition and face verification.
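As a rough illustration of what "different frequency bands" can mean for an image, the sketch below splits a grayscale image into low, middle and high bands with radial masks in the Fourier domain. The abstract does not say how the paper defines or learns its bands, so the function name `split_frequency_bands`, the band edges and the FFT-based decomposition are all assumptions made for this example, not the authors' method.

```python
import numpy as np

def split_frequency_bands(image: np.ndarray, edges: list[float]) -> list[np.ndarray]:
    """Split a 2-D image into radial frequency bands (hypothetical helper).

    `edges` are band boundaries in cycles per pixel; they are illustrative
    values, not taken from the paper.
    """
    h, w = image.shape
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequencies
    fx = np.fft.fftfreq(w)[None, :]   # horizontal frequencies
    radius = np.sqrt(fx ** 2 + fy ** 2)

    spectrum = np.fft.fft2(image)
    bounds = [0.0, *edges, np.inf]
    bands = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        mask = (radius >= lo) & (radius < hi)   # masks partition the spectrum
        bands.append(np.fft.ifft2(spectrum * mask).real)
    return bands

rng = np.random.default_rng(0)
image = rng.random((32, 32))          # stand-in for a face image
bands = split_frequency_bands(image, edges=[0.1, 0.25])

# Because the masks do not overlap and cover every frequency,
# the bands add back up to the original image.
reconstruction = sum(bands)
print(len(bands), np.allclose(reconstruction, image))
```

Because the bands sum to the input, a model could in principle handle coarse structure and fine detail separately and still recover a full image when the parts are recombined.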
Grm, Klemen; Scheirer, Walter J.; Štruc, Vitomir
In: IEEE Transactions on Image Processing, 2020.
In this paper we address the problem of hallucinating high-resolution facial images from low-resolution inputs at high magnification factors. We approach this task with convolutional neural networks (CNNs) and propose a novel (deep) face hallucination model that incorporates identity priors into the learning procedure. The model consists of two main parts: i) a cascaded super-resolution network that upscales the low-resolution facial images, and ii) an ensemble of face recognition models that act as identity priors for the super-resolution network during training. Unlike most competing super-resolution techniques, which rely on a single model for upscaling (even with large magnification factors), our network uses a cascade of multiple SR models that progressively upscale the low-resolution images in steps of 2×. This characteristic allows us to apply supervision signals (target appearances) at different resolutions and incorporate identity constraints at multiple scales. The proposed C-SRIP model (Cascaded Super Resolution with Identity Priors) is able to upscale (tiny) low-resolution images captured in unconstrained conditions and produce visually convincing results for diverse low-resolution inputs. We rigorously evaluate the proposed model on the Labeled Faces in the Wild (LFW), Helen and CelebA datasets and report superior performance compared to the existing state-of-the-art.
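The cascade idea can be sketched in a few lines: each stage doubles the resolution, and every intermediate output can be compared with a target at the same scale. In the sketch below, nearest-neighbour upsampling stands in for a learned SR stage, and the 24 × 24 input size, the three stages and the mean-squared-error loss are assumptions chosen for illustration. The identity priors described in the abstract are not modelled here.

```python
import numpy as np

def upscale_2x(image: np.ndarray) -> np.ndarray:
    """Placeholder for one learned SR stage: nearest-neighbour 2x upsampling."""
    return image.repeat(2, axis=0).repeat(2, axis=1)

def downscale_2x(image: np.ndarray) -> np.ndarray:
    """Average-pool by 2 to build a lower-resolution supervision target."""
    h, w = image.shape
    return image.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def cascade(image: np.ndarray, num_stages: int) -> list[np.ndarray]:
    """Run the 2x stage repeatedly and keep every intermediate output."""
    outputs, current = [], image
    for _ in range(num_stages):
        current = upscale_2x(current)
        outputs.append(current)
    return outputs

def multiscale_loss(outputs: list[np.ndarray], hr_target: np.ndarray) -> float:
    """Sum of per-scale MSE terms, one supervision signal per stage."""
    loss, target = 0.0, hr_target
    for out in reversed(outputs):        # start at the finest scale
        loss += float(np.mean((out - target) ** 2))
        target = downscale_2x(target)    # target for the next coarser stage
    return loss

rng = np.random.default_rng(0)
hr = rng.random((192, 192))                           # stand-in HR image
lr = downscale_2x(downscale_2x(downscale_2x(hr)))     # 24 x 24 input

stages = cascade(lr, num_stages=3)
print([s.shape for s in stages])   # [(48, 48), (96, 96), (192, 192)]
print(multiscale_loss(stages, hr))
```

Three 2× stages give an overall magnification of 8×, and the loss has one term per stage, which is what supervising at different resolutions amounts to in this simplified setting.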
Grm, Klemen; Pernuš, Martin; Cluzel, Leo; Scheirer, Walter J.; Dobrišek, Simon; Štruc, Vitomir
In: IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019.
Contemporary face hallucination (FH) models exhibit considerable ability to reconstruct high-resolution (HR) details from low-resolution (LR) face images. This ability is commonly learned from examples of corresponding HR-LR image pairs, created by artificially down-sampling the HR ground truth data. This down-sampling (or degradation) procedure not only defines the characteristics of the LR training data, but also determines the type of image degradations the learned FH models are eventually able to handle. If the image characteristics encountered with real-world LR images differ from the ones seen during training, FH models are still expected to perform well, but in practice may not produce the desired results. In this paper we study this problem and explore the bias introduced into FH models by the characteristics of the training data. We systematically analyze the generalization capabilities of several FH models in various scenarios where the degradation function does not match the training setup and conduct experiments with synthetically downgraded as well as real-life low-quality images. We make several interesting findings that provide insight into existing problems with FH models and point to future research directions.
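To make the training-data bias concrete, the sketch below builds LR images from the same HR image with two different degradation functions. Both functions, their names and their parameters (the factor of 8, the noise level) are assumptions for illustration; the abstract does not specify which degradations the study uses.

```python
import numpy as np

def degrade_average(hr: np.ndarray, factor: int) -> np.ndarray:
    """Training-style degradation (assumed): block averaging, then decimation."""
    h, w = hr.shape
    return hr.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def degrade_subsample_noise(hr: np.ndarray, factor: int, noise_std: float,
                            rng: np.random.Generator) -> np.ndarray:
    """Mismatched degradation (assumed): plain subsampling plus sensor-like noise."""
    lr = hr[::factor, ::factor]
    noisy = lr + rng.normal(0.0, noise_std, size=lr.shape)
    return np.clip(noisy, 0.0, 1.0)

rng = np.random.default_rng(0)
hr = rng.random((128, 128))          # stand-in HR image with values in [0, 1]

lr_train = degrade_average(hr, factor=8)
lr_test = degrade_subsample_noise(hr, factor=8, noise_std=0.05, rng=rng)

# Same HR source and same LR size, but different pixel statistics.
mismatch = float(np.mean(np.abs(lr_train - lr_test)))
print(lr_train.shape, lr_test.shape, mismatch > 0.0)
```

A model trained only on pairs produced by the first function never sees inputs like those from the second, which is the kind of train/test mismatch the paper analyzes.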
Grm, Klemen; Dobrišek, Simon; Štruc, Vitomir
In: Proceedings of the Twenty-sixth International Electrotechnical and Computer Science Conference ERK 2017, 2017.
With recent advancements in deep learning and convolutional neural networks (CNNs), face recognition has seen significant performance improvements over the last few years. However, low-resolution images remain challenging, with CNNs performing relatively poorly compared to humans. One approach often advocated in the literature for improving performance in these settings is super-resolution (SR). In this paper, we explore the usefulness of SR algorithms for cross-resolution face recognition in experiments on the Labeled Faces in the Wild (LFW) and SCface datasets using four recent deep CNN models. We conduct experiments with synthetically down-sampled images as well as real-life low-resolution imagery captured by surveillance cameras. Our experiments show that image super-resolution can improve face recognition performance considerably on very low-resolution images (of size 24 × 24 or 32 × 32 pixels) when images are artificially down-sampled, but has a lesser (or sometimes even a detrimental) effect with real-life images, leaving significant room for further research in this area.