Publications – Laboratory for Machine Intelligence

Hrovatič, Anja; Peer, Peter; Štruc, Vitomir; Emeršič, Žiga

Efficient ear alignment using a two-stack hourglass network Journal Article

In: IET Biometrics, pp. 1–14, 2023, ISSN: 2047-4938.

Tags: biometrics, CNN, deep learning, ear, ear alignment, ear recognition

@article{UhljiIETZiga,

title = {Efficient ear alignment using a two-stack hourglass network},

author = {Anja Hrovatič and Peter Peer and Vitomir Štruc and Žiga Emeršič},

url = {https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/bme2.12109},

doi = {10.1049/bme2.12109},

issn = {2047-4938},

year  = {2023},

date = {2023-01-01},

journal = {IET Biometrics},

pages = {1--14},

abstract = {Ear images have been shown to be a reliable modality for biometric recognition with desirable characteristics, such as high universality, distinctiveness, measurability and permanence. While a considerable amount of research has been directed towards ear recognition techniques, the problem of ear alignment is still under-explored in the open literature. Nonetheless, accurate alignment of ear images, especially in unconstrained acquisition scenarios, where the ear appearance is expected to vary widely due to pose and view point variations, is critical for the performance of all downstream tasks, including ear recognition. Here, the authors address this problem and present a framework for ear alignment that relies on a two-step procedure: (i) automatic landmark detection and (ii) fiducial point alignment. For the first (landmark detection) step, the authors implement and train a Two-Stack Hourglass model (2-SHGNet) capable of accurately predicting 55 landmarks on diverse ear images captured in uncontrolled conditions. For the second (alignment) step, the authors use the Random Sample Consensus (RANSAC) algorithm to align the estimated landmark/fiducial points with a pre-defined ear shape (i.e. a collection of average ear landmark positions). The authors evaluate the proposed framework in comprehensive experiments on the AWEx and ITWE datasets and show that the 2-SHGNet model leads to more accurate landmark predictions than competing state-of-the-art models from the literature. Furthermore, the authors also demonstrate that the alignment step significantly improves recognition accuracy with ear images from unconstrained environments compared to unaligned imagery.},

keywords = {biometrics, CNN, deep learning, ear, ear alignment, ear recognition},

pubstate = {published},

tppubtype = {article}

}

Emeršič, Žiga; Sušanj, Diego; Meden, Blaž; Peer, Peter; Štruc, Vitomir

ContexedNet: Context-Aware Ear Detection in Unconstrained Settings Journal Article

In: IEEE Access, pp. 1–17, 2021, ISSN: 2169-3536.

Tags: biometrics, contextual information, deep learning, ear detection, ear recognition, ear segmentation, neural networks, segmentation

@article{ContexedNet_Emersic_2021,

title = {ContexedNet: Context-Aware Ear Detection in Unconstrained Settings},

author = {Žiga Emeršič and Diego Sušanj and Blaž Meden and Peter Peer and Vitomir Štruc},

url = {https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9583244},

issn = {2169-3536},

year  = {2021},

date = {2021-10-20},

urldate = {2021-10-20},

journal = {IEEE Access},

pages = {1--17},

abstract = {Ear detection represents one of the key components of contemporary ear recognition systems. While significant progress has been made in the area of ear detection over recent years, most of the improvements are direct results of advances in the field of visual object detection. Only a limited number of techniques presented in the literature are domain-specific and designed explicitly with ear detection in mind. In this paper, we aim to address this gap and present a novel detection approach that does not rely only on general ear (object) appearance, but also exploits contextual information, i.e., face-part locations, to ensure accurate and robust ear detection with images captured in a wide variety of imaging conditions. The proposed approach is based on a Context-aware Ear Detection Network (ContexedNet) and poses ear detection as a semantic image segmentation problem. ContexedNet consists of two processing paths: 1) a context-provider that extracts probability maps corresponding to the locations of facial parts from the input image, and 2) a dedicated ear segmentation model that integrates the computed probability maps into a context-aware segmentation-based ear detection procedure. ContexedNet is evaluated in rigorous experiments on the AWE and UBEAR datasets and shown to ensure competitive performance when evaluated against state-of-the-art ear detection models from the literature. Additionally, because the proposed contextualization is model agnostic, it can also be utilized with other ear detection techniques to improve performance.},

keywords = {biometrics, contextual information, deep learning, ear detection, ear recognition, ear segmentation, neural networks, segmentation},

pubstate = {published},

tppubtype = {article}

}

Štepec, Dejan; Emeršič, Žiga; Peer, Peter; Štruc, Vitomir

Constellation-Based Deep Ear Recognition Book Section

In: Jiang, R.; Li, CT.; Crookes, D.; Meng, W.; Rosenberger, C. (Ed.): Deep Biometrics: Unsupervised and Semi-Supervised Learning, Springer, 2020, ISBN: 978-3-030-32582-4.

Tags: biometrics, CNN, deep learning, ear recognition, neural networks

Emeršič, Žiga; Križaj, Janez; Štruc, Vitomir; Peer, Peter

Deep ear recognition pipeline Book Section

In: Hassaballah, Mahmoud; Hosny, Khalid M. (Ed.): Recent advances in computer vision: theories and applications, vol. 804, Springer, 2019, ISSN: 1860-9503.

Tags: ear, ear recognition, pipeline

Emeršič, Žiga; Meden, Blaž; Peer, Peter; Štruc, Vitomir

Evaluation and analysis of ear recognition models: performance, complexity and resource requirements Journal Article

In: Neural Computing and Applications, pp. 1–16, 2018, ISSN: 0941-0643.

Tags: AWE, AWEx, descriptor methods, ear recognition, extended annotated web ears dataset

@article{emervsivc2018evaluation,

title = {Evaluation and analysis of ear recognition models: performance, complexity and resource requirements},

author = {Žiga Emeršič and Blaž Meden and Peter Peer and Vitomir Štruc},

url = {https://rdcu.be/Os7a},

doi = {10.1007/s00521-018-3530-1},

issn = {0941-0643},

year  = {2018},

date = {2018-05-01},

journal = {Neural Computing and Applications},

pages = {1--16},

publisher = {Springer},

abstract = {Ear recognition technology has long been dominated by (local) descriptor-based techniques due to their formidable recognition performance and robustness to various sources of image variability. While deep-learning-based techniques have started to appear in this field only recently, they have already shown potential for further boosting the performance of ear recognition technology and dethroning descriptor-based methods as the current state of the art. However, while recognition performance is often the key factor when selecting recognition models for biometric technology, it is equally important that the behavior of the models is understood and their sensitivity to different covariates is known and well explored. Other factors, such as the train- and test-time complexity or resource requirements, are also paramount and need to be considered when designing recognition systems. To explore these issues, we present in this paper a comprehensive analysis of several descriptor- and deep-learning-based techniques for ear recognition. Our goal is to discover weak points of contemporary techniques, study the characteristics of the existing technology and identify open problems worth exploring in the future. We conduct our analysis through identification experiments on the challenging Annotated Web Ears (AWE) dataset and report our findings. The results of our analysis show that the presence of accessories and high degrees of head movement significantly impact the identification performance of all types of recognition models, whereas mild degrees of the listed factors and other covariates such as gender and ethnicity impact the identification performance only to a limited extent.
From a test-time-complexity point of view, the results suggest that lightweight deep models can be equally fast as descriptor-based methods given appropriate computing hardware, but require significantly more resources during training, where descriptor-based methods have a clear advantage. As an additional contribution, we also introduce a novel dataset of ear images, called AWE Extended (AWEx), which we collected from the web for the training of the deep models used in our experiments. AWEx contains 4104 images of 346 subjects and represents one of the largest and most challenging (publicly available) datasets of unconstrained ear images at the disposal of the research community.},

keywords = {AWE, AWEx, descriptor methods, ear recognition, extended annotated web ears dataset},

pubstate = {published},

tppubtype = {article}

}

Emeršič, Žiga; Playa, Nil Oleart; Štruc, Vitomir; Peer, Peter

Towards Accessories-Aware Ear Recognition Proceedings Article

In: 2018 IEEE International Work Conference on Bioinspired Intelligence (IWOBI), pp. 1–8, IEEE 2018.

Tags: accessories, biometrics, ear recognition

@inproceedings{emervsivc2018towards,

title = {Towards Accessories-Aware Ear Recognition},

author = {Žiga Emeršič and Nil Oleart Playa and Vitomir Štruc and Peter Peer},

url = {https://lmi.fe.uni-lj.si/wp-content/uploads/2019/08/iwobi-2018-inpaint-1.pdf},

doi = {10.1109/IWOBI.2018.8464138},

year  = {2018},

date = {2018-03-01},

booktitle = {2018 IEEE International Work Conference on Bioinspired Intelligence (IWOBI)},

pages = {1--8},

organization = {IEEE},

abstract = {Automatic ear recognition is gaining popularity within the research community due to numerous desirable properties, such as high recognition performance, the possibility of capturing ear images at a distance and in a covert manner, etc. Despite this popularity and the corresponding research effort that is being directed towards ear recognition technology, open problems still remain. One of the most important issues stopping ear recognition systems from being widely available is the presence of ear occlusions and accessories. Ear accessories not only mask biometric features and thereby reduce the overall recognition performance, but also introduce new non-biometric features that can be exploited for spoofing purposes. Ignoring ear accessories during recognition can, therefore, present a security threat to ear recognition and also adversely affect performance. Despite the importance of this topic there have been, to the best of our knowledge, no ear recognition studies that address these problems. In this work we try to close this gap and study the impact of ear accessories on the recognition performance of several state-of-the-art ear recognition techniques. We consider ear accessories as a tool for spoofing attacks and show that CNN-based recognition approaches are more susceptible to spoofing attacks than traditional descriptor-based approaches. Furthermore, we demonstrate that using inpainting techniques or average coloring can mitigate the problems caused by ear accessories and slightly outperform the (standard) black color used to mask ear accessories.},

keywords = {accessories, biometrics, ear recognition},

pubstate = {published},

tppubtype = {inproceedings}

}

Emeršič, Žiga; Štepec, Dejan; Štruc, Vitomir; Peer, Peter; George, Anjith; Ahmad, Adil; Omar, Elshibani; Boult, Terrance E.; Safdari, Reza; Zhou, Yuxiang; Zafeiriou, Stefanos; Yaman, Dogucan; Eyiokur, Fevziye I.; Ekenel, Hazim K.

The unconstrained ear recognition challenge Proceedings Article

In: 2017 IEEE International Joint Conference on Biometrics (IJCB), pp. 715–724, IEEE 2017.

Tags: biometrics, competition, ear recognition, IJCB, uerc, unconstrained ear recognition challenge

Emeršič, Žiga; Štepec, Dejan; Štruc, Vitomir; Peer, Peter

Training convolutional neural networks with limited training data for ear recognition in the wild Proceedings Article

In: IEEE International Conference on Automatic Face and Gesture Recognition, Workshop on Biometrics in the Wild 2017, 2017.

Tags: CNN, convolutional neural networks, ear, ear recognition, limited data, model learning

@inproceedings{emervsivc2017training,

title = {Training convolutional neural networks with limited training data for ear recognition in the wild},

author = {Žiga Emeršič and Dejan Štepec and Vitomir Štruc and Peter Peer},

url = {https://arxiv.org/pdf/1711.09952.pdf},

year  = {2017},

date = {2017-05-01},

booktitle = {IEEE International Conference on Automatic Face and Gesture Recognition, Workshop on Biometrics in the Wild 2017},

journal = {arXiv preprint arXiv:1711.09952},

abstract = {Identity recognition from ear images is an active field of research within the biometric community. The ability to capture ear images from a distance and in a covert manner makes ear recognition technology an appealing choice for surveillance and security applications as well as related application domains. In contrast to other biometric modalities, where large datasets captured in uncontrolled settings are readily available, datasets of ear images are still limited in size and mostly of laboratory-like quality. As a consequence, ear recognition technology has not benefited yet from advances in deep learning and convolutional neural networks (CNNs) and is still lagging behind other modalities that experienced significant performance gains owing to deep recognition technology. In this paper we address this problem and aim at building a CNN-based ear recognition model. We explore different strategies towards model training with limited amounts of training data and show that by selecting an appropriate model architecture, using aggressive data augmentation and selective learning on existing (pre-trained) models, we are able to learn an effective CNN-based model using a little more than 1300 training images. The result of our work is the first CNN-based approach to ear recognition that is also made publicly available to the research community. With our model we are able to improve on the rank one recognition rate of the previous state-of-the-art by more than 25% on a challenging dataset of ear images captured from the web (a.k.a. in the wild).},

keywords = {CNN, convolutional neural networks, ear, ear recognition, limited data, model learning},

pubstate = {published},

tppubtype = {inproceedings}

}

Emeršič, Žiga; Štruc, Vitomir; Peer, Peter

Ear recognition: More than a survey Journal Article

In: Neurocomputing, vol. 255, pp. 26–39, 2017.

Tags: AWE, biometrics, dataset, ear, ear recognition, performance evaluation, survey, toolbox

Ribič, Metod; Emeršič, Žiga; Štruc, Vitomir; Peer, Peter

Influence of alignment on ear recognition: case study on AWE Dataset Proceedings Article

In: Proceedings of the Electrotechnical and Computer Science Conference (ERK), pp. 131–134, Portorož, Slovenia, 2016.

Tags: AWE, AWE dataset, biometrics, ear alignment, ear recognition, image alignment, Ransac, SIFT