2016 |
Ribič, Metod; Emeršič, Žiga; Štruc, Vitomir; Peer, Peter Influence of alignment on ear recognition: case study on AWE Dataset Proceedings Article In: Proceedings of the Electrotechnical and Computer Science Conference (ERK), pp. 131-134, Portorož, Slovenia, 2016. Tags: AWE, AWE dataset, biometrics, ear alignment, ear recognition, image alignment, RANSAC, SIFT Abstract: The ear as a biometric modality presents a viable source for automatic human recognition. In recent years local description methods have been gaining popularity due to their invariance to illumination and occlusion. However, these methods require that images are well aligned and preprocessed as well as possible. This causes one of the greatest challenges of ear recognition: sensitivity to pose variations. Recently, we presented the Annotated Web Ears (AWE) dataset, which opens new challenges in ear recognition. In this paper we test the influence of alignment on recognition performance and prove that even with alignment the database is still very challenging, even though the recognition rate is improved due to alignment. We also prove that more sophisticated alignment methods are needed to address the AWE dataset efficiently. |
Dobrišek, Simon; Čefarin, David; Štruc, Vitomir; Mihelič, France Preizkus Googlovega govornega programskega vmesnika pri samodejnem razpoznavanju govorjene slovenščine [An assessment of the Google speech application programming interface for automatic recognition of spoken Slovenian] Proceedings Article In: Jezikovne tehnologije in digitalna humanistika, pp. 47-51, 2016. Abstract: Automatic speech recognizers are slowly maturing into technologies that enable humans to communicate more naturally and effectively with a variety of smart devices and information-communication systems. Large global companies such as Google, Microsoft, Apple, IBM and Baidu compete in developing the most reliable speech recognizers, supporting as many of the main world languages as possible. Due to the relatively small number of speakers, the support for the Slovenian spoken language is lagging behind, and among the major global companies only Google has recently supported our spoken language. The paper presents the results of our independent assessment of the Google speech-application programming interface for automatic Slovenian speech recognition. For the experiments, we used speech databases that are otherwise used for the development and assessment of Slovenian speech recognizers. |
Kravanja, Jaka; Žganec, Mario; Žganec-Gros, Jerneja; Dobrišek, Simon; Štruc, Vitomir Exploiting Spatio-Temporal Information for Light-Plane Labeling in Depth-Image Sensors Using Probabilistic Graphical Models Journal Article In: Informatica, vol. 27, no. 1, pp. 67–84, 2016. Tags: 3d imaging, correspondence, depth imaging, depth sensing, depth sensor, graphical models, sensor, structured light Abstract: This paper proposes a novel approach to light plane labeling in depth-image sensors relying on “uncoded” structured light. The proposed approach adopts probabilistic graphical models (PGMs) to solve the correspondence problem between the projected and the detected light patterns. The procedure for solving the correspondence problem is designed to take the spatial relations between the parts of the projected pattern and prior knowledge about the structure of the pattern into account, but it also exploits temporal information to achieve reliable light-plane labeling. The procedure is assessed on a database of light patterns detected with a specially developed imaging sensor that, unlike most existing solutions on the market, was shown to work reliably in outdoor environments as well as in the presence of other identical (active) sensors directed at the same scene. The results of our experiments show that the proposed approach is able to reliably solve the correspondence problem and assign light-plane labels to the detected pattern with a high accuracy, even when large spatial discontinuities are present in the observed scene. |
Grm, Klemen; Dobrišek, Simon; Štruc, Vitomir Deep pair-wise similarity learning for face recognition Proceedings Article In: 4th International Workshop on Biometrics and Forensics (IWBF), pp. 1–6, IEEE 2016. Tags: CNN, deep learning, face recognition, IJB-A, IWBF, performance evaluation, similarity learning Abstract: Recent advances in deep learning made it possible to build deep hierarchical models capable of delivering state-of-the-art performance in various vision tasks, such as object recognition, detection or tracking. For recognition tasks the most common approach when using deep models is to learn object representations (or features) directly from raw image-input and then feed the learned features to a suitable classifier. Deep models used in this pipeline are typically heavily parameterized and require enormous amounts of training data to deliver competitive recognition performance. Despite the use of data augmentation techniques, many application domains, predefined experimental protocols or specifics of the recognition problem limit the amount of available training data and make training an effective deep hierarchical model a difficult task. In this paper, we present a novel, deep pair-wise similarity learning (DPSL) strategy for deep models, developed specifically to overcome the problem of insufficient training data, and demonstrate its usage on the task of face recognition. Unlike existing (deep) learning strategies, DPSL operates on image-pairs and tries to learn pair-wise image similarities that can be used for recognition purposes directly instead of feature representations that need to be fed to appropriate classification techniques, as with traditional deep learning pipelines.
Since our DPSL strategy assumes an image pair as the input to the learning procedure, the amount of training data available to train deep models is quadratic in the number of available training images, which is of paramount importance for models with a large number of parameters. We demonstrate the efficacy of the proposed learning strategy by developing a deep model for pose-invariant face recognition, called Pose-Invariant Similarity Index (PISI), and presenting comparative experimental results on the FERET and IJB-A datasets. |
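The abstract above notes that learning from image pairs makes the amount of training data quadratic in the number of images. The following is a minimal sketch of that counting argument only, not the authors' DPSL pipeline; the function name `make_pairs` and the identity labels are hypothetical and chosen for illustration.

```python
from itertools import combinations


def make_pairs(labels):
    """Return (i, j, same_identity) for every unordered pair of image indices."""
    return [
        (i, j, labels[i] == labels[j])
        for i, j in combinations(range(len(labels)), 2)
    ]


# Hypothetical identity labels for five training images.
labels = ["anna", "anna", "boris", "boris", "cene"]
pairs = make_pairs(labels)

n = len(labels)
# n images yield n * (n - 1) / 2 unordered pairs, i.e. quadratic growth.
print(len(pairs), n * (n - 1) // 2)
# Pairs showing the same identity would serve as positive training examples.
print(sum(same for _, _, same in pairs))
```

With 1,000 images the same formula gives 499,500 pairs, which is the effect the abstract relies on when training heavily parameterized models from limited data.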
Golob, Žiga; Gros, Jerneja Žganec; Štruc, Vitomir; Mihelič, France; Dobrišek, Simon A Composition Algorithm of Compact Finite-State Super Transducers for Grapheme-to-Phoneme Conversion Proceedings Article In: International Conference on Text, Speech, and Dialogue, pp. 375–382, Springer 2016. Abstract: Minimal deterministic finite-state transducers (MDFSTs) are powerful models that can be used to represent pronunciation dictionaries in a compact form. Intuitively, we would assume that by increasing the size of the dictionary, the size of the MDFSTs would increase as well. However, as we show in the paper, this intuition does not hold for highly inflected languages. With such languages the size of the MDFSTs begins to decrease once the number of words in the represented dictionary reaches a certain threshold. Motivated by this observation, we have developed a new type of FST, called a finite-state super transducer (FSST), and show experimentally that the FSST is capable of representing pronunciation dictionaries with fewer states and transitions than MDFSTs. Furthermore, we show that (unlike MDFSTs) our FSSTs can also accept words that are not part of the represented dictionary. The phonetic transcriptions of these out-of-dictionary words may not always be correct, but the observed error rates are comparable to the error rates of the traditional methods for grapheme-to-phoneme conversion. |
2015 |
Grm, Klemen; Dobrišek, Simon; Štruc, Vitomir The pose-invariant similarity index for face recognition Proceedings Article In: Proceedings of the Electrotechnical and Computer Science Conference (ERK), Portorož, Slovenia, 2015. Tags: biometrics, CNN, deep learning, deep models, face verification, similarity learning |
Štruc, Vitomir; Križaj, Janez; Dobrišek, Simon Modest face recognition Proceedings Article In: Proceedings of the International Workshop on Biometrics and Forensics (IWBF), pp. 1–6, IEEE, 2015. Tags: biometrics, face verification, Gabor features, image descriptors, LBP, multi modality, PaSC, performance evaluation Abstract: The facial imagery usually available for forensic investigations is commonly of a poor quality due to the unconstrained settings in which it was acquired. The captured faces are typically non-frontal, partially occluded and of a low resolution, which makes the recognition task extremely difficult. In this paper we try to address this problem by presenting a novel framework for face recognition that combines diverse feature sets (Gabor features, local binary patterns, local phase quantization features and pixel intensities), probabilistic linear discriminant analysis (PLDA) and data fusion based on linear logistic regression. With the proposed framework a matching score for the given pair of probe and target images is produced by applying PLDA on each of the four feature sets independently - producing a (partial) matching score for each of the PLDA-based feature vectors - and then combining the partial matching results at the score level to generate a single matching score for recognition. We make two main contributions in the paper: i) we introduce a novel framework for face recognition that relies on probabilistic MOdels of Diverse fEature SeTs (MODEST) to facilitate the recognition process and ii) we benchmark it against the existing state-of-the-art. We demonstrate the feasibility of our MODEST framework on the FRGCv2 and PaSC databases and present comparative results with the state-of-the-art recognition techniques, which demonstrate the efficacy of our framework. |
Beveridge, Ross; Zhang, Hao; Draper, Bruce A; Flynn, Patrick J; Feng, Zhenhua; Huber, Patrik; Kittler, Josef; Huang, Zhiwu; Li, Shaoxin; Li, Yan; Štruc, Vitomir; Križaj, Janez; et al. Report on the FG 2015 video person recognition evaluation Proceedings Article In: 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (IEEE FG), pp. 1–8, IEEE 2015. Tags: biometrics, competition, face verification, FG, group evaluation, PaSC, performance evaluation Abstract: This report presents results from the Video Person Recognition Evaluation held in conjunction with the 11th IEEE International Conference on Automatic Face and Gesture Recognition. Two experiments required algorithms to recognize people in videos from the Point-and-Shoot Face Recognition Challenge Problem (PaSC). The first consisted of videos from a tripod-mounted high-quality video camera. The second contained videos acquired from 5 different handheld video cameras. There were 1401 videos of 265 subjects in each experiment. The subjects, the scenes, and the actions carried out by the people are the same in both experiments. Five groups from around the world participated in the evaluation. The video handheld experiment was included in the International Joint Conference on Biometrics (IJCB) 2014 Handheld Video Face and Person Recognition Competition. The top verification rate from this evaluation is double that of the top performer in the IJCB competition. Analysis shows that the factor most affecting algorithm performance is the combination of location and action: where the video was acquired and what the person was doing. |
Justin, Tadej; Štruc, Vitomir; Dobrišek, Simon; Vesnicer, Boštjan; Ipšić, Ivo; Mihelič, France Speaker de-identification using diphone recognition and speech synthesis Proceedings Article In: 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (IEEE FG): DeID 2015, pp. 1–7, IEEE 2015. Tags: DEID, FG, speech deidentification, speech recognition, speech synthesis, speech technologies Abstract: The paper addresses the problem of speaker (or voice) de-identification by presenting a novel approach for concealing the identity of speakers in their speech. The proposed technique first recognizes the input speech with a diphone recognition system and then transforms the obtained phonetic transcription into the speech of another speaker with a speech synthesis system. Due to the fact that a Diphone RecOgnition step and a sPeech SYnthesis step are used during the de-identification, we refer to the developed technique as DROPSY. With this approach the acoustical models of the recognition and synthesis modules are completely independent from each other, which ensures the highest level of input speaker de-identification. The proposed DROPSY-based de-identification approach is language dependent, text independent and capable of running in real-time due to the relatively simple computing methods used. When designing speaker de-identification technology two requirements are typically imposed on the de-identification techniques: i) it should not be possible to establish the identity of the speakers based on the de-identified speech, and ii) the processed speech should still sound natural and be intelligible. This paper, therefore, implements the proposed DROPSY-based approach with two different speech synthesis techniques (i.e., with the HMM-based and the diphone TDPSOLA-based technique).
The obtained de-identified speech is evaluated for intelligibility and assessed in speaker verification experiments with a state-of-the-art (i-vector/PLDA) speaker recognition system. The comparison of both speech synthesis modules integrated in the proposed method reveals that both can efficiently de-identify the input speakers while still producing intelligible speech. |
Dobrišek, Simon; Štruc, Vitomir; Križaj, Janez; Mihelič, France Face recognition in the wild with the Probabilistic Gabor-Fisher Classifier Proceedings Article In: 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (IEEE FG): BWild 2015, pp. 1–6, IEEE 2015. Tags: biometrics, BWild, FG, Gabor features, PaSC, plda, probabilistic Gabor Fisher classifier, probabilistic linear discriminant analysis Abstract: The paper addresses the problem of face recognition in the wild. It introduces a novel approach to unconstrained face recognition that exploits Gabor magnitude features and a simplified version of the probabilistic linear discriminant analysis (PLDA). The novel approach, named Probabilistic Gabor-Fisher Classifier (PGFC), first extracts a vector of Gabor magnitude features from the given input image using a battery of Gabor filters, then reduces the dimensionality of the extracted feature vector by projecting it into a low-dimensional subspace and finally produces a representation suitable for identity inference by applying PLDA to the projected feature vector. The proposed approach extends the popular Gabor-Fisher Classifier (GFC) to a probabilistic setting and thus improves on the generalization capabilities of the GFC method. The PGFC technique is assessed in face verification experiments on the Point and Shoot Face Recognition Challenge (PaSC) database, which features real-world videos of subjects performing everyday tasks. Experimental results on this challenging database show the feasibility of the proposed approach, which improves on the best results on this database reported in the literature by the time of writing. |
Justin, Tadej; Štruc, Vitomir; Žibert, Janez; Mihelič, France Development and Evaluation of the Emotional Slovenian Speech Database-EmoLUKS Proceedings Article In: Proceedings of the International Conference on Text, Speech, and Dialogue (TSD), pp. 351–359, Springer 2015. Tags: annotated data, dataset, dataset of emotional speech, EmoLUKS, emotional speech synthesis, speech synthesis, speech technologies, transcriptions Abstract: This paper describes a speech database built from 17 Slovenian radio dramas. The dramas were obtained from the national radio-and-television station (RTV Slovenia) and were placed at the university's disposal with an academic license for processing and annotating the audio material. The utterances of one male and one female speaker were transcribed, segmented and then annotated with emotional states of the speakers. The annotation of the emotional states was conducted in two stages with our own web-based application for crowdsourcing. The final (emotional) speech database consists of 1385 recordings of one male (975 recordings) and one female (410 recordings) speaker and contains labeled emotional speech with a total duration of around 1 hour and 15 minutes. The paper presents the two-stage annotation process used to label the data and demonstrates the usefulness of the employed annotation methodology. Baseline emotion recognition experiments are also presented. The reported results are presented with the un-weighted as well as weighted average recalls and precisions for 2-class and 7-class recognition experiments. |
Camgoz, Necati Cihan; Štruc, Vitomir; Gokberk, Berk; Akarun, Lale; Kindiroglu, Ahmet Alp Facial Landmark Localization in Depth Images using Supervised Ridge Descent Proceedings Article In: Proceedings of the IEEE International Conference on Computer Vision Workshops (ICCVW): ChaLearn, pp. 136–141, 2015. Tags: 3d landmarking, facial landmarking, landmark localization, landmarking, ridge regression, SDM Abstract: Supervised Descent Method (SDM) has proven successful in many computer vision applications such as face alignment, tracking and camera calibration. Recent studies that used SDM achieved state-of-the-art performance on facial landmark localization in depth images [4]. In this study, we propose to use ridge regression instead of least squares regression for learning the SDM, and to change feature sizes in each iteration, effectively turning the landmark search into a coarse-to-fine process. We apply the proposed method to facial landmark localization on the Bosphorus 3D Face Database, using frontal depth images with no occlusion. Experimental results confirm that both ridge regression and using adaptive feature sizes improve the localization accuracy considerably. |
Murovec, Boštjan Job-shop local-search move evaluation without direct consideration of the criterion’s value Journal Article In: European Journal of Operational Research, vol. 241, no. 2, pp. 320–329, 2015, ISSN: 0377-2217. Tags: Job-shop, Local search, Makespan, Move evaluation, Scheduling Abstract: This article focuses on the evaluation of moves for the local search of the job-shop problem with the makespan criterion. We reason that the omnipresent ranking of moves according to their resulting value of a criterion function makes the local search unnecessarily myopic. Consequently, we introduce an alternative evaluation that relies on a surrogate quantity of the move’s potential, which is related to, but not strongly coupled with, the bare criterion. The approach is confirmed by empirical tests, where the proposed evaluator delivers a new upper bound on the well-known benchmark test yn2. The line of the argumentation also shows that by sacrificing accuracy the established makespan estimators unintentionally improve on the move evaluation in comparison to the exact makespan calculation, in contrast to the belief that the reliance on estimation degrades the optimization results. |
Murovec, Boštjan; Kolbl, Sabina; Stres, Blaž Methane Yield Database: Online infrastructure and bioresource for methane yield data and related metadata Journal Article In: Bioresource Technology, vol. 189, pp. 217–223, 2015, ISSN: 0960-8524. Tags: Batch, Biogas, Industry, Infrastructure, Methane yield database Abstract: The aim of this study was to develop and validate a community supported online infrastructure and bioresource for methane yield data and accompanying metadata collected from published literature. In total, 1164 entries described by 15,749 data points were assembled. Analysis of data collection showed little congruence in reporting of methodological approaches. The largest identifiable source of variation in reported methane yields was represented by authorship (i.e. substrate batches within particular substrate class) within which experimental scales (volumes (0.02–5l), incubation temperature (34–40°C) and % VS of substrate played an important role (p<0.0 |
Henderson, Gemma; Cox, Faith; Ganesh, Siva; Jonker, Arjan; Young, Wayne; Janssen, Peter H Rumen microbial community composition varies with diet and host, but a core microbiome is found across a wide geographical range Journal Article In: Scientific Reports, vol. 5, art. 14567, pp. 1–13, 2015, ISSN: 2045-2322. |
2014 |
Peer, Peter; Emeršič, Žiga; Bule, Jernej; Žganec-Gros, Jerneja; Štruc, Vitomir Strategies for exploiting independent cloud implementations of biometric experts in multibiometric scenarios Journal Article In: Mathematical Problems in Engineering, vol. 2014, 2014. Tags: application, biometrics, cloud computing, face recognition, fingerprint recognition, fusion Abstract: Cloud computing represents one of the fastest growing areas of technology and offers a new computing model for various applications and services. This model is particularly interesting for the area of biometric recognition, where scalability, processing power, and storage requirements are becoming a bigger and bigger issue with each new generation of recognition technology. Next to the availability of computing resources, another important aspect of cloud computing with respect to biometrics is accessibility. Since biometric cloud services are easily accessible, it is possible to combine different existing implementations and design new multibiometric services that next to almost unlimited resources also offer superior recognition performance and, consequently, ensure improved security to their client applications. Unfortunately, the literature on the best strategies of how to combine existing implementations of cloud-based biometric experts into a multibiometric service is virtually nonexistent. In this paper, we try to close this gap and evaluate different strategies for combining existing biometric experts into a multibiometric cloud service. We analyze the (fusion) strategies from different perspectives such as performance gains, training complexity, or resource consumption and present results and findings important to software developers and other researchers working in the areas of biometrics and cloud computing. The analysis is conducted based on two biometric cloud services, which are also presented in the paper. |
Štruc, Vitomir; Žganec-Gros, Jerneja; Vesnicer, Boštjan; Pavešić, Nikola Beyond parametric score normalisation in biometric verification systems Journal Article In: IET Biometrics, vol. 3, no. 2, pp. 62–74, 2014. Tags: biometrics, face verification, hybrid score normalization, score normalization, t-norm, tz-norm, z-norm, zt-norm Abstract: Similarity scores represent the basis for identity inference in biometric verification systems. However, because of the so-called mismatched conditions across enrollment and probe samples and identity-dependent factors these scores typically exhibit statistical variations that affect the verification performance of biometric systems. To mitigate these variations, score normalisation techniques, such as the z-norm, the t-norm or the zt-norm, are commonly adopted. In this study, the authors study the problem of score normalisation in the scope of biometric verification and introduce a new class of non-parametric normalisation techniques, which make no assumptions regarding the shape of the distribution from which the scores are drawn (as the parametric techniques do). Instead, they estimate the shape of the score distribution and use the estimate to map the initial distribution to a common (predefined) distribution. Based on the new class of normalisation techniques they also develop a hybrid normalisation scheme that combines non-parametric and parametric techniques into hybrid two-step procedures. They evaluate the performance of the non-parametric and hybrid techniques in face-verification experiments on the FRGCv2 and SCFace databases and show that the non-parametric techniques outperform their parametric counterparts and that the hybrid procedure is not only feasible, but also retains some desirable characteristics from both the non-parametric and the parametric techniques. |
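To illustrate the contrast drawn in the abstract above, the sketch below puts a parametric z-norm next to a simple rank-based mapping. It is not the authors' technique: the empirical-CDF mapping and the uniform target distribution are assumptions made here for illustration, and the names `z_norm`, `ecdf_norm` and the cohort scores are hypothetical.

```python
import bisect
import statistics


def z_norm(score, cohort):
    """Parametric z-norm: centre and scale by the cohort mean and std. deviation."""
    mu = statistics.mean(cohort)
    sigma = statistics.pstdev(cohort)
    return (score - mu) / sigma


def ecdf_norm(score, cohort):
    """Rank-based mapping (assumed here): fraction of cohort scores <= score.

    This maps scores onto [0, 1] without assuming any distribution shape.
    """
    ranked = sorted(cohort)
    return bisect.bisect_right(ranked, score) / len(ranked)


# Hypothetical impostor (cohort) scores for one enrolled identity.
cohort = [0.10, 0.20, 0.30, 0.40, 0.50]

print(z_norm(0.45, cohort))     # distance from the cohort mean in std. deviations
print(ecdf_norm(0.35, cohort))  # share of cohort scores at or below 0.35
```

The z-norm implicitly treats the cohort scores as roughly Gaussian, whereas the rank-based variant uses only their ordering, which is the kind of assumption the non-parametric techniques in the paper are designed to avoid.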
Emeršič, Žiga; Bule, Jernej; Žganec-Gros, Jerneja; Štruc, Vitomir; Peer, Peter A case study on multi-modal biometrics in the cloud Journal Article In: Electrotechnical Review, vol. 81, no. 3, pp. 74, 2014. Tags: cloud, cloud computing, face recognition, face verification, fingerprint verification, fingerprints, fusion Abstract: Cloud computing is particularly interesting for the area of biometric recognition, where scalability, availability and accessibility are important aspects. In this paper we try to evaluate different strategies for combining existing uni-modal (cloud-based) biometric experts into a multi-biometric cloud-service. We analyze several fusion strategies from the perspective of performance gains, training complexity and resource consumption and discuss the results of our analysis. The experimental evaluation is conducted based on two biometric cloud-services developed in the scope of the Competence Centre CLASS, a face recognition service and a fingerprint recognition service, which are also briefly described in the paper. The presented results are important to researchers and developers working in the area of biometric services for the cloud looking for easy solutions for improving the quality of their services. |
Križaj, Janez; Štruc, Vitomir; Mihelič, France A Feasibility Study on the Use of Binary Keypoint Descriptors for 3D Face Recognition Proceedings Article In: Proceedings of the Mexican Conference on Pattern Recognition (MCPR), pp. 142–151, Springer 2014. Tags: 3d face recognition, binary descriptors, biometrics, BRISK, CASIA, face verification, FREAK, FRGC, MCPR, ORB, performance evaluation, SIFT, SURF Abstract: Despite the progress made in the area of local image descriptors in recent years, virtually no literature is available on the use of more recent descriptors for the problem of 3D face recognition, such as BRIEF, ORB, BRISK or FREAK, which are binary in nature and, therefore, tend to be faster to compute and match, while requiring significantly less memory for storage than, for example, SIFT or SURF. In this paper, we try to close this gap and present a feasibility study on the use of these descriptors for 3D face recognition. Descriptors are evaluated on three challenging 3D face image datasets, namely, the FRGC, UMB and CASIA. Our experiments show that the binary descriptors ensure slightly lower verification rates than SIFT, comparable to those of the SURF descriptor, while being an order of magnitude faster than SIFT. The results suggest that the use of binary descriptors represents a viable alternative to the established descriptors. |
Križaj, Janez; Štruc, Vitomir; Dobrišek, Simon; Marčetić, Darijan; Ribarić, Slobodan SIFT vs. FREAK: Assessing the usefulness of two keypoint descriptors for 3D face verification Proceedings Article In: 37th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), pp. 1336–1341, Mipro Opatija, Croatia, 2014. Tags: 3d face recognition, binary descriptors, face recognition, FREAK, performance comparison, performance evaluation, SIFT Abstract: Many techniques in the area of 3D face recognition rely on local descriptors to characterize the surface-shape information around points of interest (or keypoints) in the 3D images. Despite the fact that many advances have been made in the area of keypoint descriptors in recent years, the literature on 3D face recognition for the most part still focuses on established descriptors, such as SIFT and SURF, and largely neglects more recent descriptors, such as the FREAK descriptor. In this paper we try to bridge this gap and assess the usefulness of the FREAK descriptor for the task of 3D face recognition. Of particular interest to us is a direct comparison of the FREAK and SIFT descriptors within a simple verification framework. To evaluate our framework with the two descriptors, we conduct 3D face recognition experiments on the challenging FRGCv2 and UMBDB databases and show that the FREAK descriptor ensures a very competitive verification performance when compared to the SIFT descriptor, but at a fraction of the computational cost. Our results indicate that the FREAK descriptor is a viable alternative to the SIFT descriptor for the problem of 3D face verification and due to its binary nature is particularly useful for real-time recognition systems and verification techniques for low-resource devices such as mobile phones, tablets and the like. |
Marčetić, Darijan; Ribarić, Slobodan; Štruc, Vitomir; Pavešić, Nikola An experimental tattoo de-identification system for privacy protection in still images Proceedings Article In: 37th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), pp. 1288–1293, Mipro IEEE, 2014. Tags: computer vision, deidentification, MIPRO, privacy protection, tattoo deidentification Abstract: An experimental tattoo de-identification system for privacy protection in still images is described in the paper. The system consists of the following modules: skin detection, region of interest detection, feature extraction, tattoo database, matching, tattoo detection, skin swapping, and quality evaluation. Two methods for tattoo localization are presented. The first is a simple ad-hoc method based only on skin colour. The second is based on skin colour, texture and SIFT features. The appearance of each tattoo area is de-identified in such a way that its skin colour and skin texture are similar to the surrounding skin area. Experimental results for still images in which tattoo location, distance, size, illumination, and motion blur have large variability are presented. The system is subjectively evaluated based on the results of tattoo localization, the level of privacy protection and the naturalness of the de-identified still images. The level of privacy protection is estimated based on the quality of the removal of the tattoo appearance and the concealment of its location. |
Vesnicer, Boštjan; Žganec-Gros, Jerneja; Dobrišek, Simon; Štruc, Vitomir Incorporating Duration Information into I-Vector-Based Speaker-Recognition Systems Proceedings Article In: Proceedings of Odyssey: The Speaker and Language Recognition Workshop, pp. 241–248, 2014. Tags: acoustic features, biometrics, duration, duration modeling, i-vector, i-vector challenge, Odyssey, performance evaluation, speaker recognition, speech technologies Abstract: Most of the existing literature on i-vector-based speaker recognition focuses on recognition problems, where i-vectors are extracted from speech recordings of sufficient length. The majority of modeling/recognition techniques therefore simply ignore the fact that the i-vectors are most likely estimated unreliably when short recordings are used for their computation. Only recently have a number of solutions been proposed in the literature to address the problem of duration variability, all treating the i-vector as a random variable whose posterior distribution can be parameterized by the posterior mean and the posterior covariance. In this setting the covariance matrix serves as a measure of uncertainty that is related to the length of the available recording. In contrast to these solutions, we address the problem of duration variability through weighted statistics. We demonstrate in the paper how established feature transformation techniques regularly used in the area of speaker recognition, such as PCA or WCCN, can be modified to take duration into account. We evaluate our weighting scheme in the scope of the i-vector challenge organized as part of the Odyssey, Speaker and Language Recognition Workshop 2014 and achieve a minimal DCF of 0.280, which at the time of writing puts our approach in third place among all the participating institutions. |
Beveridge, Ross; Zhang, Hao; Flynn, Patrick; Lee, Yooyoung; Liong, Venice Erin; Lu, Jiwen; de Angeloni, Marcus Assis; de Pereira, Tiago Freitas; Li, Haoxiang; Hua, Gang; Štruc, Vitomir; Križaj, Janez; Phillips, Jonathon The IJCB 2014 PaSC video face and person recognition competition Proceedings Article V: Proceedings of the IEEE International Joint Conference on Biometrics (IJCB), str. 1–8, IEEE 2014. Povzetek | Povezava | BibTeX | Oznake: biometrics, competition, face recognition, group evaluation, IJCB, PaSC, performance evaluation @inproceedings{beveridge2014ijcb,The Point-and-Shoot Face Recognition Challenge (PaSC) is a performance evaluation challenge including 1401 videos of 265 people acquired with handheld cameras and depicting people engaged in activities with non-frontal head pose. This report summarizes the results from a competition using this challenge problem. In the Video-to-video Experiment a person in a query video is recognized by comparing the query video to a set of target videos. Both target and query videos are drawn from the same pool of 1401 videos. In the Still-to-video Experiment the person in a query video is to be recognized by comparing the query video to a larger target set consisting of still images. Algorithm performance is characterized by verification rate at a false accept rate of 0.01 and associated receiver operating characteristic (ROC) curves. Participants were provided eye coordinates for video frames. Results were submitted by 4 institutions: (i) Advanced Digital Science Center, Singapore; (ii) CPqD, Brasil; (iii) Stevens Institute of Technology, USA; and (iv) University of Ljubljana, Slovenia. Most competitors demonstrated video face recognition performance superior to the baseline provided with PaSC. The results represent the best performance to date on the handheld video portion of the PaSC. |
2013 |
Križaj, Janez; Dobrišek, Simon; Štruc, Vitomir; Pavešić, Nikola Robust 3D face recognition using adapted statistical models Proceedings Article V: Proceedings of the Electrotechnical and Computer Science Conference (ERK'13), 2013. Povzetek | Povezava | BibTeX | Oznake: 3d face recognition, biometrics, covariance descriptor, face verification, FRGC, GMM, modeling, performance evaluation, region-covariance matrix @inproceedings{krizajrobust,The paper presents a novel framework for 3D face recognition that exploits region covariance matrices (RCMs), Gaussian mixture models (GMMs) and support vector machine (SVM) classifiers. The proposed framework first combines several 3D face representations at the feature level using RCM descriptors and then derives low-dimensional feature vectors from the computed descriptors with the unscented transform. By doing so, it enables computations in Euclidean space and makes Gaussian mixture modeling feasible. Finally, a support vector classifier is used for identity inference. As demonstrated by our experimental results on the FRGCv2 and UMB databases, the proposed framework is highly robust and exhibits desirable characteristics such as an inherent mechanism for data fusion (through the RCMs), the ability to examine local as well as global structures of the face with the same descriptor, the ability to integrate domain-specific prior knowledge into the modeling procedure and consequently to handle missing or unreliable data. |
Štruc, Vitomir; Žganec-Gros, Jerneja; Pavešić, Nikola; Dobrišek, Simon Zlivanje informacij za zanesljivo in robustno razpoznavanje obrazov Članek v strokovni reviji V: Electrotechnical Review, vol. 80, no. 3, str. 1-12, 2013. Povzetek | Povezava | BibTeX | Oznake: biometrics, face recognition, fusion, performance evaluation @article{EV_Struc_2013,The existing face recognition technology has reached a performance level where it is possible to deploy it in various applications, provided they are capable of ensuring controlled conditions for the image acquisition procedure. However, the technology still struggles with its recognition performance when deployed in uncontrolled and unconstrained conditions. In this paper, we present a novel approach to face recognition designed specifically for these challenging conditions. The proposed approach exploits information fusion to achieve robustness. In the first step, the approach crops the facial region from each input image in three different ways. It then maps each of the three crops into one of four color representations and finally extracts several feature types from each of the twelve facial representations. The described procedure results in a total of thirty facial representations that are combined at the matching score level using a fusion approach based on linear logistic regression (LLR) to arrive at a robust decision regarding the identity of the subject depicted in the input face image. The presented approach was enlisted as a representative of the University of Ljubljana and Alpineon d.o.o. to the 2013 face-recognition competition that was held in conjunction with the IAPR International Conference on Biometrics and achieved the best overall recognition results among all competition participants. Here, we describe the basic characteristics of the approach, elaborate on the results of the competition and, most importantly, present some interesting findings made during our development work that are also of relevance to the research community working in the field of face recognition. |
Štruc, Vitomir; Gros, Jerneja Žganec; Dobrišek, Simon; Pavešić, Nikola Exploiting representation plurality for robust and efficient face recognition Proceedings Article V: Proceedings of the 22nd International Electrotechnical and Computer Science Conference (ERK'13), str. 121–124, Portorož, Slovenia, 2013. Povzetek | Povezava | BibTeX | Oznake: competition, erk, face recognition, face verification, group evaluation, ICB, mobile biometrics, MOBIO, performance evaluation @inproceedings{ERK2013_Struc,The paper introduces a novel approach to face recognition that exploits plurality of representation to achieve robust face recognition. The proposed approach was submitted as a representative of the University of Ljubljana and Alpineon d.o.o. to the 2013 face recognition competition that was held in conjunction with the IAPR International Conference on Biometrics and achieved the best overall recognition results among all competition participants. Here, we describe the basic characteristics of the submitted approach, elaborate on the results of the competition and, most importantly, present some general findings made during our development work that are of relevance to the broader (face recognition) research community. |
Križaj, Janez; Štruc, Vitomir; Dobrišek, Simon Combining 3D face representations using region covariance descriptors and statistical models Proceedings Article V: Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition and Workshops (IEEE FG), Workshop on 3D Face Biometrics, IEEE, Shanghai, China, 2013. Povzetek | Povezava | BibTeX | Oznake: 3d face recognition, biometrics, covariance descriptors, face recognition, face verification, FG, gaussian mixture models, GMM, unscented transform @inproceedings{FG2013,The paper introduces a novel framework for 3D face recognition that capitalizes on region covariance descriptors and Gaussian mixture models. The framework presents an elegant and coherent way of combining multiple facial representations, while simultaneously examining all computed representations at various levels of locality. The framework first computes a number of region covariance matrices/descriptors from different sized regions of several image representations and then adopts the unscented transform to derive low-dimensional feature vectors from the computed descriptors. By doing so, it enables computations in the Euclidean space, and makes Gaussian mixture modeling feasible. In the last step a support vector machine classification scheme is used to make a decision regarding the identity of the modeled input 3D face image. The proposed framework exhibits several desirable characteristics, such as an inherent mechanism for data fusion/integration (through the region covariance matrices), the ability to examine the facial images at different levels of locality, and the ability to integrate domain-specific prior knowledge into the modeling procedure. We assess the feasibility of the proposed framework on the Face Recognition Grand Challenge version 2 (FRGCv2) database with highly encouraging results. |
Dobrišek, Simon; Gajšek, Rok; Mihelič, France; Pavešić, Nikola; Štruc, Vitomir Towards efficient multi-modal emotion recognition Članek v strokovni reviji V: International Journal of Advanced Robotic Systems, vol. 10, no. 53, 2013. Povzetek | Povezava | BibTeX | Oznake: avid database, emotion recognition, facial expression recognition, multi modality, speech technologies @article{dobrivsek2013towards,The paper presents a multi-modal emotion recognition system exploiting audio and video (i.e., facial expression) information. The system first processes both sources of information individually to produce corresponding matching scores and then combines the computed matching scores to obtain a classification decision. For the video part of the system, a novel approach to emotion recognition, relying on image-set matching, is developed. The proposed approach avoids the need for detecting and tracking specific facial landmarks throughout the given video sequence, which represents a common source of error in video-based emotion recognition systems, and, therefore, adds robustness to the video processing chain. The audio part of the system, on the other hand, relies on utterance-specific Gaussian Mixture Models (GMMs) adapted from a Universal Background Model (UBM) via the maximum a posteriori probability (MAP) estimation. It improves upon the standard UBM-MAP procedure by exploiting gender information when building the utterance-specific GMMs, thus ensuring enhanced emotion recognition performance. Both the uni-modal parts as well as the combined system are assessed on the challenging multi-modal eNTERFACE'05 corpus with highly encouraging results. The developed system represents a feasible solution to emotion recognition that can easily be integrated into various systems, such as humanoid robots, smart surveillance systems and alike. |
Peer, Peter; Bule, Jernej; Gros, Jerneja Žganec; Štruc, Vitomir Building cloud-based biometric services Članek v strokovni reviji V: Informatica, vol. 37, no. 2, str. 115, 2013. Povzetek | Povezava | BibTeX | Oznake: biometrics, cloud computing, development, SaaS, face recognition, fingerprint recognition @article{peer2013building,Over the next few years the amount of biometric data at the disposal of various agencies and authentication service providers is expected to grow significantly. Such quantities of data require not only enormous amounts of storage but unprecedented processing power as well. To be able to face these future challenges, more and more people are looking towards cloud computing, which can address these challenges quite effectively with its seemingly unlimited storage capacity, rapid data distribution and parallel processing capabilities. Since the available literature on how to implement cloud-based biometric services is extremely scarce, this paper capitalizes on the most important challenges encountered during the development work on biometric services, presents the most important standards and recommendations pertaining to biometric services in the cloud and ultimately, elaborates on the potential value of cloud-based biometric solutions by presenting a few existing (commercial) examples. In the final part of the paper, a case study on fingerprint recognition in the cloud and its integration into the e-learning environment Moodle is presented. |
Kenk, Vildana Sulič; Križaj, Janez; Štruc, Vitomir; Dobrišek, Simon Smart surveillance technologies in border control Članek v strokovni reviji V: European Journal of Law and Technology, vol. 4, no. 2, 2013. Povzetek | Povezava | BibTeX | Oznake: border control, proportionality, smart surveillance, surveillance, surveillance technology @article{kenk2013smart,The paper addresses the technical and legal aspects of the existing and forthcoming intelligent ('smart') surveillance technologies that are (or are considered to be) employed in the border control application area. Such technologies provide a computerized decision-making support to border control authorities, and are intended to increase the reliability and efficiency of border control measures. However, the question that arises is how effective these technologies are, as well as at what price, economically, socially, and in terms of citizens' rights. The paper provides a brief overview of smart surveillance technologies in border control applications, especially those used for controlling cross-border traffic, discusses possible proportionality issues and privacy risks raised by the increasingly widespread use of such technologies, as well as good/best practises developed in this area. In a broader context, the paper presents the result of the research carried out as part of the SMART (Scalable Measures for Automated Recognition Technologies) project. |
Štruc, Vitomir; Pavešić, Nikola; Žganec-Gros, Jerneja; Vesnicer, Boštjan Patch-wise low-dimensional probabilistic linear discriminant analysis for Face Recognition Proceedings Article V: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), str. 2352–2356, IEEE 2013. Povzetek | Povezava | BibTeX | Oznake: biometrics, face verification, FRGC, ICASSP, patch-wise approach, plda, probabilistic linear discriminant analysis @inproceedings{vstruc2013patch,The paper introduces a novel approach to face recognition based on the recently proposed low-dimensional probabilistic linear discriminant analysis (LD-PLDA). The proposed approach is specifically designed for complex recognition tasks, where highly nonlinear face variations are typically encountered. Such data variations are commonly induced by changes in the external illumination conditions, viewpoint changes or expression variations and represent quite a challenge even for state-of-the-art techniques, such as LD-PLDA. To overcome this problem, we propose here a patch-wise form of the LD-PLDA technique (i.e., PLD-PLDA), which relies on local image patches rather than the entire image to make inferences about the identity of the input images. The basic idea here is to decompose the complex face recognition problem into simpler problems, for which the linear nature of the LD-PLDA technique may be better suited. By doing so, several similarity scores are derived from one facial image, which are combined at the final stage using a simple sum-rule fusion scheme to arrive at a single score that can be employed for identity inference. We evaluate the proposed technique on experiment 4 of the Face Recognition Grand Challenge (FRGCv2) database with highly promising results. |
Günther, Manuel; Costa-Pazo, Artur; Ding, Changxing; Boutellaa, Elhocine; Chiachia, Giovani; Zhang, Honglei; de Angeloni, Marcus Assis; Štruc, Vitomir; Khoury, Elie; Vazquez-Fernandez, Esteban; others, The 2013 face recognition evaluation in mobile environment Proceedings Article V: Proceedings of the IAPR International Conference on Biometrics (ICB), str. 1–7, IAPR 2013. Povzetek | Povezava | BibTeX | Oznake: biometrics, competition, face recognition, face verification, group evaluation, mobile biometrics, MOBIO, performance evaluation @inproceedings{gunther20132013,Automatic face recognition in unconstrained environments is a challenging task. To test current trends in face recognition algorithms, we organized an evaluation on face recognition in mobile environment. This paper presents the results of 8 different participants using two verification metrics. Most submitted algorithms rely on one or more of three types of features: local binary patterns, Gabor wavelet responses including Gabor phases, and color information. The best results are obtained from UNILJ-ALP, which fused several image representations and feature types, and UCHU, which learns optimal features with a convolutional neural network. Additionally, we assess the usability of the algorithms in mobile devices with limited resources. |
Stres, Blaz; Sul, Woo Jun; Murovec, Bostjan; Tiedje, James M Recently Deglaciated High-Altitude Soils of the Himalaya: Diverse Environments, Heterogenous Bacterial Communities and Long-Range Dust Inputs from the Upper Troposphere Članek v strokovni reviji V: PLOS ONE, vol. 8, no. 9, str. 1-10, 2013. Povzetek | Povezava | BibTeX | Oznake: @article{10.1371/journal.pone.0076440,Background: The Himalaya with its altitude and geographical position forms a barrier to atmospheric transport, which produces much aqueous-particle monsoon precipitation and makes it the largest continuous ice-covered area outside polar regions. There is a paucity of data on high-altitude microbial communities, their native environments and responses to environmental-spatial variables relative to seasonal and deglaciation events. Methodology/Principal Findings: Soils were sampled along altitude transects from 5000 m to 6000 m to determine environmental, spatial and seasonal factors structuring bacterial communities characterized by 16S rRNA gene deep sequencing. Dust traps and fresh-snow samples were used to assess dust abundance and viability, community structure and abundance of dust associated microbial communities. Significantly different habitats among the altitude-transect samples corresponded to both phylogenetically distant and closely-related communities at distances as short as 50 m showing high community spatial divergence. High within-group variability that was related to an order of magnitude higher dust deposition obscured seasonal and temporal rearrangements in microbial communities. Although dust particle and associated cell deposition rates were highly correlated, seasonal dust communities of bacteria were distinct and differed significantly from recipient soil communities. Analysis of closest relatives to dust OTUs, HYSPLIT back-calculation of airmass trajectories and small dust particle size (4–12 µm) suggested that the deposited dust and microbes came from distant continental, lacustrine and marine sources, e.g. Sahara, India, Caspian Sea and Tibetan plateau. Cyanobacteria represented less than 0.5% of microbial communities suggesting that the microbial communities benefitted from (co)deposited carbon which was reflected in the psychrotolerant nature of dust-particle associated bacteria. Conclusions/Significance: The spatial, environmental and temporal complexity of the high-altitude soils of the Himalaya generates ongoing disturbance and colonization events that subject heterogeneous microniches to stochastic colonization by far away dust associated microbes and result in the observed spatially divergent bacterial communities. |
Murovec, Boštjan; Perš, Janez; Mandeljc, Rok; Kenk, Vildana Sulić; Kovačič, Stanislav Towards commoditized smart-camera design Članek v strokovni reviji V: Journal of Systems Architecture, vol. 59, no. 10, Part A, str. 847 - 858, 2013, ISSN: 1383-7621, (Smart Camera Architecture). Povzetek | Povezava | BibTeX | Oznake: Commoditized smart camera, Design principles, General-purpose smart camera, Reference design, Visual-sensor networks @article{MUROVEC2013847,We propose a set of design principles for a cost-effective embedded smart camera. Our aim is to alleviate the shortcomings of the existing designs, such as excessive reliance on battery power and wireless networking, over-emphasized focus on specific use cases, and use of specialized technologies. In our opinion, these shortcomings prevent widespread commercialization and adoption of embedded smart cameras, especially in the context of visual-sensor networks. The proposed principles lead to a distinctively different design, which relies on commoditized, standardized and widely-available components, tools and knowledge. As an example of using these principles in practice, we present a smart camera, which is inexpensive, easy to build and support, capable of high-speed communication and enables rapid transfer of computer-vision algorithms to the embedded world. |
2012 |
Križaj, Janez; Štruc, Vitomir; Dobrišek, Simon Robust 3D Face Recognition Članek v strokovni reviji V: Electrotechnical Review, vol. 79, no. 1-2, str. 1-6, 2012. Povzetek | Povezava | BibTeX | Oznake: 3d face recognition, biometrics, gaussian mixture models, GMM, modeling @article{Križaj-EV-2012,Face recognition in uncontrolled environments is hindered by variations in illumination, pose, expression and occlusions of faces. Many practical face-recognition systems are affected by these variations. One way to increase the robustness to illumination and pose variations is to use 3D facial images. In this paper 3D face-recognition systems are presented. Their structure and operation are described. The robustness of such systems to variations in uncontrolled environments is emphasized. We present some preliminary results of a system developed in our laboratory. |
Križaj, Janez; Štruc, Vitomir; Dobrišek, Simon Towards robust 3D face verification using Gaussian mixture models Članek v strokovni reviji V: International Journal of Advanced Robotic Systems, vol. 9, 2012. Povzetek | Povezava | BibTeX | Oznake: @article{krizaj2012towards,This paper focuses on the use of Gaussian Mixture models (GMM) for 3D face verification. A special interest is taken in practical aspects of 3D face verification systems, where all steps of the verification procedure need to be automated and no meta-data, such as pre-annotated eye/nose/mouth positions, is available to the system. In such settings the performance of the verification system correlates heavily with the performance of the employed alignment (i.e., geometric normalization) procedure. We show that popular holistic as well as local recognition techniques, such as principal component analysis (PCA), or Scale-invariant feature transform (SIFT)-based methods considerably deteriorate in their performance when an “imperfect” geometric normalization procedure is used to align the 3D face scans and that in these situations GMMs should be preferred. Moreover, several possibilities to improve the performance and robustness of the classical GMM framework are presented and evaluated: i) explicit inclusion of spatial information, during the GMM construction procedure, ii) implicit inclusion of spatial information during the GMM construction procedure and iii) on-line evaluation and possible rejection of local feature vectors based on their likelihood. We successfully demonstrate the feasibility of the proposed modifications on the Face Recognition Grand Challenge data set. |
Vesnicer, Bostjan; Gros, Jerneja Žganec; Pavešić, Nikola; Štruc, Vitomir Face recognition using simplified probabilistic linear discriminant analysis Članek v strokovni reviji V: International Journal of Advanced Robotic Systems, vol. 9, 2012. Povzetek | Povezava | BibTeX | Oznake: biometrics, face recognition, plda, simplified PLDA @article{vesnicer2012face,Face recognition in uncontrolled environments remains an open problem that has not been satisfactorily solved by existing recognition techniques. In this paper, we tackle this problem using a variant of the recently proposed Probabilistic Linear Discriminant Analysis (PLDA). We show that simplified versions of the PLDA model, which are regularly used in the field of speaker recognition, rely on certain assumptions that not only result in a simpler PLDA model, but also reduce the computational load of the technique and - as indicated by our experimental assessments - improve recognition performance. Moreover, we show that, contrary to the general belief that PLDA-based methods produce well calibrated verification scores, score normalization techniques can still deliver significant performance gains, but only if non-parametric score normalization techniques are employed. Last but not least, we demonstrate the competitiveness of the simplified PLDA model for face recognition by comparing our results with the state-of-the-art results from the literature obtained on the second version of the large-scale Face Recognition Grand Challenge (FRGC) database. |
2011 |
Štruc, Vitomir; Pavešić, Nikola Photometric normalization techniques for illumination invariance Book Section V: Zhang, Yu-Jin (Ur.): Advances in Face Image Analysis: Techniques and Technologies, str. 279-300, IGI-Global, 2011. Povzetek | Povezava | BibTeX | Oznake: biometrics, face recognition, illumination invariance, illumination normalization, photometric normalization @incollection{IGI2011,Face recognition technology has come a long way since its beginnings in the previous century. Due to its countless application possibilities, it has attracted the interest of research groups from universities and companies around the world. Thanks to this enormous research effort, the recognition rates achievable with the state-of-the-art face recognition technology are steadily growing, even though some issues still pose major challenges to the technology. Amongst these challenges, coping with illumination-induced appearance variations is one of the biggest and still not satisfactorily solved. A number of techniques have been proposed in the literature to cope with the impact of illumination ranging from simple image enhancement techniques, such as histogram equalization, to more elaborate methods, such as anisotropic smoothing or the logarithmic total variation model. This chapter presents an overview of the most popular and efficient normalization techniques that try to solve the illumination variation problem at the preprocessing level. It assesses the techniques on the YaleB and XM2VTS databases and explores their strengths and weaknesses from the theoretical and implementation point of view. |
Štruc, Vitomir; Žganec-Gros, Jerneja; Pavešić, Nikola Principal directions of synthetic exact filters for robust real-time eye localization Proceedings Article V: Proceedings of the COST workshop on Biometrics and Identity Management (BioID), str. 180-192, Springer-Verlag, Berlin, Heidelberg, 2011. Povzetek | Povezava | BibTeX | Oznake: ASEF, correlation filters, eye localization, face image processing, landmark localization, landmarking, PSEF @inproceedings{BioID_Struc_2011,The alignment of the facial region with a predefined canonical form is one of the most crucial steps in a face recognition system. Most of the existing alignment techniques rely on the position of the eyes and, hence, require an efficient and reliable eye localization procedure. In this paper we propose a novel technique for this purpose, which exploits a new class of correlation filters called Principal directions of Synthetic Exact Filters (PSEFs). The proposed filters represent a generalization of the recently proposed Average of Synthetic Exact Filters (ASEFs) and exhibit desirable properties, such as relatively short training times, computational simplicity, high localization rates and real time capabilities. We present the theory of PSEF filter construction, elaborate on their characteristics and finally develop an efficient procedure for eye localization using several PSEF filters. We demonstrate the effectiveness of the proposed class of correlation filters for the task of eye localization on facial images from the FERET database and show that for the tested task they outperform the established Haar cascade object detector as well as the ASEF correlation filters. |
2010 |
Štruc, Vitomir; Pavešić, Nikola Face recognition from color images using sparse projection analysis Proceedings Article V: Proceedings of the 7th International Conference on Image Analysis and Recognition (ICIAR 2010), str. 445-453, Povoa de Varzim, Portugal, 2010. Povzetek | Povezava | BibTeX | Oznake: biometrics, face verification, ICIAR, performance evaluation, sparse projection analysis @inproceedings{ICIAR2010_Sparse,The paper presents a novel feature extraction technique for face recognition which uses sparse projection axes to compute a low-dimensional representation of face images. The proposed technique derives the sparse axes by first recasting the problem of face recognition as a regression problem and then solving the new (under-determined) regression problem by computing the solution with minimum L1 norm. The developed technique, named Sparse Projection Analysis (SPA), is applied to color as well as grey-scale images from the XM2VTS database and compared to popular subspace projection techniques (with sparse and dense projection axes) from the literature. The results of the experimental assessment show that the proposed technique ensures promising results on un-occluded as well as occluded images from the XM2VTS database. |
Križaj, Janez; Štruc, Vitomir; Pavešić, Nikola Adaptation of SIFT Features for Robust Face Recognition Proceedings Article V: Proceedings of the 7th International Conference on Image Analysis and Recognition (ICIAR 2010), str. 394-404, Povoa de Varzim, Portugal, 2010. Povzetek | Povezava | BibTeX | Oznake: biometrics, dense SIFT, face recognition, performance evaluation, SIFT, SIFT features @inproceedings{ICIAR2010_Sift,The Scale Invariant Feature Transform (SIFT) is an algorithm used to detect and describe scale-, translation- and rotation-invariant local features in images. The original SIFT algorithm has been successfully applied in general object detection and recognition tasks, panorama stitching and others. One of its more recent uses also includes face recognition, where it was shown to deliver encouraging results. SIFT-based face recognition techniques found in the literature rely heavily on the so-called keypoint detector, which locates interest points in the given image that are ultimately used to compute the SIFT descriptors. While these descriptors are known to be, among other things, (partially) invariant to illumination changes, the keypoint detector is not. Since varying illumination is one of the main issues affecting the performance of face recognition systems, the keypoint detector represents the main source of errors in face recognition systems relying on SIFT features. To overcome this shortcoming of SIFT-based methods, we present in this paper a novel face recognition technique that computes the SIFT descriptors at predefined (fixed) locations learned during the training stage. By doing so, it eliminates the need for keypoint detection on the test images and renders our approach more robust to illumination changes than related approaches from the literature. Experiments, performed on the Extended Yale B face database, show that the proposed technique compares favorably with several popular techniques from the literature in terms of performance. |
Štruc, Vitomir; Vesnicer, Boštjan; Mihelič, France; Pavešić, Nikola Removing Illumination Artifacts from Face Images using the Nuisance Attribute Projection Proceedings Article V: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'10), str. 846-849, IEEE, Dallas, Texas, USA, 2010. Povzetek | Povezava | BibTeX | Oznake: biometrics, face recognition, face verification, illumination changes, illumination invariance, nuisance attribute projection, robust recognition @inproceedings{ICASSP2010,Illumination induced appearance changes represent one of the open challenges in automated face recognition systems still significantly influencing their performance. Several techniques have been presented in the literature to cope with this problem; however, a universal solution remains to be found. In this paper we present a novel normalization scheme based on the nuisance attribute projection (NAP), which tries to remove the effects of illumination by projecting away multiple dimensions of a low dimensional illumination subspace. The technique is assessed in face recognition experiments performed on the extended YaleB and XM2VTS databases. Comparative results with state-of-the-art techniques show the competitiveness of the proposed technique. |
Štruc, Vitomir; Pavešić, Nikola In: Oravec, Milos (Ed.): Face Recognition, pp. 215-238, In-Tech, Vienna, 2010. Tags: biometrics, face recognition, feature extraction, Gabor features, Gabor filters, illumination changes, phase features |
Štruc, Vitomir; Pavešić, Nikola The Complete Gabor-Fisher Classifier for Robust Face Recognition Journal Article In: EURASIP Advances in Signal Processing, vol. 2010, pp. 26, 2010. Tags: biometrics, combined model, face recognition, feature extraction, Gabor features, phase features Abstract: This paper develops a novel face recognition technique called the Complete Gabor Fisher Classifier (CGFC). Unlike existing techniques that use Gabor filters for deriving the Gabor face representation, the proposed approach does not rely solely on Gabor magnitude information but also makes effective use of features computed from Gabor phase information. It represents one of the few successful attempts found in the literature at combining Gabor magnitude and phase information for robust face recognition. The novelty of the proposed CGFC technique comes from (1) the introduction of a Gabor phase-based face representation and (2) the combination of the recognition technique using the proposed representation with classical Gabor magnitude-based methods into a unified framework. The proposed face recognition framework is assessed in a series of face verification and identification experiments performed on the XM2VTS, Extended YaleB, FERET, and AR databases. The results of the assessment suggest that the proposed technique clearly outperforms state-of-the-art face recognition techniques from the literature and that its performance is almost unaffected by the presence of partial occlusions of the facial area, changes in facial expression, or severe illumination changes. |
Poh, Norman; Chan, Chi Ho; Kittler, Josef; Marcel, Sebastien; McCool, Christopher; Rua, Enrique Argones; Castro, Jose Luis Alba; Villegas, Mauricio; Paredes, Roberto; Štruc, Vitomir; et al. An evaluation of video-to-video face verification Journal Article In: IEEE Transactions on Information Forensics and Security, vol. 5, no. 4, pp. 781–801, 2010. Tags: biometrics, competition, face recognition, face verification, group evaluation, video Abstract: Person recognition using facial features, e.g., mug-shot images, has long been used in identity documents. However, due to the widespread use of web-cams and mobile devices embedded with a camera, it is now possible to realize facial video recognition, rather than resorting to just still images. In fact, facial video recognition offers many advantages over still image recognition; these include the potential of boosting the system accuracy and deterring spoof attacks. This paper presents an evaluation of person identity verification using facial video data, organized in conjunction with the International Conference on Biometrics (ICB 2009). It involves 18 systems submitted by seven academic institutes. These systems provide for a diverse set of assumptions, including feature representation and preprocessing variations, allowing us to assess the effect of adverse conditions, usage of quality information, query selection, and template construction for video-to-video face authentication. |
Štruc, Vitomir; Dobrišek, Simon; Pavešić, Nikola Confidence Weighted Subspace Projection Techniques for Robust Face Recognition in the Presence of Partial Occlusions Proceedings Article In: Proceedings of the International Conference on Pattern Recognition (ICPR'10), pp. 1334-1338, Istanbul, Turkey, 2010. Tags: biometrics, face recognition, face verification, ICPR, performance evaluation, subspace projection |
Gajšek, Rok; Štruc, Vitomir; Mihelič, France Multi-modal Emotion Recognition using Canonical Correlations and Acoustic Features Proceedings Article In: Proceedings of the International Conference on Pattern Recognition (ICPR), pp. 4133-4136, IAPR, Istanbul, Turkey, 2010. Tags: acoustic features, canonical correlations, emotion recognition, facial expression recognition, multi modality, speech processing, speech technologies Abstract: Information about the psycho-physical state of the subject is becoming a valuable addition to modern audio or video recognition systems. As well as enabling a better user experience, it can also help improve the recognition accuracy of the base system. In this article, we present our approach to multi-modal (audio-video) emotion recognition. For the audio sub-system, a feature set comprising prosodic, spectral and cepstral features is selected, and a support vector classifier is used to produce the scores for each emotional category. For the video sub-system, a novel approach is presented that does not rely on tracking specific facial landmarks and thus eliminates the problems that arise when the tracking algorithm fails to detect the correct area. The system is evaluated on the eNTERFACE database, and the recognition accuracy of our audio-video fusion is compared to the results published in the literature. |
Gajšek, Rok; Štruc, Vitomir; Mihelič, France Multi-modal Emotion Recognition based on the Decoupling of Emotion and Speaker Information Proceedings Article In: Proceedings of Text, Speech and Dialogue (TSD), pp. 275-282, Springer-Verlag, Berlin, Heidelberg, 2010. Tags: emotion recognition, facial expression recognition, multi modality, speech processing, speech technologies, spontaneous emotions, video processing Abstract: The standard features used in emotion recognition carry, besides the emotion-related information, also cues about the speaker. This is expected, since the variations introduced by emotionally colored speech are similar in nature to the variations in the speech signal caused by different speakers. Therefore, we present a gradient-descent-derived transformation for the decoupling of emotion and speaker information contained in the acoustic features. The Interspeech ’09 Emotion Challenge feature set is used as the baseline for the audio part. A similar procedure is employed on the video signal, where the nuisance attribute projection (NAP) is used to derive the transformation matrix, which contains information about the emotional state of the speaker. Ultimately, different NAP transformation matrices are compared using canonical correlations. The audio and video sub-systems are combined at the matching score level using different fusion techniques. The presented system is assessed on the publicly available eNTERFACE’05 database, where significant improvements in the recognition performance are observed when compared to the state-of-the-art baseline. |
Štruc, Vitomir; Žganec-Gros, Jerneja; Pavešić, Nikola Eye Localization using correlation filters Proceedings Article In: Proceedings of the International Conference DOGS, pp. 188-191, Novi Sad, Serbia, 2010. Tags: ASEF, correlation filters, eye localization, face image processing, landmark localization, PSEF |
Murovec, Boštjan; Tiedje, James M; Stres, Blaž DNA encoding for an efficient 'Omics processing Journal Article In: Computer Methods and Programs in Biomedicine, vol. 100, no. 2, pp. 175-190, 2010, ISSN: 0169-2607. Tags: Binary DNA encoding, Binary nucleotide encoding, DNA-64, NumFASTA Abstract: The exponential growth of available DNA sequences and the increased interoperability of biological information are triggering intergovernmental efforts aimed at increasing the access, dissemination, and analysis of sequence data. Achieving the efficient storage and processing of DNA material is an important goal that parallels well with the foreseen coding standardization on the horizon. This paper proposes novel coding approaches, for both the dissemination and processing of sequences, where the speed of the DNA processing is shown to be boosted by exploring more than the normally utilized eight bits for encoding a single nucleotide. Further gains are achieved by encoding the nucleotides together with their trailing alignment information as a single 64-bit data structure. The paper also proposes a slight modification to the established FASTA scheme in order to improve on its representation of alignment information. The significance of the propositions is confirmed by the encouraging results from empirical tests. |
Publications
2016 |
Influence of alignment on ear recognition : case study on AWE Dataset Proceedings Article In: Proceedings of the Electrotechnical and Computer Science Conference (ERK), pp. 131-134, Portorož, Slovenia, 2016. |
Preizkus Googlovega govornega programskega vmesnika pri samodejnem razpoznavanju govorjene slovenščine Proceedings Article In: Jezikovne tehnologije in digitalna humanistika, pp. 47-51, 2016. |
Exploiting Spatio-Temporal Information for Light-Plane Labeling in Depth-Image Sensors Using Probabilistic Graphical Models Journal Article In: Informatica, vol. 27, no. 1, pp. 67–84, 2016. |
Deep pair-wise similarity learning for face recognition Proceedings Article In: 4th International Workshop on Biometrics and Forensics (IWBF), pp. 1–6, IEEE, 2016. |
A Composition Algorithm of Compact Finite-State Super Transducers for Grapheme-to-Phoneme Conversion Proceedings Article In: International Conference on Text, Speech, and Dialogue, pp. 375–382, Springer, 2016. |
2015 |
The pose-invariant similarity index for face recognition Proceedings Article In: Proceedings of the Electrotechnical and Computer Science Conference (ERK), Portorož, Slovenia, 2015. |
Modest face recognition Proceedings Article In: Proceedings of the International Workshop on Biometrics and Forensics (IWBF), pp. 1–6, IEEE, 2015. |
Report on the FG 2015 video person recognition evaluation Proceedings Article In: 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (IEEE FG), pp. 1–8, IEEE, 2015. |
Speaker de-identification using diphone recognition and speech synthesis Proceedings Article In: 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (IEEE FG): DeID 2015, pp. 1–7, IEEE, 2015. |
Face recognition in the wild with the Probabilistic Gabor-Fisher Classifier Proceedings Article In: 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (IEEE FG): BWild 2015, pp. 1–6, IEEE, 2015. |
Development and Evaluation of the Emotional Slovenian Speech Database-EmoLUKS Proceedings Article In: Proceedings of the International Conference on Text, Speech, and Dialogue (TSD), pp. 351–359, Springer, 2015. |
Facial Landmark Localization in Depth Images using Supervised Ridge Descent Proceedings Article In: Proceedings of the IEEE International Conference on Computer Vision Workshops (ICCVW): ChaLearn, pp. 136–141, 2015. |
Job-shop local-search move evaluation without direct consideration of the criterion’s value Journal Article In: European Journal of Operational Research, vol. 241, no. 2, pp. 320-329, 2015, ISSN: 0377-2217. |
Methane Yield Database: Online infrastructure and bioresource for methane yield data and related metadata Journal Article In: Bioresource Technology, vol. 189, pp. 217-223, 2015, ISSN: 0960-8524. |
Rumen microbial community composition varies with diet and host, but a core microbiome is found across a wide geographical range Journal Article In: Scientific Reports, vol. 5, art. 14567, pp. 1–13, 2015, ISSN: 2045-2322. |
2014 |
Strategies for exploiting independent cloud implementations of biometric experts in multibiometric scenarios Journal Article In: Mathematical Problems in Engineering, vol. 2014, 2014. |
Beyond parametric score normalisation in biometric verification systems Journal Article In: IET Biometrics, vol. 3, no. 2, pp. 62–74, 2014. |
A case study on multi-modal biometrics in the cloud Journal Article In: Electrotechnical Review, vol. 81, no. 3, pp. 74, 2014. |
A Feasibility Study on the Use of Binary Keypoint Descriptors for 3D Face Recognition Proceedings Article In: Proceedings of the Mexican Conference on Pattern Recognition (MCPR), pp. 142–151, Springer, 2014. |
SIFT vs. FREAK: Assessing the usefulness of two keypoint descriptors for 3D face verification Proceedings Article In: 37th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), pp. 1336–1341, MIPRO, Opatija, Croatia, 2014. |
An experimental tattoo de-identification system for privacy protection in still images Proceedings Article In: 37th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), pp. 1288–1293, MIPRO, IEEE, 2014. |
Incorporating Duration Information into I-Vector-Based Speaker-Recognition Systems Proceedings Article In: Proceedings of Odyssey: The Speaker and Language Recognition Workshop, pp. 241–248, 2014. |
The IJCB 2014 PaSC video face and person recognition competition Proceedings Article In: Proceedings of the IEEE International Joint Conference on Biometrics (IJCB), pp. 1–8, IEEE, 2014. |
2013 |
Robust 3D face recognition using adapted statistical models Proceedings Article In: Proceedings of the Electrotechnical and Computer Science Conference (ERK'13), 2013. |
Zlivanje informacij za zanesljivo in robustno razpoznavanje obrazov Journal Article In: Electrotechnical Review, vol. 80, no. 3, pp. 1-12, 2013. |
Exploiting representation plurality for robust and efficient face recognition Proceedings Article In: Proceedings of the 22nd International Electrotechnical and Computer Science Conference (ERK'13), pp. 121–124, Portorož, Slovenia, 2013. |
Combining 3D face representations using region covariance descriptors and statistical models Proceedings Article In: Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition and Workshops (IEEE FG), Workshop on 3D Face Biometrics, IEEE, Shanghai, China, 2013. |
Towards efficient multi-modal emotion recognition Journal Article In: International Journal of Advanced Robotic Systems, vol. 10, no. 53, 2013. |
Building cloud-based biometric services Journal Article In: Informatica, vol. 37, no. 2, pp. 115, 2013. |
Smart surveillance technologies in border control Journal Article In: European Journal of Law and Technology, vol. 4, no. 2, 2013. |
Patch-wise low-dimensional probabilistic linear discriminant analysis for Face Recognition Proceedings Article In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2352–2356, IEEE, 2013. |
The 2013 face recognition evaluation in mobile environment Proceedings Article In: Proceedings of the IAPR International Conference on Biometrics (ICB), pp. 1–7, IAPR, 2013. |
Recently Deglaciated High-Altitude Soils of the Himalaya: Diverse Environments, Heterogenous Bacterial Communities and Long-Range Dust Inputs from the Upper Troposphere Journal Article In: PLOS ONE, vol. 8, no. 9, pp. 1-10, 2013. |
Towards commoditized smart-camera design Journal Article In: Journal of Systems Architecture, vol. 59, no. 10, Part A, pp. 847-858, 2013, ISSN: 1383-7621, (Smart Camera Architecture). |
2012 |
Robust 3D Face Recognition Journal Article In: Electrotechnical Review, vol. 79, no. 1-2, pp. 1-6, 2012. |
Towards robust 3D face verification using Gaussian mixture models Journal Article In: International Journal of Advanced Robotic Systems, vol. 9, 2012. |
Face recognition using simplified probabilistic linear discriminant analysis Journal Article In: International Journal of Advanced Robotic Systems, vol. 9, 2012. |
2011 |
Photometric normalization techniques for illumination invariance Book Section In: Zhang, Yu-Jin (Ed.): Advances in Face Image Analysis: Techniques and Technologies, pp. 279-300, IGI-Global, 2011. |
Principal directions of synthetic exact filters for robust real-time eye localization Proceedings Article In: Proceedings of the COST workshop on Biometrics and Identity Management (BioID), pp. 180-192, Springer-Verlag, Berlin, Heidelberg, 2011. |
2010 |
Face recognition from color images using sparse projection analysis Proceedings Article In: Proceedings of the 7th International Conference on Image Analysis and Recognition (ICIAR 2010), pp. 445-453, Povoa de Varzim, Portugal, 2010. |
Adaptation of SIFT Features for Robust Face Recognition Proceedings Article In: Proceedings of the 7th International Conference on Image Analysis and Recognition (ICIAR 2010), pp. 394-404, Povoa de Varzim, Portugal, 2010. |
Removing Illumination Artifacts from Face Images using the Nuisance Attribute Projection Proceedings Article In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'10), pp. 846-849, IEEE, Dallas, Texas, USA, 2010. |
In: Oravec, Milos (Ed.): Face Recognition, pp. 215-238, In-Tech, Vienna, 2010. |
The Complete Gabor-Fisher Classifier for Robust Face Recognition Journal Article In: EURASIP Advances in Signal Processing, vol. 2010, pp. 26, 2010. |
An evaluation of video-to-video face verification Journal Article In: IEEE Transactions on Information Forensics and Security, vol. 5, no. 4, pp. 781–801, 2010. |
Confidence Weighted Subspace Projection Techniques for Robust Face Recognition in the Presence of Partial Occlusions Proceedings Article In: Proceedings of the International Conference on Pattern Recognition (ICPR'10), pp. 1334-1338, Istanbul, Turkey, 2010. |
Multi-modal Emotion Recognition using Canonical Correlations and Acoustic Features Proceedings Article In: Proceedings of the International Conference on Pattern Recognition (ICPR), pp. 4133-4136, IAPR, Istanbul, Turkey, 2010. |
Multi-modal Emotion Recognition based on the Decoupling of Emotion and Speaker Information Proceedings Article In: Proceedings of Text, Speech and Dialogue (TSD), pp. 275-282, Springer-Verlag, Berlin, Heidelberg, 2010. |
Eye Localization using correlation filters Proceedings Article In: Proceedings of the International Conference DOGS, pp. 188-191, Novi Sad, Serbia, 2010. |
DNA encoding for an efficient 'Omics processing Journal Article In: Computer Methods and Programs in Biomedicine, vol. 100, no. 2, pp. 175-190, 2010, ISSN: 0169-2607. |