2009
Gajšek, Rok; Štruc, Vitomir; Dobrišek, Simon; Mihelič, France: Emotion recognition using linear transformations in combination with video. In: Speech and Intelligence: Proceedings of Interspeech 2009, pp. 1967-1970, Brighton, UK, 2009.
Abstract: The paper discusses the use of linear transformations of Hidden Markov Models, normally employed for speaker and environment adaptation, as a way of extracting the emotional components from speech. A constrained version of the Maximum Likelihood Linear Regression (CMLLR) transformation is used as a feature for classifying a normal or aroused emotional state. We present a procedure for incrementally building a set of speaker-independent acoustic models that are used to estimate the CMLLR transformations for emotion classification. An audio-video database of spontaneous emotions (AvID) is briefly presented, since it forms the basis for the evaluation of the proposed method. Emotion classification using the video part of the database is also described, and the added value of combining the visual information with the audio features is shown.
Tags: emotion recognition, facial expression recognition, interspeech, speech, speech technologies, spontaneous emotions
Gajšek, Rok; Štruc, Vitomir; Mihelič, France; Podlesek, Anja; Komidar, Luka; Sočan, Gregor; Bajec, Boštjan: Multi-modal emotional database: AvID. In: Informatica (Ljubljana), vol. 33, no. 1, pp. 101-106, 2009.
Abstract: This paper presents our work on recording a multi-modal database containing emotional audio and video recordings. In designing the recording strategies, special attention was paid to gathering data involving spontaneous emotions, and therefore to obtaining more realistic training and testing conditions for experiments. Different levels of arousal were induced with specially planned scenarios, including playing computer games and taking an adaptive intelligence test. This will enable us both to detect different emotional states and to experiment with speaker identification/verification of the people involved. So far, the multi-modal database has been recorded and a basic evaluation of the data has been carried out.
Tags: avid, database, dataset, emotion recognition, facial expression recognition, speech, speech technologies, spontaneous emotions
Gajšek, Rok; Štruc, Vitomir; Vesnicer, Boštjan; Podlesek, Anja; Komidar, Luka; Mihelič, France: Analysis and assessment of AvID: multi-modal emotional database. In: Text, Speech and Dialogue / 12th International Conference, pp. 266-273, Springer-Verlag, Berlin, Heidelberg, 2009.
Abstract: The paper deals with the recording and the evaluation of a multi-modal (audio/video) database of spontaneous emotions. First, the motivation for this work is given and the different recording strategies used are described. Special attention is given to the process of evaluating the emotional database. The various kappa statistics normally used to measure agreement between annotators are discussed. Following the problems of standard kappa coefficients when used for emotional database assessment, a new time-weighted free-marginal kappa is presented. It differs from the other kappa statistics in that it weights each utterance's agreement score by the duration of the utterance. The new method is evaluated, and its superiority over the standard kappa when dealing with a database of spontaneous emotions is demonstrated.
Tags: avid database, database, emotion recognition, multimodal database, speech, speech technologies
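The duration-weighting idea in the last abstract can be illustrated with a minimal sketch. This is a hypothetical reconstruction, not the authors' exact formulation: it combines the free-marginal chance-agreement term (1/k for k categories, as in Randolph's multirater kappa) with per-utterance duration weights on the observed-agreement scores.

```python
def weighted_free_marginal_kappa(ratings, durations, num_categories):
    """Duration-weighted free-marginal kappa (illustrative sketch).

    ratings:   list of per-utterance lists of annotator labels,
               e.g. [["aroused", "aroused"], ["neutral", "aroused"]]
    durations: per-utterance durations (e.g. seconds), same length as ratings
    num_categories: number of possible emotion labels (k)
    """
    total_dur = float(sum(durations))
    p_obs = 0.0
    for labels, dur in zip(ratings, durations):
        n = len(labels)
        # Proportion of agreeing annotator pairs for this utterance,
        # weighted by the utterance's share of the total duration.
        agree = sum(labels.count(c) * (labels.count(c) - 1)
                    for c in set(labels)) / (n * (n - 1))
        p_obs += (dur / total_dur) * agree
    p_exp = 1.0 / num_categories  # free-marginal chance agreement
    return (p_obs - p_exp) / (1.0 - p_exp)
```

With equal durations this reduces to the ordinary free-marginal kappa; longer utterances otherwise contribute proportionally more to the observed-agreement term, which is the weighting behaviour the abstract describes.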