Publications – Laboratory for Machine Intelligence

Gan, Chenquan; Zheng, Jiahao; Zhu, Qingyi; Jain, Deepak Kumar; Štruc, Vitomir

A graph neural network with context filtering and feature correction for conversational emotion recognition Journal Article

In: Information Sciences, vol. 658, no. 120017, pp. 1-21, 2024.

Tags: context filtering, conversations, dialogue, emotion recognition, graph neural network, sentiment analysis

@article{InformSciences2024,

title = {A graph neural network with context filtering and feature correction for conversational emotion recognition},

author = {Chenquan Gan and Jiahao Zheng and Qingyi Zhu and Deepak Kumar Jain and Vitomir {\v{S}}truc},

url = {https://www.sciencedirect.com/science/article/pii/S002002552301602X?via%3Dihub

https://lmi.fe.uni-lj.si/wp-content/uploads/2023/12/InformationSciences.pdf},

doi = {10.1016/j.ins.2023.120017},

year  = {2024},

date = {2024-02-01},

journal = {Information Sciences},

volume = {658},

number = {120017},

pages = {1-21},

abstract = {Conversational emotion recognition represents an important machine-learning problem with a wide variety of deployment possibilities. The key challenge in this area is how to properly capture the key conversational aspects that facilitate reliable emotion recognition, including utterance semantics, temporal order, informative contextual cues, speaker interactions as well as other relevant factors. In this paper, we present a novel Graph Neural Network approach for conversational emotion recognition at the utterance level. Our method addresses the outlined challenges and represents conversations in the form of graph structures that naturally encode temporal order, speaker dependencies, and even long-distance context. To efficiently capture the semantic content of the conversations, we leverage the zero-shot feature-extraction capabilities of pre-trained large-scale language models and then integrate two key contributions into the graph neural network to ensure competitive recognition results. The first is a novel context filter that establishes meaningful utterance dependencies for the graph construction procedure and removes low-relevance and uninformative utterances from being used as a source of contextual information for the recognition task. The second contribution is a feature-correction procedure that adjusts the information content in the generated feature representations through a gating mechanism to improve their discriminative power and reduce emotion-prediction errors. We conduct extensive experiments on four commonly used conversational datasets, i.e., IEMOCAP, MELD, Dailydialog, and EmoryNLP, to demonstrate the capabilities of the developed graph neural network with context filtering and error-correction capabilities. The results of the experiments point to highly promising performance, especially when compared to state-of-the-art competitors from the literature.},

keywords = {context filtering, conversations, dialogue, emotion recognition, graph neural network, sentiment analysis},

pubstate = {published},

tppubtype = {article}

}


Gan, Chenquan; Yang, Yucheng; Zhu, Qingyi; Jain, Deepak Kumar; Štruc, Vitomir

DHF-Net: A hierarchical feature interactive fusion network for dialogue emotion recognition Journal Article

In: Expert Systems with Applications, vol. 210, 2022.

Tags: attention, CNN, deep learning, dialogue, emotion recognition, fusion, fusion network, nlp, semantics, text, text processing

Dobrišek, Simon; Gajšek, Rok; Mihelič, France; Pavešić, Nikola; Štruc, Vitomir

Towards efficient multi-modal emotion recognition Journal Article

In: International Journal of Advanced Robotic Systems, vol. 10, no. 53, 2013.

Tags: avid database, emotion recognition, facial expression recognition, multi modality, speech technologies

@article{dobrivsek2013towards,

title = {Towards efficient multi-modal emotion recognition},

author = {Simon Dobrišek and Rok Gajšek and France Mihelič and Nikola Pavešić and Vitomir Štruc},

url = {https://lmi.fe.uni-lj.si/en/towardsefficientmulti-modalemotionrecognition/},

doi = {10.5772/54002},

year  = {2013},

date = {2013-01-01},

urldate = {2013-01-01},

journal = {International Journal of Advanced Robotic Systems},

volume = {10},

number = {53},

abstract = {The paper presents a multi-modal emotion recognition system exploiting audio and video (i.e., facial expression) information. The system first processes both sources of information individually to produce corresponding matching scores and then combines the computed matching scores to obtain a classification decision. For the video part of the system, a novel approach to emotion recognition, relying on image-set matching, is developed. The proposed approach avoids the need for detecting and tracking specific facial landmarks throughout the given video sequence, which represents a common source of error in video-based emotion recognition systems, and, therefore, adds robustness to the video processing chain. The audio part of the system, on the other hand, relies on utterance-specific Gaussian Mixture Models (GMMs) adapted from a Universal Background Model (UBM) via the maximum a posteriori probability (MAP) estimation. It improves upon the standard UBM-MAP procedure by exploiting gender information when building the utterance-specific GMMs, thus ensuring enhanced emotion recognition performance. Both the uni-modal parts as well as the combined system are assessed on the challenging multi-modal eNTERFACE'05 corpus with highly encouraging results. The developed system represents a feasible solution to emotion recognition that can easily be integrated into various systems, such as humanoid robots, smart surveillance systems and alike.},

keywords = {avid database, emotion recognition, facial expression recognition, multi modality, speech technologies},

pubstate = {published},

tppubtype = {article}

}


Gajšek, Rok; Štruc, Vitomir; Mihelič, France

Multi-modal Emotion Recognition using Canonical Correlations and Acoustic Features Proceedings Article

In: Proceedings of the International Conference on Pattern Recognition (ICPR), pp. 4133-4136, IAPR, Istanbul, Turkey, 2010.

Tags: acoustic features, canonical correlations, emotion recognition, facial expression recognition, multi modality, speech processing, speech technologies

Gajšek, Rok; Štruc, Vitomir; Mihelič, France

Multi-modal Emotion Recognition based on the Decoupling of Emotion and Speaker Information Proceedings Article

In: Proceedings of Text, Speech and Dialogue (TSD), pp. 275-282, Springer-Verlag, Berlin, Heidelberg, 2010.

Tags: emotion recognition, facial expression recognition, multi modality, speech processing, speech technologies, spontaneous emotions, video processing

Gajšek, Rok; Štruc, Vitomir; Dobrišek, Simon; Mihelič, France

Emotion recognition using linear transformations in combination with video Proceedings Article

In: Speech and intelligence: proceedings of Interspeech 2009, pp. 1967-1970, Brighton, UK, 2009.

Tags: emotion recognition, facial expression recognition, interspeech, speech, speech technologies, spontaneous emotions

Gajšek, Rok; Štruc, Vitomir; Mihelič, France; Podlesek, Anja; Komidar, Luka; Sočan, Gregor; Bajec, Boštjan

Multi-modal emotional database: AvID Journal Article

In: Informatica (Ljubljana), vol. 33, no. 1, pp. 101-106, 2009.

Tags: avid, database, dataset, emotion recognition, facial expression recognition, speech, speech technologies, spontaneous emotions

Gajšek, Rok; Štruc, Vitomir; Dobrišek, Simon; Žibert, Janez; Mihelič, France; Pavešić, Nikola

Combining audio and video for detection of spontaneous emotions Proceedings Article

In: Biometric ID management and multimodal communication, pp. 114-121, Springer-Verlag, Berlin, Heidelberg, 2009.

Tags: emotion recognition, facial expression recognition, performance evaluation, speech processing, speech technologies

Gajšek, Rok; Štruc, Vitomir; Vesnicer, Boštjan; Podlesek, Anja; Komidar, Luka; Mihelič, France

Analysis and assessment of AvID: multi-modal emotional database Proceedings Article

In: Text, speech and dialogue / 12th International Conference, pp. 266-273, Springer-Verlag, Berlin, Heidelberg, 2009.

Tags: avid database, database, emotion recognition, multimodal database, speech, speech technologies

Gajšek, Rok; Podlesek, Anja; Komidar, Luka; Sočan, Gregor; Bajec, Boštjan; Štruc, Vitomir; Bucik, Valentin; Mihelič, France

AvID: audio-video emotional database Proceedings Article

In: Proceedings of the 11th International Multi-conference Information Society (IS'08), pp. 70-74, Ljubljana, Slovenia, 2008.

Tags: database, dataset, emotion recognition, facial expression recognition, multimodal database, speech technology, spontaneous emotions