Publications

2026

Gan, Chenquan; Zhou, Daitao; Zhu, Qingyi; Wang, Xibin; Jain, Deepak Kumar; Štruc, Vitomir

Improving Emotion Recognition from Ambiguous Speech via Spatio-Temporal Spectrum Analysis and Real-Time Soft-Label Correction Journal Article

In: IEEE Transactions on Affective Computing, pp. 1-16, 2026.

Tags: deep learning, emotion recognition, speech, speech processing

2025

Gan, Chenquan; Zhou, Daitao; Wang, Kexin; Zhu, Qingyi; Jain, Deepak Kumar; Štruc, Vitomir

Optimizing ambiguous speech emotion recognition through spatial–temporal parallel network with label correction strategy Journal Article

In: Computer Vision and Image Understanding, vol. 260, no. 104483, pp. 1–14, 2025.

Tags: deep learning, emotion recognition, speech, speech processing, speech technologies

2009

Gajšek, Rok; Štruc, Vitomir; Dobrišek, Simon; Mihelič, France

Emotion recognition using linear transformations in combination with video Proceedings Article

In: Speech and intelligence: proceedings of Interspeech 2009, pp. 1967-1970, Brighton, UK, 2009.

Tags: emotion recognition, facial expression recognition, interspeech, speech, speech technologies, spontaneous emotions

Gajšek, Rok; Štruc, Vitomir; Mihelič, France; Podlesek, Anja; Komidar, Luka; Sočan, Gregor; Bajec, Boštjan

Multi-modal emotional database: AvID Journal Article

In: Informatica (Ljubljana), vol. 33, no. 1, pp. 101-106, 2009.

Tags: avid, database, dataset, emotion recognition, facial expression recognition, speech, speech technologies, spontaneous emotions

Gajšek, Rok; Štruc, Vitomir; Vesnicer, Boštjan; Podlesek, Anja; Komidar, Luka; Mihelič, France

Analysis and assessment of AvID: multi-modal emotional database Proceedings Article

In: Text, speech and dialogue / 12th International Conference, pp. 266-273, Springer-Verlag, Berlin, Heidelberg, 2009.

Tags: avid database, database, emotion recognition, multimodal database, speech, speech technologies
