MMEmAsis: multimodal emotion and sentiment analysis
- Authors: Kiselev G.A. (1,2), Lubysheva Y.M. (1), Weizenfeld D.A. (1,2)
- Affiliations:
  1. RUDN University
  2. Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences
- Issue: Vol 32, No 4 (2024)
- Pages: 370-379
- Section: Computer Science
- URL: https://journals.rcsi.science/2658-4670/article/view/315407
- DOI: https://doi.org/10.22363/2658-4670-2024-32-4-370-379
- EDN: https://elibrary.ru/EPGKRU
- ID: 315407
Abstract
The paper presents a new multimodal approach to analyzing a person’s psycho-emotional state using nonlinear classifiers. The main modalities are the subject’s speech and video of facial expressions. Speech is digitized and transcribed with the Scribe library, after which mood cues are extracted with the Titanis sentiment analyzer from the FRC CSC RAS. For visual analysis, two different approaches were implemented: a pre-trained ResNet model for direct sentiment classification from facial expressions, and a deep learning model that couples ResNet with a graph-based deep neural network for facial recognition. Both approaches faced challenges from environmental factors that affect the stability of results. The second approach proved more flexible thanks to adjustable classification vocabularies, which facilitated post-deployment calibration. Integrating text and visual data significantly improved the accuracy and reliability of the analysis of a person’s psycho-emotional state.
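The abstract outlines how textual and visual cues are combined but does not specify the fusion rule. As a minimal illustration only, the sketch below assumes simple weighted late fusion of per-modality class probabilities; the label set, the weight, and the fuse_sentiment helper are hypothetical stand-ins for the outputs of the Scribe/Titanis text branch and the ResNet-based visual branch, not the paper’s actual implementation or the real APIs of those libraries.

```python
# Hypothetical late-fusion sketch (not the authors' code): combine
# per-modality sentiment probabilities by a weighted average and pick
# the label with the highest fused score.
import numpy as np

LABELS = ["negative", "neutral", "positive"]  # assumed label set

def fuse_sentiment(text_probs: np.ndarray,
                   face_probs: np.ndarray,
                   w_text: float = 0.5) -> str:
    """Weighted late fusion of two modality-level probability vectors."""
    fused = w_text * text_probs + (1.0 - w_text) * face_probs
    return LABELS[int(np.argmax(fused))]

# Stub inputs: in a real pipeline, text_probs would come from a sentiment
# analyzer applied to the transcript, and face_probs from frame-level
# facial-expression scores averaged over the clip.
text_probs = np.array([0.10, 0.20, 0.70])  # transcript leans positive
face_probs = np.array([0.25, 0.50, 0.25])  # facial cues are ambiguous
print(fuse_sentiment(text_probs, face_probs))  # -> "positive"
```

Late fusion of this kind keeps the modalities independent, so a noisy modality can be down-weighted after deployment — consistent with the abstract’s emphasis on adjustable vocabularies and post-deployment calibration.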
About the authors
Gleb A. Kiselev
RUDN University; Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences
Email: kiselev@isa.ru
ORCID iD: 0000-0001-9231-8662
Scopus Author ID: 57195683637
ResearcherId: Y-6971-2018
Candidate of Technical Sciences, Senior Lecturer at the Department of Mathematical Modeling and Artificial Intelligence, RUDN University; Researcher at the Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences
6 Miklukho-Maklaya St, Moscow, 117198, Russian Federation; 44 Vavilova St, bldg 2, Moscow, 119333, Russian Federation
Yaroslava M. Lubysheva
RUDN University
Email: gorbunova_y_m@mail.ru
ORCID iD: 0000-0001-6280-6040
Master’s degree student at the Department of Mathematical Modeling and Artificial Intelligence
6 Miklukho-Maklaya St, Moscow, 117198, Russian Federation
Daniil A. Weizenfeld
RUDN University; Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences
Author for correspondence.
Email: veicenfeld@isa.ru
ORCID iD: 0000-0002-2787-0714
Master’s degree student at the Department of Mechanics and Control Processes
6 Miklukho-Maklaya St, Moscow, 117198, Russian Federation; 44 Vavilova St, bldg 2, Moscow, 119333, Russian Federation
References
- Piana, S., Staglianò, A., Odone, F., Verri, A. & Camurri, A. Real-time Automatic Emotion Recognition from Body Gestures. doi: 10.48550/arXiv.1402.5047 (2014).
- Hu, G., Lin, T., Zhao, Y., Lu, G., Wu, Y. & Li, Y. UniMSE: Towards Unified Multimodal Sentiment Analysis and Emotion Recognition. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. doi: 10.48550/arXiv.2211.11256 (2022).
- Zhao, J., Zhang, T., Hu, J., Liu, Y., Jin, Q., Wang, X. & Li, H. M3ED: Multi-modal Multi-scene Multi-label Emotional Dialogue Database in Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (Association for Computational Linguistics, Dublin, Ireland, May 2022), 5699-5710. doi: 10.18653/v1/2022.acl-long.391.
- Poria, S., Hazarika, D., Majumder, N., Naik, G., Cambria, E. & Mihalcea, R. MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversations. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. doi: 10.48550/arXiv.1810.02508 (2018).
- Ekman, P. Emotion: common characteristics and individual differences. Lecture presented at the 8th World Congress of I.O.P., Tampere, Finland (1996).
- Levenson, R. W. The intrapersonal functions of emotion. Cognition & Emotion 13, 481-504 (1999).
- Keltner, D. & Gross, J. Functional accounts of emotions. Cognition & Emotion 13, 467-480 (1999).
- Ahmed, F., Bari, A. S. M. H. & Gavrilova, M. Emotion Recognition From Body Movement. IEEE Access. doi: 10.1109/ACCESS.2019.2963113 (2019).
- Zadeh, A., Liang, P., Poria, S., Cambria, E. & Morency, L.-P. Multimodal Language Analysis in the Wild: CMU-MOSEI Dataset and Interpretable Dynamic Fusion Graph in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (July 2018), 2236-2246. doi: 10.18653/v1/P18-1208.
- Busso, C., Bulut, M., Lee, C.-C. et al. IEMOCAP: interactive emotional dyadic motion capture database. Language Resources and Evaluation 42, 335-359. doi: 10.1007/s10579-008-9076-6 (2008).
- Kossaifi, J. et al. SEWA DB: A Rich Database for Audio-Visual Emotion and Sentiment Research in the Wild. IEEE Transactions on Pattern Analysis and Machine Intelligence 43, 1022-1040. doi: 10.1109/TPAMI.2019.2944808 (2021).
- O’Reilly, H., Pigat, D., Fridenson, S., Berggren, S., Tal, S., Golan, O., Bölte, S., Baron-Cohen, S. & Lundqvist, D. The EU-Emotion Stimulus Set: A validation study. Behav Res Methods 48, 567-576. doi: 10.3758/s13428-015-0601-4 (2016).
- Soleymani, M., Lichtenauer, J., Pun, T. & Pantic, M. A Multimodal Database for Affect Recognition and Implicit Tagging. IEEE Transactions on Affective Computing 3, 42-55. doi: 10.1109/T-AFFC.2011.25 (2012).
- Chou, H. C., Lin, W. C., Chang, L. C., Li, C. C., Ma, H. P. & Lee, C. C. NNIME: The NTHU-NTUA Chinese interactive multimodal emotion corpus in 2017 Seventh International Conference on Affective Computing and Intelligent Interaction (ACII) (2017), 292-298. doi: 10.1109/ACII.2017.8273615.
- Ringeval, F., Sonderegger, A., Sauer, J. & Lalanne, D. Introducing the RECOLA multimodal corpus of remote collaborative and affective interactions in 2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG) (2013), 1-8. doi: 10.1109/FG.2013.6553805.
- Reznikova, J. I. Intelligence and language in animals and humans 253 pp. (Yurayt, 2016).
- Samokhvalov, V. P., Kornetov, A. N., Korobov, A. A. & Kornetov, N. A. Ethology in psychiatry 217 pp. (Health, 1990).
- Gullett, N., Zajkowska, Z., Walsh, A., Harper, R. & Mondelli, V. Heart rate variability (HRV) as a way to understand associations between the autonomic nervous system (ANS) and affective states: A critical review of the literature. International Journal of Psychophysiology 192, 35-42. doi: 10.1016/j.ijpsycho.2023.08.001 (2023).
- Bondarenko, I. Pisets: A Python library and service for automatic speech recognition and transcribing in Russian and English https://github.com/bond005/pisets.
- Savchenko, A. V. Facial expression and attributes recognition based on multi-task learning of lightweight neural networks in 2021 IEEE 19th International Symposium on Intelligent Systems and Informatics (SISY) (2021), 119-124.
- Luo, C., Song, S., Xie, W., Shen, L. & Gunes, H. Learning multi-dimensional edge feature-based AU relation graph for facial action unit recognition. arXiv preprint arXiv:2205.01782 (2022).
- Gajarsky, T. Facetorch: A Python library for analysing faces using PyTorch https://github.com/tomasgajarsky/facetorch.
- Deng, J., Guo, J., Ververas, E., Kotsia, I. & Zafeiriou, S. Retinaface: Single-shot multi-level face localisation in the wild in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (2020), 5203-5212.