Emotion Recognition System from Speech using Convolutional Neural Networks

Metehan Aydin; Bülent Tuğrul; Yilmaz Ar

doi:10.53070/bbd.1174033

Research Article

Year 2022, Volume: IDAP-2022: International Artificial Intelligence and Data Processing Symposium, 137-143, October 10, 2022

Cited By: 1

Abstract

Emotions can directly affect human behavior, so people often want to know the emotional state of those they interact with. Information about emotional state can be used in many areas to improve efficiency. Recognizing emotions is a challenging task that requires a broad pipeline, from data acquisition to classification. Today, many researchers work on recognizing emotions using different techniques, including text analysis, body movement analysis, facial expressions, and voice. In this work, we propose an approach to this problem that uses the human voice and performs classification with a convolutional neural network. The paper explains in detail how our recognition pipeline is built and how it works.

Keywords

emotion recognition, voice recognition, speech recognition, convolutional neural networks

Project Number

None

Turkish title: Evrişimsel Sinir Ağları ile Konuşmadan Duygu Tanıma Sistemi

Supporting Institution

None

Details

Primary Language	English
Subjects	Artificial Intelligence
Section	PAPERS
Authors	Metehan Aydin 0000-0001-6189-2661 Bülent Tuğrul 0000-0003-4719-4298 Yilmaz Ar 0000-0003-2370-357X
Project Number	None
Publication Date	October 10, 2022
Submission Date	September 12, 2022
Acceptance Date	September 16, 2022
Published in Issue	Year 2022 Volume: IDAP-2022: International Artificial Intelligence and Data Processing Symposium

Cite

APA	Aydin, M., Tuğrul, B., & Ar, Y. (2022). Emotion Recognition System from Speech using Convolutional Neural Networks. Computer Science, IDAP-2022 : International Artificial Intelligence and Data Processing Symposium, 137-143. https://doi.org/10.53070/bbd.1174033

Cited By

Konuşma Duygu Tanıma Uygulamalarında Hiper Parametre Optimizasyonu ile Derin Öğrenme Metotlarının Geliştirilmesi

Karadeniz Fen Bilimleri Dergisi

https://doi.org/10.31466/kfbd.1508578