Detection of Negative Calls in Call Centers with Convolutional Neural Networks
Year 2023, Volume: 16 Issue: 1, 13 - 19, 31.01.2023
Ali Fatih Karataş, Öykü Berfin Mercan, Umut Özdil, Şükrü Ozan
Abstract
This study focuses on automatically classifying telephone conversations between call center employees and customers as positive or negative. The dataset consists of telephone calls made within the company. It contains 10,411 three-second audio recordings, of which 5,408 are positive and 5,003 are negative recordings containing arguments, anger, or insults. To obtain features that are meaningful for emotion recognition, Mel-frequency cepstral coefficient (MFCC) features were extracted from each recording. The proposed convolutional neural network (CNN) architecture was trained on these MFCC features to classify call center recordings as positive or negative. The proposed CNN model reached 86.1% training accuracy and 77.3% validation accuracy, and achieved 69.4% classification accuracy on the test data. The aim of the work is to increase customer satisfaction by analyzing call center conversations automatically and reporting negative cases to quality managers so that the necessary measures can be taken.
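The MFCC extraction step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the sample rate, frame length, hop size, number of mel filters, or number of coefficients, so the values below (8 kHz, 25 ms frames, 10 ms hop, 26 filters, 13 coefficients) and all function names are assumptions chosen only for illustration.

```python
import numpy as np


def hz_to_mel(hz):
    """Convert Hz to mel (HTK-style formula)."""
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    # n_mels + 2 edge frequencies, equally spaced on the mel scale.
    top_mel = float(hz_to_mel(sample_rate / 2.0))
    edges_hz = mel_to_hz(np.linspace(0.0, top_mel, n_mels + 2))
    bin_hz = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_mels, bin_hz.size))
    for m in range(n_mels):
        lo, mid, hi = edges_hz[m], edges_hz[m + 1], edges_hz[m + 2]
        rising = (bin_hz - lo) / (mid - lo)
        falling = (hi - bin_hz) / (hi - mid)
        bank[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def dct_matrix(n_out, n_in):
    """First n_out rows of the orthonormal DCT-II basis, shape (n_out, n_in)."""
    k = np.arange(n_out)[:, None]
    n = np.arange(n_in)[None, :]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2.0 * n_in)) * np.sqrt(2.0 / n_in)
    basis[0] /= np.sqrt(2.0)
    return basis


def mfcc(signal, sample_rate=8000, frame_ms=25, hop_ms=10, n_mels=26, n_mfcc=13):
    """Return an (n_frames, n_mfcc) MFCC matrix; all defaults are assumed values."""
    signal = np.asarray(signal, dtype=float)
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_fft = 1 << (frame_len - 1).bit_length()  # next power of two >= frame_len
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    # Slice the signal into overlapping frames and apply a Hamming window.
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # Power spectrum -> mel filterbank energies -> log -> DCT.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    mel_energy = power @ mel_filterbank(n_mels, n_fft, sample_rate).T
    log_mel = np.log(mel_energy + 1e-10)  # small offset avoids log(0)
    return log_mel @ dct_matrix(n_mfcc, n_mels).T


# Synthetic stand-in for one three-second clip (random noise, not call audio).
clip = np.random.default_rng(0).standard_normal(3 * 8000)
features = mfcc(clip)  # one row per frame, one column per coefficient
```

With these assumed settings, a three-second clip yields a 298 x 13 matrix, which is the kind of two-dimensional input a CNN classifier can consume. In practice a library routine would more likely be used than a hand-written version like this.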
Supporting Institution
TÜBİTAK