Comparison of Different Normalization Techniques on Speakers’ Gender Detection

Serhat İleri; Armağan Karabina; Erdal Kılıç

doi:10.31200/makuubd.410625

Conference Paper

Konuşmacı Cinsiyetinin Tespitinde Değişik Normalizasyon Tekniklerinin Kıyaslanması

Year 2018, 1 - 12, 30.09.2018

Cited By: 2

Abstract

This study investigates the effect of Short-time Mean and Variance
Normalization (STMVN), Short-time Cepstral Mean and Scale Normalization
(STMSN), Min-Max Normalization, Z-Score Normalization, and Standard Deviation
Normalization on classification performance in speaker gender detection.
Voice recordings of 192 male and 192 female speakers from the TIMIT data set
were used. Features were extracted from the recordings with the Mel Frequency
Cepstral Coefficient (MFCC) technique, their dimensionality was reduced with
Principal Component Analysis (PCA), and they were then normalized with the
different techniques. A Support Vector Machine (SVM) was used as the classifier.
The highest performance in speaker gender prediction, 98.18%, was obtained from
features normalized with the Standard Deviation Normalization technique, while
the other techniques were observed to lower performance.
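
As a rough illustration of the three global techniques named in the abstract, the following Python sketch normalizes a single feature dimension. The function names are hypothetical, and the definition used for Standard Deviation Normalization (dividing by the standard deviation without removing the mean) is an assumption, since the abstract does not state the formulas.

```python
from statistics import mean, pstdev


def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def z_score(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def std_dev_norm(values):
    """Assumed definition: divide by the standard deviation, keep the mean offset."""
    sigma = pstdev(values)
    if sigma == 0:
        return list(values)
    return [v / sigma for v in values]


if __name__ == "__main__":
    feature = [2.0, 4.0, 6.0, 8.0]  # made-up values for one feature dimension
    print(min_max(feature))
    print(z_score(feature))
    print(std_dev_norm(feature))
```

Under these assumed definitions, Z-Score and Standard Deviation Normalization differ only in whether the mean is subtracted before scaling.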

Keywords

Min-Max Normalization, Z-Score Normalization, Standard Deviation Normalization, Short-time Mean and Variance Normalization, Short-time Cepstral Mean and Scale Normalization

Comparison of Different Normalization Techniques on Speakers’ Gender Detection

Abstract

In this study, the effect of Short-time Mean and Variance Normalization
(STMVN), Short-time Cepstral Mean and Scale Normalization (STMSN), Min-Max
Normalization, Z-Score Normalization and Standard Deviation Normalization
techniques on classification performance was investigated for determining
speakers’ gender. Voice records belonging to 192 male and 192 female speakers
from the TIMIT data set were used as the data set. Features were extracted
from the voice records with the Mel Frequency Cepstral Coefficients (MFCC)
technique, the dimension of the extracted features was reduced with Principal
Component Analysis (PCA), and the features were then normalized with the
different techniques. A Support Vector Machine (SVM) was used as the
classifier. The highest accuracy in speakers’ gender estimation, 98.18%, was
obtained from features normalized with the Standard Deviation Normalization
technique, and the other normalization techniques reduced accuracy.
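
The two short-time techniques can be sketched in the same spirit. This Python example assumes the sliding-window definitions commonly given for STMVN and STMSN: for each frame, subtract the mean of a window of neighbouring frames, then divide by that window's standard deviation (STMVN) or by its max-min range (STMSN). The window length, the edge handling, and all function names are illustrative assumptions, not details taken from the paper.

```python
from statistics import mean, pstdev


def _window(track, m, half):
    """Frames within `half` positions of frame m, truncated at the edges."""
    return track[max(0, m - half): m + half + 1]


def stmvn(track, win_len=5):
    """Short-time mean and variance normalization of one coefficient track."""
    half = win_len // 2
    out = []
    for m, c in enumerate(track):
        w = _window(track, m, half)
        sigma = pstdev(w)
        out.append((c - mean(w)) / sigma if sigma > 0 else 0.0)
    return out


def stmsn(track, win_len=5):
    """Short-time mean and scale normalization (scale = max - min of window)."""
    half = win_len // 2
    out = []
    for m, c in enumerate(track):
        w = _window(track, m, half)
        scale = max(w) - min(w)
        out.append((c - mean(w)) / scale if scale > 0 else 0.0)
    return out


if __name__ == "__main__":
    # Made-up values of one cepstral coefficient across five frames.
    track = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(stmvn(track, win_len=3))
    print(stmsn(track, win_len=3))
```

Unlike the global techniques, the statistics here are recomputed for every frame, so slow drifts in a coefficient are removed along with its overall offset.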

Keywords

Min-Max Normalization, Z-Score Normalization, Standard Deviation Normalization, Short-time Mean and Variance Normalization, Short-time Cepstral Mean and Scale Normalization

Details

Primary Language	English
Subjects	Engineering
Section	Articles
Authors	Serhat İleri 0000-0002-0259-0791 Armağan Karabina Erdal Kılıç
Publication Date	30 September 2018
Acceptance Date	12 April 2018
Published in Issue	Year 2018

Cite

APA	İleri, S., Karabina, A., & Kılıç, E. (2018). Comparison of Different Normalization Techniques on Speakers’ Gender Detection. Mehmet Akif Ersoy Üniversitesi Uygulamalı Bilimler Dergisi, 2(2), 1-12. https://doi.org/10.31200/makuubd.410625

Mehmet Akif Ersoy Üniversitesi Uygulamalı Bilimler Dergisi

Cited By

Stock Price Prediction Using Long-Short-Term Memory Network

Mehmet Akif Ersoy Üniversitesi Uygulamalı Bilimler Dergisi

https://doi.org/10.31200/makuubd.1164099

Gender Determination in Human Voice Signals using Synaptic Efficacy Function-based Leaky Integrate and Fire Neuron Model

Bitlis Eren Üniversitesi Fen Bilimleri Dergisi

https://doi.org/10.17798/bitlisfen.1024236