
Comparison of Wavelet Based Feature Extraction Methods for Speech/Music Discrimination

Year 2011, Volume: 11 Issue: 1, 1355 - 1362, 28.03.2012

Abstract

Speech/music discrimination systems have been gaining importance in intelligent audio retrieval algorithms as the volume of multimedia sources in daily life grows. This study proposes a speech/music discrimination system that exploits the advantages of the wavelet transform. The performance of the discrete wavelet transform and the dual-tree wavelet transform is also compared with that of the conventional time, frequency and cepstral domain features used in speech/music discrimination. Speech and music samples collected from common databases, CD recordings and internet radio stations were classified by artificial neural networks using different feature sets. Principal component analysis was applied to eliminate correlated features before the classification stage. Considering the number of vanishing moments and orthogonality, the best performance among the members of the Daubechies family was obtained with the Daubechies8 wavelet. According to the results, the proposed feature set outperforms the traditional ones.
Keywords: Speech/music discrimination, Discrete wavelet transform, Dual-tree wavelet transform, Daubechies mother wavelet.
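
As a rough illustration of the wavelet feature extraction step described in the abstract, the sketch below computes per-level subband energies from a multilevel discrete wavelet transform. It is not the authors' implementation: for brevity it uses the Haar wavelet rather than Daubechies8, the mean-energy feature is an assumed choice, and the function names `haar_dwt_step` and `subband_energy_features` are hypothetical.

```python
import math


def haar_dwt_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail).

    Assumption: Haar is used here only to keep the sketch short; the paper
    reports its best results with the Daubechies8 wavelet.
    """
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def subband_energy_features(signal, levels):
    """Mean energy of the detail band at each level, then of the final approximation.

    The signal length is assumed to be divisible by 2**levels.
    """
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        features.append(sum(d * d for d in detail) / len(detail))
    features.append(sum(a * a for a in approx) / len(approx))
    return features


# A constant frame has all its energy in the coarsest approximation band,
# while a sample-by-sample alternating frame has it in the finest detail band.
smooth_frame = [1.0] * 8
rough_frame = [1.0, -1.0] * 4
smooth_features = subband_energy_features(smooth_frame, levels=3)
rough_features = subband_energy_features(rough_frame, levels=3)
```

In a full system along the lines of the abstract, feature vectors like these would be decorrelated with principal component analysis and then passed to a neural network classifier; those stages are omitted here.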

Nalan Özkurt received her B.S., M.S. and Ph.D. degrees in Electrical Engineering from Dokuz Eylul University in 1994, 1998 and 2004, respectively. She is currently an assistant professor in the Department of Electrical Engineering at Yaşar University. Her research interests are wavelets, nonlinear static and dynamical systems, and chaos. She is a member of the Association of Electrical and Electronic Engineers of Turkey.

Timur Düzenli received his B.S. in 2007 and his M.S. in 2010, both in Electronics Engineering, from Dokuz Eylul University. He is currently a Ph.D. student in the same department. His research interests are wavelets, time-frequency analysis, and digital communication systems.


Details

Primary Language English
Journal Section Articles
Authors

Timur Düzenli

Nalan Özkurt

Publication Date March 28, 2012
Published in Issue Year 2011 Volume: 11 Issue: 1

Cite

APA Düzenli, T., & Özkurt, N. (2012). Comparison OF Wavelet Based Feature Extraction Methods for Speech/Music Discrimination. IU-Journal of Electrical & Electronics Engineering, 11(1), 1355-1362.
AMA Düzenli T, Özkurt N. Comparison OF Wavelet Based Feature Extraction Methods for Speech/Music Discrimination. IU-Journal of Electrical & Electronics Engineering. March 2012;11(1):1355-1362.
Chicago Düzenli, Timur, and Nalan Özkurt. “Comparison OF Wavelet Based Feature Extraction Methods for Speech/Music Discrimination”. IU-Journal of Electrical & Electronics Engineering 11, no. 1 (March 2012): 1355-62.
EndNote Düzenli T, Özkurt N (March 1, 2012) Comparison OF Wavelet Based Feature Extraction Methods for Speech/Music Discrimination. IU-Journal of Electrical & Electronics Engineering 11 1 1355–1362.
IEEE T. Düzenli and N. Özkurt, “Comparison OF Wavelet Based Feature Extraction Methods for Speech/Music Discrimination”, IU-Journal of Electrical & Electronics Engineering, vol. 11, no. 1, pp. 1355–1362, 2012.
ISNAD Düzenli, Timur - Özkurt, Nalan. “Comparison OF Wavelet Based Feature Extraction Methods for Speech/Music Discrimination”. IU-Journal of Electrical & Electronics Engineering 11/1 (March 2012), 1355-1362.
JAMA Düzenli T, Özkurt N. Comparison OF Wavelet Based Feature Extraction Methods for Speech/Music Discrimination. IU-Journal of Electrical & Electronics Engineering. 2012;11:1355–1362.
MLA Düzenli, Timur and Nalan Özkurt. “Comparison OF Wavelet Based Feature Extraction Methods for Speech/Music Discrimination”. IU-Journal of Electrical & Electronics Engineering, vol. 11, no. 1, 2012, pp. 1355-62.
Vancouver Düzenli T, Özkurt N. Comparison OF Wavelet Based Feature Extraction Methods for Speech/Music Discrimination. IU-Journal of Electrical & Electronics Engineering. 2012;11(1):1355-62.