A Turkish Broadcast News Speech Database for Investigation the Effect of Deep Neural Network and Long Short Term Memory Hyperparameters on Speech Recognition Based Systems

Serhat Ok; Zekeriya Tüfekci

doi:10.31590/ejosat.900422

Research Article

Year 2021, Issue: 24, pp. 87-92, 15.04.2021

Cited By: 1

Abstract

Speech recognition is the conversion of spoken words and sentences into text. It has been studied extensively in many countries in recent years. However, few studies address speech recognition applications in Turkey, and one reason is the lack of speech datasets. In this study, a Turkish speech database was developed for Turkish speech recognition systems. The recordings were obtained from news programs broadcast by Turkish television news channels at different times. The dataset has been made publicly available on the web to serve as a reference for other studies. Additionally, the effects of two hyperparameters, the number of layers and the number of cells, on Long Short Term Memory (LSTM) and Deep Neural Network (DNN) models were investigated and compared on the Turkish Broadcast News Speech Database.

Keywords

Speech Recognition, Long Short Term Memory, Deep Neural Networks, Turkish Speech Recognition Database

Turkish title: Derin Sinir Ağları ve Uzun Kısa Süreli Bellek Hiperparametrelerinin Konuşma Tanıma Tabanlı Sistemler Üzerindeki Etkisinin İncelenmesi için Türkçe Yayın Haberleri Konuşma Veri Tabanı

Details

Primary Language	English
Subjects	Engineering
Section	Articles
Authors	Serhat Ok (ORCID 0000-0002-9764-2952), Zekeriya Tüfekci (ORCID 0000-0001-7835-2741)
Publication Date	15 April 2021
Published in Issue	Year 2021, Issue: 24

Cite

APA	Ok, S., & Tüfekci, Z. (2021). A Turkish Broadcast News Speech Database for Investigation the Effect of Deep Neural Network and Long Short Term Memory Hyperparameters on Speech Recognition Based Systems. Avrupa Bilim ve Teknoloji Dergisi, (24), 87-92. https://doi.org/10.31590/ejosat.900422

Avrupa Bilim ve Teknoloji Dergisi

Cited By

VPSA-Based Transfer Function Identification of Single DoF Copter System

International Journal of Aviation Science and Technology

https://doi.org/10.23890/IJAST.vm04is02.0204