Year 2017,
Volume: 18 Issue: 3, 584 - 594, 30.09.2017
Hakan Tora, İbrahim Baran Uslu, Timur Karamehmet
IMPLEMENTATION OF TURKISH TEXT-TO-SPEECH SYNTHESIS ON A VOICE SYNTHESIZER CARD WITH PROSODIC FEATURES
Abstract
This study presents a hardware implementation of Turkish text-to-speech
(TTS) synthesis on a voice synthesizer card. A fully functional TTS system,
capable of synthesizing any Turkish text, including abbreviations and
numbers, is designed and implemented. The system is further enriched with
prosodic attributes to produce more intelligible and natural speech. The
rules required for proper pronunciation and stress patterns are defined in a
lexicon used for synthesizing Turkish speech. The performance of the
developed system is assessed with a Mean Opinion Score (MOS) test, in which
it achieves an average score of 3.8 out of 5. This result indicates that the
proposed synthesizer can be integrated into many practical Turkish TTS
applications.
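As an illustration of the number handling mentioned in the abstract, the following is a minimal sketch of how digits in Turkish text might be expanded into words before synthesis. It is not the authors' implementation: the function names (`number_to_turkish`, `expand_numbers`) and the supported range (0 to 999,999) are assumptions made for this example, and the paper's own normalization rules are not reproduced here.

```python
import re

# Turkish number words; index 0 is empty so that zero digits add nothing.
ONES = ["", "bir", "iki", "üç", "dört", "beş", "altı", "yedi", "sekiz", "dokuz"]
TENS = ["", "on", "yirmi", "otuz", "kırk", "elli", "altmış", "yetmiş", "seksen", "doksan"]


def _below_thousand(n):
    """Return the list of words for 0 <= n < 1000 (empty list for 0)."""
    words = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        # 100 is "yüz", not "bir yüz"; 200 is "iki yüz".
        if hundreds > 1:
            words.append(ONES[hundreds])
        words.append("yüz")
    tens, ones = divmod(rest, 10)
    if tens:
        words.append(TENS[tens])
    if ones:
        words.append(ONES[ones])
    return words


def number_to_turkish(n):
    """Spell out an integer in Turkish (hypothetical helper, 0..999999 only)."""
    if not 0 <= n < 1_000_000:
        raise ValueError("only 0..999999 is supported in this sketch")
    if n == 0:
        return "sıfır"
    thousands, rest = divmod(n, 1000)
    words = []
    if thousands:
        # 1000 is "bin", not "bir bin"; 2000 is "iki bin".
        if thousands > 1:
            words.extend(_below_thousand(thousands))
        words.append("bin")
    words.extend(_below_thousand(rest))
    return " ".join(words)


def expand_numbers(text):
    """Replace each run of digits with its spelled-out Turkish form.

    Numbers outside the supported range are left unchanged.
    """
    def replace(match):
        value = int(match.group())
        if value >= 1_000_000:
            return match.group()
        return number_to_turkish(value)

    return re.sub(r"\d+", replace, text)
```

For instance, `expand_numbers("Yıl 2017")` yields `"Yıl iki bin on yedi"`. A full front end would also need rules for abbreviations, ordinals, decimals, and dates, which this sketch leaves out.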