Türkçe Otel Yorumlarıyla Eğitilen Kelime Vektörü Modellerinin Duygu Analizi ile İncelenmesi

Hüseyin Ahmetoğlu; Resul Daş

doi:10.19113/sdufenbed.645579

Research Article

Investigation of Word Vector Models Trained with Turkish Hotel Comments by Sentiment Analysis

Year 2020, Volume: 24 Issue: 2, 455 - 463, 26.08.2020

https://doi.org/10.19113/sdufenbed.645579

Cited By: 3

Abstract

Sentiment analysis is one of the important research areas of natural language processing (NLP) and text classification. Research in this area is growing rapidly, and the technique appears in all kinds of applications of digital life. Many techniques have been developed for sentiment analysis, but word embedding methods from natural language processing have recently become widely used for the task. Word2Vec is one of the most useful word embedding methods and can convert words into meaningful vectors. Creating word vectors with this method requires large corpora. Pre-trained models make it possible to achieve more accurate results in sentiment analysis. In this study, Turkish hotel reviews written by verified users were collected with data scraping methods for use in sentiment analysis. Word vectors were created by training Word2Vec on this original data. With these vectors, a classification model was developed using the Gated Recurrent Unit (GRU), a type of recurrent neural network (RNN). Word vectors trained on broader corpora and vectors initialized with random values were then examined with the same deep learning method, and the resulting classification performance was compared. According to the results, broader, domain-independent corpora increased classification success.
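
The comparison described in the abstract comes down to how the embedding layer in front of the GRU classifier is filled: with vectors trained on the hotel reviews, with vectors trained on a broader corpus, or with random values. The following is a minimal sketch of that idea in plain Python, not the authors' implementation. The function name `build_embedding_matrix`, the dimension of 4, the random range of ±0.05, and the toy vectors are all illustrative assumptions that do not come from the paper.

```python
import random


def build_embedding_matrix(vocab, pretrained=None, dim=4, seed=0):
    """Return one row of `dim` floats per word in `vocab`.

    Words found in `pretrained` (a hypothetical word -> vector dict, e.g.
    exported from a trained Word2Vec model) keep their trained vector.
    All other words receive small random values.
    """
    rng = random.Random(seed)
    pretrained = pretrained or {}
    matrix = []
    for word in vocab:
        vector = pretrained.get(word)
        if vector is None or len(vector) != dim:
            # No usable trained vector: fall back to random initialization.
            vector = [rng.uniform(-0.05, 0.05) for _ in range(dim)]
        matrix.append(list(vector))
    return matrix


vocab = ["otel", "temiz", "kötü"]
# Toy vectors standing in for Word2Vec output; the values are made up.
toy_vectors = {"otel": [0.1, 0.2, 0.3, 0.4], "temiz": [0.5, 0.1, 0.0, 0.2]}

random_init = build_embedding_matrix(vocab)                   # random baseline
pretrained_init = build_embedding_matrix(vocab, toy_vectors)  # trained rows where available
```

Under this setup, swapping the `pretrained` dictionary is the only change needed to move between the settings being compared, while the downstream classifier stays the same.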

Keywords

Natural language processing, Data scraping, Sentiment analysis, Recurrent neural networks, Word2Vec, Word embeddings

Details

Primary Language	Turkish
Subjects	Engineering
Section	Articles
Authors	Hüseyin Ahmetoğlu 0000-0002-4320-0198 Resul Daş 0000-0002-6113-4649
Publication Date	August 26, 2020
Published in Issue	Year 2020 Volume: 24 Issue: 2

Cite

APA	Ahmetoğlu, H., & Daş, R. (2020). Türkçe Otel Yorumlarıyla Eğitilen Kelime Vektörü Modellerinin Duygu Analizi ile İncelenmesi. Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi, 24(2), 455-463. https://doi.org/10.19113/sdufenbed.645579
AMA	Ahmetoğlu H, Daş R. Türkçe Otel Yorumlarıyla Eğitilen Kelime Vektörü Modellerinin Duygu Analizi ile İncelenmesi. Süleyman Demirel Üniv. Fen Bilim. Enst. Derg. Ağustos 2020;24(2):455-463. doi:10.19113/sdufenbed.645579
Chicago	Ahmetoğlu, Hüseyin, ve Resul Daş. “Türkçe Otel Yorumlarıyla Eğitilen Kelime Vektörü Modellerinin Duygu Analizi Ile İncelenmesi”. Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi 24, sy. 2 (Ağustos 2020): 455-63. https://doi.org/10.19113/sdufenbed.645579.
EndNote	Ahmetoğlu H, Daş R (01 Ağustos 2020) Türkçe Otel Yorumlarıyla Eğitilen Kelime Vektörü Modellerinin Duygu Analizi ile İncelenmesi. Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi 24 2 455–463.
IEEE	H. Ahmetoğlu ve R. Daş, “Türkçe Otel Yorumlarıyla Eğitilen Kelime Vektörü Modellerinin Duygu Analizi ile İncelenmesi”, Süleyman Demirel Üniv. Fen Bilim. Enst. Derg., c. 24, sy. 2, ss. 455–463, 2020, doi: 10.19113/sdufenbed.645579.
ISNAD	Ahmetoğlu, Hüseyin - Daş, Resul. “Türkçe Otel Yorumlarıyla Eğitilen Kelime Vektörü Modellerinin Duygu Analizi Ile İncelenmesi”. Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi 24/2 (Ağustos 2020), 455-463. https://doi.org/10.19113/sdufenbed.645579.
JAMA	Ahmetoğlu H, Daş R. Türkçe Otel Yorumlarıyla Eğitilen Kelime Vektörü Modellerinin Duygu Analizi ile İncelenmesi. Süleyman Demirel Üniv. Fen Bilim. Enst. Derg. 2020;24:455–463.
MLA	Ahmetoğlu, Hüseyin ve Resul Daş. “Türkçe Otel Yorumlarıyla Eğitilen Kelime Vektörü Modellerinin Duygu Analizi Ile İncelenmesi”. Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi, c. 24, sy. 2, 2020, ss. 455-63, doi:10.19113/sdufenbed.645579.
Vancouver	Ahmetoğlu H, Daş R. Türkçe Otel Yorumlarıyla Eğitilen Kelime Vektörü Modellerinin Duygu Analizi ile İncelenmesi. Süleyman Demirel Üniv. Fen Bilim. Enst. Derg. 2020;24(2):455-63.

Cited By

COVID-19 Pandemisinin Türkiye Mobil Oyun Pazarına Etkisi: Bir Metin Madenciliği Uygulaması

Journal of Turkish Operations Management

https://doi.org/10.56554/jtom.1284249

Topluluk Öğrenmesi Algoritmaları Kullanarak Amazon Yemek Yorumları Üzerine Duygu Analizi

Bilecik Şeyh Edebali Üniversitesi Fen Bilimleri Dergisi

https://doi.org/10.35193/bseufbd.1300732

Türkçe Metinlerde Duygu Analizi

Journal of Yaşar University

Sinem TOKCAER

https://doi.org/10.19168/jyasar.928843

e-ISSN :1308-6529
Linking ISSN (ISSN-L): 1300-7688

All articles published in the journal are freely accessible and are made available as open access under the Creative Commons CC BY-NC (Attribution-NonCommercial) license. All authors and other journal users are deemed to have accepted these terms.