Incremental Refinement of Relevance Rankings: Introducing a New Method Supported with Pennant Retrieval

Müge Akbulut; Yaşar Tonta

doi:10.24146/tk.1062751

Research Article

Year 2022, Volume: 36, Issue: 2, 169-203, 30.06.2022

Abstract

Purpose: Relevance ranking algorithms rank retrieved documents by the degree of topical similarity (relevance) between search queries and documents. The aim of this study is to develop a new relevance ranking method that combines a probabilistic topic modeling algorithm with “pennant retrieval”, which is based on citation data. Data Sources and Method: We applied the method to the iSearch corpus, which consists of approximately 435,000 physics papers. We first ran the topic modeling algorithm on the titles and abstracts of all papers in the corpus for 65 queries and obtained relevance rankings. We then used the citation information obtained through pennant retrieval to fuse with and further refine the existing relevance rankings. In this way we obtained better relevance rankings, made up of papers that cover different aspects of the searched topic as well as papers that are marginally relevant to it. We evaluated the retrieval performance of the proposed method by examining separately the effects of the Maximal Marginal Relevance (MMR) algorithm on the different relevance rankings. Findings: The findings show that some terms occurring in the titles and abstracts of papers can sometimes be overlooked in the relevance rankings obtained with the topic modeling algorithm. However, when these rankings are supported with pennant retrieval based on citation data, additional information about the contexts of the terms is obtained, resulting in enriched relevance rankings that contain more relevant and more diverse (interdisciplinary) papers. Moreover, retrieval outputs can easily be re-ranked according to researchers' priorities (personalization). Conclusion: In the proposed method, we focused on incrementally improving existing relevance ranking algorithms using pennant retrieval techniques. We believe that, once it has been tested on dynamic corpora for computational load, robustness, replicability, and scalability, this method can in time be used in both local and international information systems such as TR-Dizin, Web of Science, and Scopus. Originality: This study proposes a new relevance ranking method. To the best of our knowledge, it is the first study to show that relevance rankings obtained with the LDA topic modeling algorithm can be incrementally refined with pennant retrieval techniques based on citation data.
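The abstract does not spell out how pennant retrieval weights citation data. As a rough illustration only, the sketch below uses the tf*idf-style weighting of works co-cited with a seed document that White describes for pennant diagrams: tf grows with the logarithm of the co-citation count, and idf shrinks as a work's total citation count grows. The function name `pennant_scores`, its arguments, and the sample counts are hypothetical, and the exact weighting and fusion step used in the paper may differ.

```python
import math


def pennant_scores(cocitations, citation_counts, corpus_size):
    """Score works co-cited with a seed using a tf*idf-style weighting.

    cocitations:     work id -> number of papers citing both the work and the seed
    citation_counts: work id -> total number of papers citing the work
    corpus_size:     number of papers in the collection (N)

    Returns work id -> (tf, idf, tf * idf). This is an assumed formulation,
    not necessarily the one implemented by the authors.
    """
    scores = {}
    for work, cocited in cocitations.items():
        if cocited <= 0:
            continue  # a work never co-cited with the seed gets no score
        tf = 1.0 + math.log10(cocited)                          # closeness to the seed
        idf = math.log10(corpus_size / citation_counts[work])   # specificity of the work
        scores[work] = (tf, idf, tf * idf)
    return scores


# Hypothetical counts: both works are co-cited with the seed 10 times,
# but "b" is cited far more often overall, so it is less specific.
example = pennant_scores(
    cocitations={"a": 10, "b": 10},
    citation_counts={"a": 100, "b": 1000},
    corpus_size=100_000,
)
ranking = sorted(example, key=lambda w: example[w][2], reverse=True)
```

In this toy input the two works are equally close to the seed, so the more narrowly cited work "a" ranks first; plotting tf against idf for each work would give the pennant-shaped scatter that the method is named after.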

Keywords

Relevance rankings, probabilistic topic modeling, Latent Dirichlet Allocation (LDA) algorithm, pennant retrieval, Maximal Marginal Relevance (MMR)

Acknowledgments

We thank the iSearch Team (Peter Ingwersen, Birger Larsen, Haakon Lund, and Marianne Lykke) for their help with the iSearch corpus, and Prof. Dr. Umut Al and Prof. Dr. Fazlı Can for reading an earlier version of the paper and offering valuable suggestions.

Kaynakça

Abramo, G., D’Angelo, C. A. ve Zhang, L. (2018). A comparison of two approaches for measuring interdisciplinary research output: The disciplinary diversity of authors vs the disciplinary diversity of the reference list. Journal of Informetrics, 12(4), 1182-1193. https://doi.org/10.1016/j.joi.2018.09.001
Adomavicius, G. ve Kwon, Y. (2011). Improving aggregate recommendation diversity using ranking-based techniques. IEEE Transactions on Knowledge and Data Engineering, 24(5), 896-911. https://doi.org/10.1109/TKDE.2011.15
ADS Team (2008). SAO/NASA ADS Abstract Service Stopword List. https://adsabs.harvard.edu/abs_doc/stopwords.html
Akbulut, M. (2016). Atıf klasiklerinin etkisinin ve ilgililik sıralamalarının pennant diyagramları ile analizi [Yayımlanmamış yüksek lisans tezi]. Hacettepe Üniversitesi. https://hdl.handle.net/11655/3529
Akbulut, M., Tonta, Y. ve White, H. D. (2020). Related records retrieval and pennant retrieval: An exploratory case study. Scientometrics, 122(2), 957-987. https://doi.org/10.1007/s11192-019-03303-9
Arun, R., Suresh, V., Madhavan, C. V. ve Murthy, M. N. (2010). On finding the natural number of topics with latent dirichlet allocation: Some observations. Pacific-Asia Conference on Knowledge Discovery and Data Mining içinde (s. 391-402). Springer. https://doi.org/10.1007/978-3-642-13657-3_43
Baeza-Yates, R. ve Ribeiro-Neto, B. (1999). Modern information retrieval. ACM Press.
Ballester, O. ve Penner, O. (2022). Robustness, replicability and scalability in topic modelling. Journal of Informetrics, 16(1). https://doi.org/10.1016/j.joi.2021.101224
Bayer, D. ve Michael, S. (2019). Exploring the daschle collection using text mining. arXiv. https://arxiv.org/pdf/1904.12623.pdf
Beel, J. ve Gipp, B. (2009). Google Scholar’s ranking algorithm: An introductory overview. B. Larsen ve J. Leta (Yay. haz.). Proceedings of the 12th International Conference on Scientometrics and Informetrics içinde (s. 230-241). International Society for Scientometrics and Informetrics. https://www.issi-society.org/proceedings/issi_2009/ISSI2009-proc-vol1_Aug2009_batch2-paper-1.pdf
Beel, J., Gipp, B., Langer, S. ve Breitinger, C. (2016). Research-paper recommender systems: A literature survey. International Journal on Digital Libraries, 17(4): 305-338.
Belter, C. W. (2017). A relevance ranking method for citation-based search results. Scientometrics, 112(2), 731-746. https://doi.org/10.1007/s11192-017-2406-y
Bichteler, J. ve Eaton III, E.A. (1980). The combined use of bibliographic coupling and cocitation for document retrieval. Journal of the American Society for Information Science, 31(4): 278–282.
Blei, D. M. ve Lafferty, J. D. (2009). Topic models. A. Srivastava ve M. Sahami (Yay. haz.). Text Mining: Classification, Clustering and Applications içinde (s. 71-94). CRC Press, Taylor & Francis. http://www.cs.columbia.edu/~blei/papers/BleiLafferty2009.pdf
Blei, D. M., Ng, A. Y. ve Jordan, M. I. (2003). Latent dirichlet allocation. The Journal of Machine Learning Research, 3, 993-1022. https://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf?TB_iframe=true&width=370.8&height=658.8
Bornmann, L., Haunschild, R. ve Mutz, R. (2021). Growth rates of modern science: A latent piecewise growth curve approach to model publication numbers from established and new literature databases. Humanities and Social Sciences Communications, 8(1), 1-15. https://doi.org/10.1057/s41599-021-00903-w
Boyd-Graber, J. ve Blei, D. M. (2010). Syntactic topic models. arXiv. https://arxiv.org/pdf/1002.4665.pdf
Bradley, K. ve Smyth, B. (2001). Improving recommendation diversity. D. O'Donoghue (Yay. haz.) Proceedings of the Twelfth Irish Conference on Artificial Intelligence and Cognitive Science içinde (s. 141-152). NUIM Department of Computer Science. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.8.5232&rep=rep1&type=pdf
Cambria, E. ve White, B. (2014). Jumping NLP curves: A review of natural language processing research. IEEE Computational Intelligence Magazine, 9(2), 48-57. https://doi.org/10.1109/MCI.2014.2307227
Cao, J., Xia, T., Li, J., Zhang, Y. ve Tang, S. (2009). A density-based method for adaptive LDA model selection. Neurocomputing, 72(7-9), 1775-1781. https://doi.org/10.1016/j.neucom.2008.06.011
Carbonell, J. ve Goldstein, J. (1998). The use of MMR, diversity-based reranking for reordering documents and producing summaries. Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval içinde (s. 335-336). Association for Computing Machinery. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.188.3982&rep=rep1&type=pdf
Carevic, Z. ve Mayr, P. (2014). Recommender systems using pennant diagrams in digital libraries. arXiv. https://arxiv.org/pdf/1407.7276v1.pdf
Carevic, Z. ve Schaer, P. (2014). On the connection between citation-based and topical relevance ranking: results of a pretest using iSearch. Proceedings of the First Workshop on Bibliometric-enhanced Information Retrieval co-located with 36th European Conference on Information Retrieval (ECIR 2014) içinde (s. 37-44). Springer-Verlag. https://ceur-ws.org/Vol-1143/paper5.pdf
Carroll, M. (2018). Changes in media coverage of GCSEs from 1988 to 2017. Cambridge. https://www.cambridgeassessment.org.uk/Images/504456-changes-in-media-coverage-of-gcses-from-1988-to-2017.pdf
Chang, J., Gerrish, S., Wang, C., Boyd-Graber, J. L. ve Blei, D. M. (2009). Reading tea leaves: How humans interpret topic models. Advances in Neural Information Processing Systems içinde (s. 288-296). MIT Press. https://proceedings.neurips.cc/paper/2009/file/f92586a25bb3145facd64ab20fd554ff-Paper.pdf
Chen, M. ve Décary, M. (2018). A Cognitive-based semantic approach to deep content analysis in search engines. 2018 IEEE 12th International Conference on Semantic Computing (ICSC) içinde (s. 131-139). IEEE. https://doi.ieeecomputersociety.org/10.1109/ICSC.2018.00027
Chen, Z. ve Liu, B. (2014). Mining topics in documents: Standing on the shoulders of big data. Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining içinde (s. 1116-1125). ACM. https://dl.acm.org/doi/pdf/10.1145/2623330.2623622
Cooper, W. S. (1988). Getting beyond boole. Information Processing & Management, 24(3), 243-248. https://doi.org/10.1016/0306-4573(88)90091-X
Croft W. B. (2002). Combining approaches to information retrieval. W.B. Croft (Yay. haz.). Advances in Information Retrieval. The Information Retrieval Series, vol 7. içinde (s. 1-35). Springer, https://doi.org/10.1007/0-306-47019-5_1
Danilov, M. (2005). Experimental review on pentaquarks. arXiv. https://arxiv.org/abs/hep-ex/0509012
Deveaud, R., SanJuan, E. ve Bellot, P. (2014). Accurate and effective latent concept modeling for ad hoc information retrieval. Document Numérique, 17(1), 61-84. https://doi.org/10.3166/dn.17.1.61-84
Ekinci, E. ve İlhan Omurca, S. (2020). Concept-LDA: Incorporating Babelfy into LDA for aspect extraction. Journal of Information Science, 46(3), 406-418. https://doi.org/10.1177/0165551519845854
Ganguly, D. ve Jones, G. J. (2018). A non-parametric topical relevance model. Information Retrieval Journal, 21(5), 449-479. https://doi.org/10.1007/s10791-018-9329-y
Giustolisi, O., Ridolfi, L. ve Simone, A. (2020). Embedding the intrinsic relevance of vertices in network analysis: the case of centrality metrics. Scientific Reports, 10(1). https://doi.org/10.1038/s41598-020-60151-x
Gläser, J., Glänzel, W. ve Scharnhorst, A. (2017). Same data—different results? Towards a comparative approach to the identification of thematic structures in science. Scientometrics, 111(2), 981-998. https://doi.org/10.1007/s11192-017-2296-z
Griffiths, T. L. ve Steyvers, M. (2004). Finding scientific topics. Proceedings of the National Academy of Sciences, 101(1), 5228-5235. https://doi.org/10.1073/pnas.0307752101
Guillemette, J., Simms, B., Zhou, D. ve Mills, S. (2017). Applying latent dirichlet allocation to yelp reviews. https://people.math.carleton.ca/~smills/2017-18/STAT4601-5703/Research%20Projects/2018%20Submissions/GuillemetteSimmsZhouD/Applying%20LDA.pdf
Guo, J., Fan, Y., Ai, Q. ve Croft, W. B. (2016). A deep relevance matching model for ad-hoc retrieval. Proceedings of the 25th ACM International on Conference on Information and Knowledge Management içinde (s. 55-64). ACM. https://doi.org/10.1145/2983323.2983769
Guo, Z., Zhang, Z. M., Zhu, S., Chi, Y. ve Gong, Y. (2013). A two-level topic model towards knowledge discovery from citation networks. IEEE Transactions on Knowledge and Data Engineering, 26(4), 780-794. https://doi.org/10.1109/TKDE.2013.56
Han, X. (2020). Evolution of research topics in LIS between 1996 and 2019: an analysis based on latent dirichlet allocation topic model. Scientometrics, 125(3), 2561-2595. https://doi.org/10.1007/s11192-020-03721-0
Hecking, T. ve Leydesdorff, L. (2018). Topic modelling of empirical text corpora: Validity, reliability, and reproducibility in comparison to semantic maps. arXiv. https://arxiv.org/pdf/1806.01045.pdf
Herlocker, J.L., Konstan, J. A., Terveen, L. G. ve Riedl, J. T. (2004) Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems, 22(1), 5-53. https://doi.org/10.1145/963770.963772
Holliger, T. S. (2018). Strategic sourcing via category management: Helping air force installation contracting agency eat one piece of the elephant [Yayımlanmamış yüksek lisans tezi]. Air Force Institute of Technology. https://apps.dtic.mil/sti/pdfs/AD1056353.pdf
Huang, L., Liu, H., He, J. ve Du, X. (2016). Finding latest influential research papers through modeling two views of citation links. F. Li, K. Shim, K. Zheng ve G. Liu (Yay. haz.) Web Technologies and Applications APWeb 2016 içinde (s. 555-566). Springer, Cham. https://doi.org/10.1007/978-3-319-45814-4_45
Huang, X., Chen, C., Peng, C., Wu, X., Fu, L. ve Wang, X. (2018). Topic-sensitive influential paper discovery in citation network. D. Phung, V. Tseng, G. Webb, B. Ho, M. Ganji ve L. Rashidi (Yay. haz.). Advances in Knowledge Discovery and Data Mining içinde (s. 16-28). Springer, Cham. https://doi.org/10.1007/978-3-319-93037-4_2
Jin, R., Valizadegan, H. ve Li, H. (2008). Ranking refinement and its application to information retrieval. Proceedings of the 17th International Conference on World Wide Web içinde (s. 397-406). ACM. http://doi.org/10.1145/1367497.1367552
Ke, Q., Ferrara, E., Radicchi, F. ve Flammini, A. (2015). Defining and identifying Sleeping Beauties in science. Proceedings of the National Academy of Sciences, 112(24), 7426-7431. https://doi.org/10.1073/pnas.1424329112
Kessler, M. M. (1963). Bibliographic coupling between scientific papers. American Documentation, 14(1): 10-25
Knoth, P., Anastasiou, L., Charalampous, A., Cancellieri, M., Pearce, S., Pontika, N. ve Bayer, V. (2017). Towards effective research recommender systems for repositories. arXiv. https://arxiv.org/abs/1705.00578
Kucuktunc, O. ve Ferhatosmanoglu, H. (2011). λ-diverse nearest neighbors browsing for multidimensional data. IEEE Transactions on Knowledge and Data Engineering, 25(3), 481-493. https://doi.org/10.1109/TKDE.2011.251
Küçüktunç, O., Saule, E., Kaya, K. ve Çatalyürek, Ü. V. (2015). Diversifying citation recommendations. ACM Transactions on Intelligent Systems and Technology, 5(4), 1-21. https://doi.org/10.1145/2668106
Lei, M., Wang, J., Chen, B. ve Li, X. (2001). Improved relevance ranking in WebGather. Journal of Computer Science and Technology, 16(5), 410-417. https://doi.org/10.1007/bf02948958
Leydesdorff, L. ve Nerghes, A. (2017). Co‐word maps and topic modeling: A comparison using small and medium‐sized corpora (N< 1,000). Journal of the Association for Information Science and Technology, 68(4), 1024-1035. https://doi.org/10.1002/asi.23740
Li, C., Feng, H. ve Rijke, M. D. (2020). Cascading hybrid bandits: online learning to rank for relevance and diversity. Fourteenth ACM Conference on Recommender Systems içinde (s. 33-42). ACM. https://doi.org/10.1145/3383313.3412245
Li, W. ve McCallum, A. (2006). Pachinko allocation: DAG-structured mixture models of topic correlations. Proceedings of the 23rd International Conference on Machine Learning içinde (s. 577-584). Springer. https://doi.org/10.1145/1143844.1143917
Li, Y., He, J. ve Liu, H. (2017). Topic analysis and influential paper discovery on scientific publications. 2017 14th Web Information Systems and Applications Conference (WISA) içinde (s. 68-73). IEEE. https://doi.org/10.1109/WISA.2017.69
Liu, X., Wang, G. ve Zakirul Alam Bhuiyan, M. (2022). Re‐ranking with multiple objective optimization in recommender system. Transactions on Emerging Telecommunications Technologies, 33(1): e4398 https://doi.org/10.1002/ett.4398
Liu, X. Z. ve Fang, H. (2020). A comparison among citation-based journal indicators and their relative changes with time. Journal of Informetrics, 14(1), 1-17. https://doi.org/10.1016/j.joi.2020.101007
Lykke, M., Larsen, B., Lund, H. ve Ingwersen, P. (2010). Developing a test collection for the evaluation of integrated search. European Conference on Information Retrieval içinde (s. 627-630). Springer. https://doi.org/10.1007/978-3-642-12275-0_63
Ma, Z., Liu, Y., Yang, Z., Yang, J. ve Li, K. (2022). A parameter-free approach to lossless summarization of fully dynamic graphs. Information Sciences, 589, 376-394. https://doi.org/10.1016/j.ins.2021.12.116
Mahajan, M., Beeferman, D. ve Huang, X. D. (1999). Improved topic-dependent language modeling using information retrieval techniques. 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings içinde (s. 541-544). IEEE. https://doi.org/10.1109/ICASSP.1999.758182
Manning, C. ve Schütze, H. (2000). Foundations of statistical natural language processing. MIT Press. https://ics.upjs.sk/~pero/web/documents/pillar/Manning_Schuetze_Statistical NLP.pdf
Maron, M. E. ve Kuhns, J. L. (1960). On relevance, probabilistic indexing and information retrieval. Journal of the ACM, 7(3), 216-244. https://doi.org/10.1145/321033.321035
Marujo, L., Ribeiro, R., Gershman, A., De Matos, D.M., Neto, J.P. ve Carbonell, J. (2017). Event-based summarization using a centrality-as-relevance model. Knowledge and Information Systems, 50, 945–968. https://doi.org/10.1007/s10115-016-0966-4
Mayr, P. ve Mutschke, P. (2013). Bibliometric-enhanced retrieval models for big scholarly information systems. 2013 IEEE International Conference on Big Data içinde (s. 5-8). IEEE. https://doi.org/10.1109/BigData.2013.6691762
McNee, S. M., Riedl, J. ve Konstan, J. A. (2006). Being accurate is not enough: how accuracy metrics have hurt recommender systems. CHI'06 extended abstracts on human factors in computing systems içinde (s. 1097-1101). https://doi.org/10.1145/1125451.1125659
Meng, W., Yu, C. ve Liu, K. L. (2002). Building efficient and effective metasearch engines. ACM Computing Surveys (CSUR), 34(1), 48-89. https://doi.org/10.1145/505282.505284
Mizzaro, S. (1997). Relevance: The whole history. Journal of the American Society for Information Science, 48, 810-832. https://doi.org/10.1002/(SICI)1097-4571(199709)48:9<810::AID-ASI6>3.0.CO;2-U
Nguyen, T. ve Do, P. (2018). CitationLDA++ an extension of LDA for discovering topics in document network. Proceedings of the Ninth International Symposium on Information and Communication Technology içinde (s. 31-37). ACM. https://doi.org/10.1145/3287921.3287930
Nikita, M. (2020, 20 Nisan). Select number of topics for LDA. https://cran.r-project.org/web/packages/ldatuning/vignettes/topics.html
Nolasco, D. ve Oliveira, J. (2016). Detecting knowledge innovation through automatic topic labeling on scholar data. 2016 49th Hawaii International Conference on System Sciences (HICSS) içinde (s. 358-367). IEEE. https://doi.org/10.1109/HICSS.2016.51
Pao, M. L. (1993). Term and citation retrieval: A field study. Information Processing & Management. 29(1), 95-112. https://doi.org/10.1016/0306-4573(93)90026-A
Ponweiser, M. (2012). Latent dirichlet allocation in R. [Yayımlanmamış yüksek lisans tezi]. Viyana Üniversitesi. https://epub.wu.ac.at/id/eprint/3558
Rafols, I., Leydesdorff, L., O’Hare, A., Nightingale, P. ve Stirling, A. (2012). How journal rankings can suppress interdisciplinary research: A comparison between innovation studies and business & management. Research Policy, 41(7), 1262-1282. https://doi.org/10.1016/j.respol.2012.03.015
Ribeiro, R., ve de Matos, D.M. (2011). Revisiting Centrality-as-relevance: support sets and similarity as geometric proximity. Journal of Artificial Intelligence Research, 42, 275-308. https://doi.org/10.1613/jair.3387
Robertson, S. E. (1977). The probability ranking principle in IR. Journal of Documentation, 33(4), 294-304. https://doi.org/10.1108/eb026647
Rüdiger, M. S., Antons, D. ve Salge, T. O. (2021). The explanatory power of citations: a new approach to unpacking impact in science. Scientometrics, 126, 9779-9809. https://doi.org/10.1007/s11192-021-04103-w
Salton, G., Yang, C. ve Wong, A. (1975). A vector space model for automatic indexing. Communications of the ACM, 18, 613-620. https://doi.org/10.1145/361219.361220
Samraj, B. (2005). An exploration of a genre set: Research article abstracts and introductions in two disciplines. English for Specific Purposes, 24(2), 141-156. https://doi.org/10.1016/j.esp.2002.10.001
Saracevic, T. (2021). Relevance: In search of a theoretical foundation. D. H. Sonnenwald (Yay. haz.), Theory Development in the Information Sciences içinde (s. 141-163). University of Texas Press. https://doi.org/10.7560/308240-011
Sperber, D. ve Wilson, D. (1995). Relevance: Communication and cognition. Blackwell. https://monoskop.org/images/e/e6/Sperber_Dan_Wilson_Deirdre_Relevance _Communica_and_Cognition_2nd_edition_1996.pdf
Swanson, D. R. (1986a). Subjective versus objective relevance in bibliographic retrieval systems. The Library Quarterly, 56(4), 389-398. https://doi.org/10.1086/601800
Swanson, D. R. (1986b). Fish oil, Raynaud’s syndrome, and undiscovered public knowledge. Perspectives in Biology and Medicine, 30(1):7-18. https://doi.org/10.1353/pbm.1986.0087
Thara, D. K., PremaSudha, B. G. ve Xiong, F. (2019). Auto-detection of epileptic seizure events using deep neural network with different feature scaling techniques. Pattern Recognition Letters, 128, 544-550. https://doi.org/10.1016/j.patrec.2019.10.029
Thompson, P. (2007). Looking back: On relevance, probabilistic indexing and information retrieval. Information Processing & Management, 44(2), 963-970. https://doi.org/10.1016/j.ipm.2007.10.002
Tonta, Y. (1995). Bilgi erişim sistemleri. Türk Kütüphaneciliği, 9(3), 302-314. https://eprints.rclis.org/9571/
Tonta, Y. ve Akbulut, M. (2021). Uluslararası dergilerde yayımlanan Türkiye adresli makalelerin atıf etkisini artıran faktörler. Türk Kütüphaneciliği, 35(3), 388-409. https://doi.org/10.24146/tk.933159
Vergoulis, T., Chatzopoulos, S., Kanellos, I., Deligiannis, P., Tryfonopoulos, C. ve Dalamagas, T. (2019). BIP! finder: Facilitating scientific literature search by exploiting impact-based ranking. Proceedings of the 28th ACM International Conference on Information and Knowledge Management içinde (s. 2937-2940). ACM. https://doi.org/10.1145/3357384.3357850
Verma, M., Yılmaz, E. ve Craswell, N. (2016). On obtaining effort based judgements for information retrieval. Proceedings of the 9th ACM International Conference on Web Search and Data Mining içinde (s. 277-286). ACM. https://doi.org/10.1145/2835776.2835840
Wang, X., Zhai, C. ve Roth, D. (2013). Understanding evolution of research themes: a probabilistic generative model for citations. Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining içinde (s. 1115-1123). ACM. https://doi.org/10.1145/2487575.2487698
White, H. D. (2007a). Combining bibliometrics, information retrieval, and relevance theory. Part 1: First examples of a synthesis. Journal of the American Society for Information Science and Technology, 58, 536-559. https://doi.org/10.1002/asi.20543
White, H. D. (2007b). Combining bibliometrics, information retrieval, and relevance theory. Part 2: Some implications for information science. Journal of the American Society for Information Science and Technology, 58, 583-605. https://doi.org/10.1002/asi.20542
White, H. D. (2009). Pennants for Strindberg and Persson. Celebrating scholarly communication studies: A festschrift for Olle Persson at his 60th birthday. Special volume of the E-newsletter of the International Society for Scientometrics and Informetrics, 5, 71-83. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.168.2055&rep=rep1&type=pdf#page=73
White, H. D. (2010). Some new tests of relevance theory in information science. Scientometrics, 83, 653-667. https://doi.org/10.1007/s11192-009-0138-3
White, H. D. (2015). Co-cited author retrieval and relevance theory: examples from the humanities. Scientometrics, 102(3), 2275-2299. https://doi.org/10.1007/s11192-014-1483-4
White, H. D. (2016). Bag of works retrieval: TF*IDF weighting of co-cited works. Proceedings of the 3rd Workshop on Bibliometric-Enhanced Information Retrieval (BIR2016) içinde (s. 63-72). https://ceur-ws.org/Vol-1567/paper7.pdf
White, H. D. (2018). Bag of works retrieval: TF*IDF weighting of co-cited works with a seed. International Journal of Digital Libraries, 19, 139-149. https://doi.org/10.1007/s00799-017-0217-7
White, H. D. ve McCain, K. W. (1998). Visualizing a discipline: An author co-citation analysis of information science, 1972-1995. Journal of the American Society for Information Science, 49(4): 327-355. https://doi.org/10.1002/(SICI)1097-4571(19980401)49:4%3C327::AID-ASI4%3E3.0.CO;2-4
Wilson, P. (1978). Some fundamental concepts of information retrieval. Drexel Library Quarterly, 14(2), 10-24.
Wilson, D. ve Sperber, D. (2002). Relevance theory. G. Ward ve L. Horn (Yay. haz.) Handbook of pragmatics içinde (s. 1-55). Blackwell. https://jeannicod.ccsd.cnrs.fr/ijn_00000101/document
Wu, H. C., Luk, R. W., Wong, K. F. ve Kwok, K. L. (2007). A retrospective study of a hybrid document-context based retrieval model. Information Processing & Management, 43(5), 1308-1331. https://doi.org/10.1016/j.ipm.2006.10.009
Wu, J., Son, G. ve Wang, S. (2020). A competency mining method based on Latent Dirichlet Allocation (LDA) model. Journal of Physics: Conference Series (Vol. 1682, No. 1, p. 012059) içinde. IOP Publishing. https://iopscience.iop.org/article/10.1088/1742-6596/1682/1/012059/meta
Xia, H., Li, J., Tang, J. ve Moens MF. (2012). Plink-LDA: Using link as prior information in topic modeling. S. Lee, Z. Peng, X. Zhou, Y. S. Moon, R. Unland ve J. Yoo (Yay. haz.) Database Systems for Advanced Applications içinde (s. 213-227). Springer. https://doi.org/10.1007/978-3-642-29038-1_17
Xie, X., Liang, Y., Li, X. ve Tan, W. (2019). CuLDA_CGS: Solving large-scale LDA problems on GPUs. Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming içinde (s. 435-436). ACM. https://doi.org/10.1145/3293883.3301496
Yang, H. T., Ju, J. H., Wong, Y. T., Shmulevich, I. ve Chiang, J. H. (2017). Literature-based discovery of new candidates for drug repurposing. Briefings in Bioinformatics, 18(3), 488-497. https://doi.org/10.1093/bib/bbw030
Yang, L., Ji, D. ve Leong, M. (2007). Document reranking by term distribution and maximal marginal relevance for Chinese information retrieval. Information Processing & Management, 43(2), 315-326. https://doi.org/10.1016/j.ipm.2006.07.011
Yılmaz, E., Verma, M., Craswell, N., Radlinski, F. ve Bailey, P. (2014). Relevance and effort: An analysis of document utility. Proceedings of the 23rd ACM International Conference on Information and Knowledge Management içinde (s. 91-100). ACM. https://doi.org/10.1145/2661829.2661953
Zarrinkalam, F. ve Kahani, M. (2012). A new metric for measuring relatedness of scientific papers based on non-textual features. Intelligent Information Management, 4(4), 99-107. https://www.scirp.org/pdf/IIM20120400001_98298896.pdf
Zhou, H. K., Yu, H. M. ve Hu, R. (2017). Topic discovery and evolution in scientific literature based on content and citations. Frontiers of Information Technology & Electronic Engineering, 18(10), 1511-1524. https://doi.org/10.1631/FITEE.1601125
Zou, L., Liu, X., Buntine, W. ve Liu, Y. (2021). Citation context-based topic models: discovering cited and citing topics from full text. Library Hi Tech, 39(4), 1063-1083. https://doi.org/10.1108/LHT-01-2021-0041

Incremental Refinement of Relevance Rankings: Introducing a New Method Supported with Pennant Retrieval

Abstract

Purpose: Relevance ranking algorithms rank retrieved documents by the degree of topical similarity (relevance) between search queries and documents. This paper introduces a new relevance ranking method that combines a probabilistic topic modeling algorithm with the “pennant retrieval” method, which uses citation data. Data and Method: We applied this method to the iSearch corpus, which consists of approximately 435,000 physics papers. We first ran the topic modeling algorithm on the titles and abstracts of all papers for 65 search queries and obtained the relevance ranking lists. We then used pennant retrieval to fuse the citation data with the existing relevance rankings, thereby incrementally refining the results. This produced better relevance rankings, with papers covering various aspects of the searched topic as well as more marginally relevant ones. We evaluated the retrieval performance of the proposed method by examining the effect of the Maximal Marginal Relevance (MMR) algorithm on each of the relevance rankings we used. Findings: Findings suggest that terms used in different contexts in the papers might sometimes be overlooked by the topic modeling algorithm. Yet fusing citation data with the relevance ranking lists provides additional contextual information, thereby further enriching the results with diverse (interdisciplinary) papers of higher relevance. Moreover, results can easily be re-ranked and personalized. Implications: We argue that once it is tested on dynamic corpora for computational load, robustness, replicability, and scalability, the proposed method can in time be used in both local and international information systems such as TR-Dizin, Web of Science, and Scopus. Originality: The proposed method is, as far as we know, the first to show that relevance rankings produced with a topic modeling algorithm can be incrementally refined using pennant retrieval techniques based on citation data.
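To make the MMR step more concrete, here is a minimal sketch of greedy MMR re-ranking in its standard form: at each step it picks the document that maximizes lam * relevance minus (1 - lam) * the highest similarity to any document already selected. The function name `mmr_rerank`, the `lam` default, and the toy scores are assumptions for illustration; the abstract does not state the parameter values or similarity measure used in the study.

```python
def mmr_rerank(relevance, similarity, lam=0.7, k=None):
    """Greedy Maximal Marginal Relevance re-ranking.

    relevance:  doc id -> relevance score with respect to the query
    similarity: callable (doc_a, doc_b) -> similarity between two documents
    lam:        trade-off; 1.0 means pure relevance, lower values favor diversity
    k:          number of documents to return (all of them if None)
    """
    candidates = set(relevance)
    selected = []
    limit = len(candidates) if k is None else min(k, len(candidates))

    while len(selected) < limit:
        def mmr(doc):
            # Redundancy is the closest match among documents already chosen.
            redundancy = max((similarity(doc, s) for s in selected), default=0.0)
            return lam * relevance[doc] - (1.0 - lam) * redundancy

        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(candidates), key=mmr)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical scores: d1 and d2 are near-duplicates, d3 covers another aspect.
toy_relevance = {"d1": 0.9, "d2": 0.85, "d3": 0.5}


def toy_similarity(a, b):
    return 0.95 if {a, b} == {"d1", "d2"} else 0.1


diverse = mmr_rerank(toy_relevance, toy_similarity, lam=0.5)
by_relevance = mmr_rerank(toy_relevance, toy_similarity, lam=1.0)
```

With `lam=0.5` the near-duplicate `d2` drops below the less relevant but different `d3`, while `lam=1.0` reproduces the plain relevance order. Adjusting `lam` is one simple way a result list could be re-ranked toward a searcher's preference for diversity.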

Keywords

Relevance rankings, probabilistic topic modeling, pennant retrieval, Maximal Marginal Relevance (MMR), Latent Dirichlet Allocation (LDA) algorithm

Carbonell, J. ve Goldstein, J. (1998). The use of MMR, diversity-based reranking for reordering documents and producing summaries. Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval içinde (s. 335-336). Association for Computing Machinery. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.188.3982&rep=rep1&type=pdf
Carevic, Z. ve Mayr, P. (2014). Recommender systems using pennant diagrams in digital libraries. arXiv. https://arxiv.org/pdf/1407.7276v1.pdf
Carevic, Z. ve Schaer, P. (2014). On the connection between citation-based and topical relevance ranking: results of a pretest using iSearch. Proceedings of the First Workshop on Bibliometric-enhanced Information Retrieval co-located with 36th European Conference on Information Retrieval (ECIR 2014) içinde (s. 37-44). Springer-Verlag. https://ceur-ws.org/Vol-1143/paper5.pdf
Carroll, M. (2018). Changes in media coverage of GCSEs from 1988 to 2017. Cambridge. https://www.cambridgeassessment.org.uk/Images/504456-changes-in-media-coverage-of-gcses-from-1988-to-2017.pdf
Chang, J., Gerrish, S., Wang, C., Boyd-Graber, J. L. ve Blei, D. M. (2009). Reading tea leaves: How humans interpret topic models. Advances in Neural Information Processing Systems içinde (s. 288-296). MIT Press. https://proceedings.neurips.cc/paper/2009/file/f92586a25bb3145facd64ab20fd554ff-Paper.pdf
Chen, M. ve Décary, M. (2018). A Cognitive-based semantic approach to deep content analysis in search engines. 2018 IEEE 12th International Conference on Semantic Computing (ICSC) içinde (s. 131-139). IEEE. https://doi.ieeecomputersociety.org/10.1109/ICSC.2018.00027
Chen, Z. ve Liu, B. (2014). Mining topics in documents: Standing on the shoulders of big data. Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining içinde (s. 1116-1125). ACM. https://dl.acm.org/doi/pdf/10.1145/2623330.2623622
Cooper, W. S. (1988). Getting beyond boole. Information Processing & Management, 24(3), 243-248. https://doi.org/10.1016/0306-4573(88)90091-X
Croft W. B. (2002). Combining approaches to information retrieval. W.B. Croft (Yay. haz.). Advances in Information Retrieval. The Information Retrieval Series, vol 7. içinde (s. 1-35). Springer, https://doi.org/10.1007/0-306-47019-5_1
Danilov, M. (2005). Experimental review on pentaquarks. arXiv. https://arxiv.org/abs/hep-ex/0509012
Deveaud, R., SanJuan, E. ve Bellot, P. (2014). Accurate and effective latent concept modeling for ad hoc information retrieval. Document Numérique, 17(1), 61-84. https://doi.org/10.3166/dn.17.1.61-84
Ekinci, E. ve İlhan Omurca, S. (2020). Concept-LDA: Incorporating Babelfy into LDA for aspect extraction. Journal of Information Science, 46(3), 406-418. https://doi.org/10.1177/0165551519845854
Ganguly, D. ve Jones, G. J. (2018). A non-parametric topical relevance model. Information Retrieval Journal, 21(5), 449-479. https://doi.org/10.1007/s10791-018-9329-y
Giustolisi, O., Ridolfi, L. ve Simone, A. (2020). Embedding the intrinsic relevance of vertices in network analysis: the case of centrality metrics. Scientific Reports, 10(1). https://doi.org/10.1038/s41598-020-60151-x
Gläser, J., Glänzel, W. ve Scharnhorst, A. (2017). Same data—different results? Towards a comparative approach to the identification of thematic structures in science. Scientometrics, 111(2), 981-998. https://doi.org/10.1007/s11192-017-2296-z
Griffiths, T. L. ve Steyvers, M. (2004). Finding scientific topics. Proceedings of the National Academy of Sciences, 101(1), 5228-5235. https://doi.org/10.1073/pnas.0307752101
Guillemette, J., Simms, B., Zhou, D. ve Mills, S. (2017). Applying latent dirichlet allocation to yelp reviews. https://people.math.carleton.ca/~smills/2017-18/STAT4601-5703/Research%20Projects/2018%20Submissions/GuillemetteSimmsZhouD/Applying%20LDA.pdf
Guo, J., Fan, Y., Ai, Q. ve Croft, W. B. (2016). A deep relevance matching model for ad-hoc retrieval. Proceedings of the 25th ACM International on Conference on Information and Knowledge Management içinde (s. 55-64). ACM. https://doi.org/10.1145/2983323.2983769
Guo, Z., Zhang, Z. M., Zhu, S., Chi, Y. ve Gong, Y. (2013). A two-level topic model towards knowledge discovery from citation networks. IEEE Transactions on Knowledge and Data Engineering, 26(4), 780-794. https://doi.org/10.1109/TKDE.2013.56
Han, X. (2020). Evolution of research topics in LIS between 1996 and 2019: an analysis based on latent dirichlet allocation topic model. Scientometrics, 125(3), 2561-2595. https://doi.org/10.1007/s11192-020-03721-0
Hecking, T. ve Leydesdorff, L. (2018). Topic modelling of empirical text corpora: Validity, reliability, and reproducibility in comparison to semantic maps. arXiv. https://arxiv.org/pdf/1806.01045.pdf
Herlocker, J.L., Konstan, J. A., Terveen, L. G. ve Riedl, J. T. (2004) Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems, 22(1), 5-53. https://doi.org/10.1145/963770.963772
Holliger, T. S. (2018). Strategic sourcing via category management: Helping air force installation contracting agency eat one piece of the elephant [Yayımlanmamış yüksek lisans tezi]. Air Force Institute of Technology. https://apps.dtic.mil/sti/pdfs/AD1056353.pdf
Huang, L., Liu, H., He, J. ve Du, X. (2016). Finding latest influential research papers through modeling two views of citation links. F. Li, K. Shim, K. Zheng ve G. Liu (Yay. haz.) Web Technologies and Applications APWeb 2016 içinde (s. 555-566). Springer, Cham. https://doi.org/10.1007/978-3-319-45814-4_45
Huang, X., Chen, C., Peng, C., Wu, X., Fu, L. ve Wang, X. (2018). Topic-sensitive influential paper discovery in citation network. D. Phung, V. Tseng, G. Webb, B. Ho, M. Ganji ve L. Rashidi (Yay. haz.). Advances in Knowledge Discovery and Data Mining içinde (s. 16-28). Springer, Cham. https://doi.org/10.1007/978-3-319-93037-4_2
Jin, R., Valizadegan, H. ve Li, H. (2008). Ranking refinement and its application to information retrieval. Proceedings of the 17th International Conference on World Wide Web içinde (s. 397-406). ACM. http://doi.org/10.1145/1367497.1367552
Ke, Q., Ferrara, E., Radicchi, F. ve Flammini, A. (2015). Defining and identifying Sleeping Beauties in science. Proceedings of the National Academy of Sciences, 112(24), 7426-7431. https://doi.org/10.1073/pnas.1424329112
Kessler, M. M. (1963). Bibliographic coupling between scientific papers. American Documentation, 14(1): 10-25
Knoth, P., Anastasiou, L., Charalampous, A., Cancellieri, M., Pearce, S., Pontika, N. ve Bayer, V. (2017). Towards effective research recommender systems for repositories. arXiv. https://arxiv.org/abs/1705.00578
Kucuktunc, O. ve Ferhatosmanoglu, H. (2011). λ-diverse nearest neighbors browsing for multidimensional data. IEEE Transactions on Knowledge and Data Engineering, 25(3), 481-493. https://doi.org/10.1109/TKDE.2011.251
Küçüktunç, O., Saule, E., Kaya, K. ve Çatalyürek, Ü. V. (2015). Diversifying citation recommendations. ACM Transactions on Intelligent Systems and Technology, 5(4), 1-21. https://doi.org/10.1145/2668106
Lei, M., Wang, J., Chen, B. ve Li, X. (2001). Improved relevance ranking in WebGather. Journal of Computer Science and Technology, 16(5), 410-417. https://doi.org/10.1007/bf02948958
Leydesdorff, L. ve Nerghes, A. (2017). Co‐word maps and topic modeling: A comparison using small and medium‐sized corpora (N< 1,000). Journal of the Association for Information Science and Technology, 68(4), 1024-1035. https://doi.org/10.1002/asi.23740
Li, C., Feng, H. ve Rijke, M. D. (2020). Cascading hybrid bandits: online learning to rank for relevance and diversity. Fourteenth ACM Conference on Recommender Systems içinde (s. 33-42). ACM. https://doi.org/10.1145/3383313.3412245
Li, W. ve McCallum, A. (2006). Pachinko allocation: DAG-structured mixture models of topic correlations. Proceedings of the 23rd International Conference on Machine Learning içinde (s. 577-584). Springer. https://doi.org/10.1145/1143844.1143917
Li, Y., He, J. ve Liu, H. (2017). Topic analysis and influential paper discovery on scientific publications. 2017 14th Web Information Systems and Applications Conference (WISA) içinde (s. 68-73). IEEE. https://doi.org/10.1109/WISA.2017.69
Liu, X., Wang, G. ve Zakirul Alam Bhuiyan, M. (2022). Re‐ranking with multiple objective optimization in recommender system. Transactions on Emerging Telecommunications Technologies, 33(1): e4398 https://doi.org/10.1002/ett.4398
Liu, X. Z. ve Fang, H. (2020). A comparison among citation-based journal indicators and their relative changes with time. Journal of Informetrics, 14(1), 1-17. https://doi.org/10.1016/j.joi.2020.101007
Lykke, M., Larsen, B., Lund, H. ve Ingwersen, P. (2010). Developing a test collection for the evaluation of integrated search. European Conference on Information Retrieval içinde (s. 627-630). Springer. https://doi.org/10.1007/978-3-642-12275-0_63
Ma, Z., Liu, Y., Yang, Z., Yang, J. ve Li, K. (2022). A parameter-free approach to lossless summarization of fully dynamic graphs. Information Sciences, 589, 376-394. https://doi.org/10.1016/j.ins.2021.12.116
Mahajan, M., Beeferman, D. ve Huang, X. D. (1999). Improved topic-dependent language modeling using information retrieval techniques. 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings içinde (s. 541-544). IEEE. https://doi.org/10.1109/ICASSP.1999.758182
Manning, C. ve Schütze, H. (2000). Foundations of statistical natural language processing. MIT Press. https://ics.upjs.sk/~pero/web/documents/pillar/Manning_Schuetze_Statistical NLP.pdf
Maron, M. E. ve Kuhns, J. L. (1960). On relevance, probabilistic indexing and information retrieval. Journal of the ACM, 7(3), 216-244. https://doi.org/10.1145/321033.321035
Marujo, L., Ribeiro, R., Gershman, A., De Matos, D.M., Neto, J.P. ve Carbonell, J. (2017). Event-based summarization using a centrality-as-relevance model. Knowledge and Information Systems, 50, 945–968. https://doi.org/10.1007/s10115-016-0966-4
Mayr, P. ve Mutschke, P. (2013). Bibliometric-enhanced retrieval models for big scholarly information systems. 2013 IEEE International Conference on Big Data içinde (s. 5-8). IEEE. https://doi.org/10.1109/BigData.2013.6691762
McNee, S. M., Riedl, J. ve Konstan, J. A. (2006). Being accurate is not enough: how accuracy metrics have hurt recommender systems. CHI'06 extended abstracts on human factors in computing systems içinde (s. 1097-1101). https://doi.org/10.1145/1125451.1125659
Meng, W., Yu, C. ve Liu, K. L. (2002). Building efficient and effective metasearch engines. ACM Computing Surveys (CSUR), 34(1), 48-89. https://doi.org/10.1145/505282.505284
Mizzaro, S. (1997). Relevance: The whole history. Journal of the American Society for Information Science, 48, 810-832. https://doi.org/10.1002/(SICI)1097-4571(199709)48:9<810::AID-ASI6>3.0.CO;2-U
Nguyen, T. ve Do, P. (2018). CitationLDA++ an extension of LDA for discovering topics in document network. Proceedings of the Ninth International Symposium on Information and Communication Technology içinde (s. 31-37). ACM. https://doi.org/10.1145/3287921.3287930
Nikita, M. (2020, 20 Nisan). Select number of topics for LDA. https://cran.r-project.org/web/packages/ldatuning/vignettes/topics.html
Nolasco, D. ve Oliveira, J. (2016). Detecting knowledge innovation through automatic topic labeling on scholar data. 2016 49th Hawaii International Conference on System Sciences (HICSS) içinde (s. 358-367). IEEE. https://doi.org/10.1109/HICSS.2016.51
Pao, M. L. (1993). Term and citation retrieval: A field study. Information Processing & Management. 29(1), 95-112. https://doi.org/10.1016/0306-4573(93)90026-A
Ponweiser, M. (2012). Latent dirichlet allocation in R. [Yayımlanmamış yüksek lisans tezi]. Viyana Üniversitesi. https://epub.wu.ac.at/id/eprint/3558
Rafols, I., Leydesdorff, L., O’Hare, A., Nightingale, P. ve Stirling, A. (2012). How journal rankings can suppress interdisciplinary research: A comparison between innovation studies and business & management. Research Policy, 41(7), 1262-1282. https://doi.org/10.1016/j.respol.2012.03.015
Ribeiro, R., ve de Matos, D.M. (2011). Revisiting Centrality-as-relevance: support sets and similarity as geometric proximity. Journal of Artificial Intelligence Research, 42, 275-308. https://doi.org/10.1613/jair.3387
Robertson, S. E. (1977). The probability ranking principle in IR. Journal of Documentation, 33(4), 294-304. https://doi.org/10.1108/eb026647
Rüdiger, M. S., Antons, D. ve Salge, T. O. (2021). The explanatory power of citations: a new approach to unpacking impact in science. Scientometrics, 126, 9779-9809. https://doi.org/10.1007/s11192-021-04103-w
Salton, G., Yang, C. ve Wong, A. (1975). A vector space model for automatic indexing. Communications of the ACM, 18, 613-620. https://doi.org/10.1145/361219.361220
Samraj, B. (2005). An exploration of a genre set: Research article abstracts and introductions in two disciplines. English for Specific Purposes, 24(2), 141-156. https://doi.org/10.1016/j.esp.2002.10.001
Saracevic, T. (2021). Relevance: In search of a theoretical foundation. D. H. Sonnenwald (Yay. haz.), Theory Development in the Information Sciences içinde (s. 141-163). University of Texas Press. https://doi.org/10.7560/308240-011
Sperber, D. ve Wilson, D. (1995). Relevance: Communication and cognition. Blackwell. https://monoskop.org/images/e/e6/Sperber_Dan_Wilson_Deirdre_Relevance _Communica_and_Cognition_2nd_edition_1996.pdf
Swanson, D. R. (1986a). Subjective versus objective relevance in bibliographic retrieval systems. The Library Quarterly, 56(4), 389-398. https://doi.org/10.1086/601800
Swanson, D. R. (1986b). Fish oil, Raynaud’s syndrome, and undiscovered public knowledge. Perspectives in Biology and Medicine, 30(1):7-18. https://doi.org/10.1353/pbm.1986.0087
Thara, D. K., PremaSudha, B. G. ve Xiong, F. (2019). Auto-detection of epileptic seizure events using deep neural network with different feature scaling techniques. Pattern Recognition Letters, 128, 544-550. https://doi.org/10.1016/j.patrec.2019.10.029
Thompson, P. (2007). Looking back: On relevance, probabilistic indexing and information retrieval. Information Processing & Management, 44(2), 963-970. https://doi.org/10.1016/j.ipm.2007.10.002
Tonta, Y. (1995). Bilgi erişim sistemleri. Türk Kütüphaneciliği, 9(3), 302-314. https://eprints.rclis.org/9571/
Tonta, Y. ve Akbulut, M. (2021). Uluslararası dergilerde yayımlanan Türkiye adresli makalelerin atıf etkisini artıran faktörler. Türk Kütüphaneciliği, 35(3), 388-409. https://doi.org/10.24146/tk.933159
Vergoulis, T., Chatzopoulos, S., Kanellos, I., Deligiannis, P., Tryfonopoulos, C. ve Dalamagas, T. (2019). BIP! finder: Facilitating scientific literature search by exploiting impact-based ranking. Proceedings of the 28th ACM International Conference on Information and Knowledge Management içinde (s. 2937-2940). ACM. https://doi.org/10.1145/3357384.3357850
Verma, M., Yılmaz, E. ve Craswell, N. (2016). On obtaining effort based judgements for information retrieval. Proceedings of the 9th ACM International Conference on Web Search and Data Mining içinde (s. 277-286). ACM. https://doi.org/10.1145/2835776.2835840
Wang, X., Zhai, C. ve Roth, D. (2013). Understanding evolution of research themes: a probabilistic generative model for citations. Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining içinde (s. 1115-1123). ACM. https://doi.org/10.1145/2487575.2487698
White, H. D. (2007a). Combining bibliometrics, information retrieval, and relevance theory. Part 1: First examples of a synthesis. Journal of the American Society for Information Science and Technology, 58, 536-559. https://doi.org/10.1002/asi.20543
White, H. D. (2007b). Combining bibliometrics, information retrieval, and relevance theory. Part 2: Some implications for information science. Journal of the American Society for Information Science and Technology, 58, 583-605. https://doi.org/10.1002/asi.20542
White, H. D. (2009). Pennants for Strindberg and Persson. Celebrating scholarly communication studies: A festschrift for Olle Persson at his 60th birthday. Special volume of the E-newsletter of the International Society for Scientometrics and Informetrics, 5, 71-83. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.168.2055&rep=rep1&type=pdf#page=73
White, H. D. (2010). Some new tests of relevance theory in information science. Scientometrics, 83, 653-667. https://doi.org/10.1007/s11192-009-0138-3
White, H. D. (2015). Co-cited author retrieval and relevance theory: examples from the humanities. Scientometrics, 102(3), 2275-2299. https://doi.org/10.1007/s11192-014-1483-4
White, H. D. (2016). Bag of works retrieval: TF*IDF weighting of co-cited works. Proceedings of the 3rd Workshop on Bibliometric-Enhanced Information Retrieval (BIR2016) içinde (s. 63-72). https://ceur-ws.org/Vol-1567/paper7.pdf
White, H. D. (2018). Bag of works retrieval: TF*IDF weighting of co-cited works with a seed. International Journal of Digital Libraries, 19, 139-149. https://doi.org/10.1007/s00799-017-0217-7
White, H. D. ve McCain, K. W. (1998). Visualizing a discipline: An author co-citation analysis of information science, 1972-1995. Journal of the American Society for Information Science, 49(4): 327-355. https://doi.org/10.1002/(SICI)1097-4571(19980401)49:4%3C327::AID-ASI4%3E3.0.CO;2-4
Wilson, P. (1978). Some fundamental concepts of information retrieval. Drexel Library Quarterly, 14(2), 10-24.
Wilson, D. ve Sperber, D. (2002). Relevance theory. G. Ward ve L. Horn (Yay. haz.) Handbook of pragmatics içinde (s. 1-55). Blackwell. https://jeannicod.ccsd.cnrs.fr/ijn_00000101/document
Wu, H. C., Luk, R. W., Wong, K. F. ve Kwok, K. L. (2007). A retrospective study of a hybrid document-context based retrieval model. Information Processing & Management, 43(5), 1308-1331. https://doi.org/10.1016/j.ipm.2006.10.009
Wu, J., Son, G. ve Wang, S. (2020). A competency mining method based on Latent Dirichlet Allocation (LDA) model. Journal of Physics: Conference Series (Vol. 1682, No. 1, p. 012059) içinde. IOP Publishing. https://iopscience.iop.org/article/10.1088/1742-6596/1682/1/012059/meta
Xia, H., Li, J., Tang, J. ve Moens MF. (2012). Plink-LDA: Using link as prior information in topic modeling. S. Lee, Z. Peng, X. Zhou, Y. S. Moon, R. Unland ve J. Yoo (Yay. haz.) Database Systems for Advanced Applications içinde (s. 213-227). Springer. https://doi.org/10.1007/978-3-642-29038-1_17
Xie, X., Liang, Y., Li, X. ve Tan, W. (2019). CuLDA_CGS: Solving large-scale LDA problems on GPUs. Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming içinde (s. 435-436). ACM. https://doi.org/10.1145/3293883.3301496
Yang, H. T., Ju, J. H., Wong, Y. T., Shmulevich, I. ve Chiang, J. H. (2017). Literature-based discovery of new candidates for drug repurposing. Briefings in Bioinformatics, 18(3), 488-497. https://doi.org/10.1093/bib/bbw030
Yang, L., Ji, D. ve Leong, M. (2007). Document reranking by term distribution and maximal marginal relevance for Chinese information retrieval. Information Processing & Management, 43(2), 315-326. https://doi.org/10.1016/j.ipm.2006.07.011
Yılmaz, E., Verma, M., Craswell, N., Radlinski, F. ve Bailey, P. (2014). Relevance and effort: An analysis of document utility. Proceedings of the 23rd ACM International Conference on Information and Knowledge Management içinde (s. 91-100). ACM. https://doi.org/10.1145/2661829.2661953
Zarrinkalam, F. ve Kahani, M. (2012). A new metric for measuring relatedness of scientific papers based on non-textual features. Intelligent Information Management, 4(4), 99-107. https://www.scirp.org/pdf/IIM20120400001_98298896.pdf
Zhou, H. K., Yu, H. M. ve Hu, R. (2017). Topic discovery and evolution in scientific literature based on content and citations. Frontiers of Information Technology & Electronic Engineering, 18(10), 1511-1524. https://doi.org/10.1631/FITEE.1601125
Zou, L., Liu, X., Buntine, W. ve Liu, Y. (2021). Citation context-based topic models: discovering cited and citing topics from full text. Library Hi Tech, 39(4), 1063-1083. https://doi.org/10.1108/LHT-01-2021-0041

Toplam 110 adet kaynakça vardır.

Details

Primary Language	Turkish
Subjects	Library and Information Studies
Section	Research Articles
Authors	Müge Akbulut (ORCID 0000-0003-0026-6485); Yaşar Tonta (ORCID 0000-0003-0285-1338)
Publication Date	June 30, 2022
Submission Date	January 25, 2022
Acceptance Date	April 10, 2022
Published in Issue	Year 2022, Volume: 36, Issue: 2

Cite This Article

APA	Akbulut, M., & Tonta, Y. (2022). İlgi Sıralamalarının Artırımlı Olarak Geliştirilmesi: Pennant Erişimle Desteklenen Yeni Bir Yöntem Önerisi. Türk Kütüphaneciliği, 36(2), 169-203. https://doi.org/10.24146/tk.1062751

This journal's content is licensed under CC BY 4.0.