Research Article

Performance Comparison of TOR Hidden Service Crawlers

Year 2019, Volume: 6 Issue: 2, 147 - 161, 26.12.2019
https://doi.org/10.35193/bseufbd.608555

Abstract

TOR (The Onion Routing) is a network structure that has grown popular in recent years because it provides anonymity to its users, and it is often preferred by hidden services with the onion extension. Because privacy is essential in this network, it attracts attention, and the amount of data stored in it increases day by day, which makes the data harder to scan and analyze. In addition, the requests made while scanning onion services are highly likely to be treated as a cyber-attack, causing access to the relevant address to be blocked. Various crawler software has been developed to scan the services (onion web pages) in this network and access their content. However, crawling here differs from crawling pages on the surface web with extensions such as com, net, and org. This is because the TOR network lies in the lower layers of the surface web, and its pages are accessed only through the TOR browser rather than traditional browsers (Chrome, Mozilla Firefox, etc.). The crawlers developed to date take this into account and, to protect confidentiality, obtain data by selecting a path through different relays for each request made to an address.
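
One way a crawler might ask for a separate path per request is sketched below. This is only an illustration under stated assumptions, not the method used by any of the crawlers evaluated in the study: it assumes a local Tor client listening on its default SOCKS port (127.0.0.1:9050) and relies on Tor's default behaviour of not sharing a circuit between streams that present different SOCKS credentials. The helper names `isolated_proxy_url` and `proxies_for_next_request` are hypothetical.

```python
import itertools
from urllib.parse import quote

# Assumption: a local Tor client exposes its SOCKS port here (Tor's default).
TOR_SOCKS_HOST = "127.0.0.1"
TOR_SOCKS_PORT = 9050

# Monotonic counter used to give every request its own isolation tag.
_request_ids = itertools.count(1)


def isolated_proxy_url(tag: str) -> str:
    """Build a SOCKS proxy URL whose credentials are unique to `tag`.

    Tor does not share a circuit between streams that present different
    SOCKS credentials (IsolateSOCKSAuth, on by default), so a fresh tag per
    request asks Tor for a separate circuit. The "socks5h" scheme leaves
    hostname resolution to the proxy, which .onion names require.
    """
    user = quote(tag, safe="")  # percent-encode so the tag is URL-safe
    return f"socks5h://{user}:x@{TOR_SOCKS_HOST}:{TOR_SOCKS_PORT}"


def proxies_for_next_request() -> dict:
    """Return a proxies mapping (the shape the `requests` library accepts)
    that carries a new isolation tag on every call."""
    url = isolated_proxy_url(f"crawl-{next(_request_ids)}")
    return {"http": url, "https": url}
```

With the third-party `requests` library installed with SOCKS support, the mapping could be passed as `requests.get(onion_url, proxies=proxies_for_next_request(), timeout=60)`. Separate circuits do not guarantee that every relay differs, since Tor may reuse the same entry guard.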




In the TOR network, every request a user sends reaches the target address by passing through different nodes, which slows the network down. In addition, the low performance of a browser that retrieves information through TOR leads to long waiting times. Working with crawler software that crawls and acquires information quickly will therefore improve researchers' analysis process. In this study, four different crawlers were evaluated against various criteria, both to guide those who will conduct research in this field and to assess the crawlers' strengths and weaknesses relative to one another. The study offers an important perspective on choosing the right crawler as an initial starting point for researchers who want to analyze TOR web services.
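
The abstract does not list the evaluation criteria, so the sketch below only illustrates how crawling speed could be measured in principle. Pages per second and success rate are hypothetical stand-in metrics, and `benchmark`, `CrawlStats`, and the `fetch` callable are names introduced here for illustration, not taken from the study.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class CrawlStats:
    pages_ok: int
    pages_failed: int
    elapsed_s: float

    @property
    def pages_per_second(self) -> float:
        # Throughput counts only pages that were actually retrieved.
        return self.pages_ok / self.elapsed_s if self.elapsed_s > 0 else 0.0

    @property
    def success_rate(self) -> float:
        total = self.pages_ok + self.pages_failed
        return self.pages_ok / total if total else 0.0


def benchmark(fetch: Callable[[str], str], urls: Iterable[str],
              clock: Callable[[], float] = time.perf_counter) -> CrawlStats:
    """Time `fetch` over `urls`; a raised exception counts as a failed page.

    `fetch` stands in for whatever crawler is under test (hypothetical),
    and `clock` is injectable so the measurement can be tested offline.
    """
    ok = failed = 0
    start = clock()
    for url in urls:
        try:
            fetch(url)
            ok += 1
        except Exception:
            failed += 1
    return CrawlStats(ok, failed, clock() - start)
```

Running the same URL list through each crawler's fetch function and comparing the resulting `CrawlStats` gives a like-for-like view, although TOR latency varies between runs, so repeated measurements would be needed before drawing conclusions.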





Turkish title: TOR Gizli Servis Tarayıcılarının Performans Karşılaştırması




Details

Primary Language English
Subjects Engineering
Section Articles
Authors

Merve Varol Arısoy 0000-0003-2085-1964

Ecir Uğur Küçüksille 0000-0002-3293-9878

Publication Date December 26, 2019
Submission Date August 21, 2019
Acceptance Date December 6, 2019
Published in Issue Year 2019 Volume: 6 Issue: 2

Cite

APA Varol Arısoy, M., & Küçüksille, E. U. (2019). Performance Comparison of TOR Hidden Service Crawlers. Bilecik Şeyh Edebali Üniversitesi Fen Bilimleri Dergisi, 6(2), 147-161. https://doi.org/10.35193/bseufbd.608555