Research Article

Web Sitelerinde Gerçekleştirilen Oltalama Saldırılarının Yapay Zekâ Yaklaşımı ile Tespiti

Year 2021, Volume: 10 Issue: 4, 1603 - 1614, 31.12.2021
https://doi.org/10.17798/bitlisfen.988001

Abstract

Oltalama, kişisel bilgilerin internet üzerinden çalınmasına yönelik gerçekleştirilen yazılım tabanlı saldırılardır. Oltalama saldırılarında genellikle kişilerin kimlik bilgileri, kullanıcı parolaları, kredi veya banka kartı bilgileri gibi özel bilgilerin ele geçirilmesi amaçlanır. Bunun için en uygun ortam olarak genelde özel yazılım kodları içeren web sitesi uygulamaları veya elektronik posta sistemleri tercih edilir. Bu tür net uygulamalarında gelen cezbedici görsel veya metin tabanlı iletiler bireyleri yemleyerek saldırıların gerçekleştirilmesini sağlar. Milyarlarca insanın etkileşim içerisinde olduğu internet ortamında bu tür saldırıların önlemini zamanında alabilmek için teknolojik gelişmelerle paralel hareket etmek gerekir. Son zamanlarda, yapay zekâ teknolojileri internet güvenliği alanında adını duyurmayı başarmıştır. Bu çalışmada, makine öğrenme yöntemleri ile 11 binin üzerinde web sitesi incelenmiş ve oltalama saldırısı yapan web siteleri tespit edildi. Veri seti, 30 web parametresinden oluşmaktadır ve açık erişimlidir. Makine öğrenmesi yöntemleri ile her bir web sitesi için 30 özellik incelendi; oltalama saldırısını gerçekleştiren web siteleri ile gerçekleştirmeyen web siteleri sınıflandırıldı. Sonuç olarak, en iyi test doğruluk başarısı Rastgele Orman yöntemi ile %96,53 oranında gerçekleştirildi.


Detection of Phishing Attacks on Websites Using Artificial Intelligence Approach


Abstract

Phishing attacks are software-based attacks that aim to steal personal information over the internet. They generally target private information such as identity details, user passwords, and credit or debit card information. Websites or e-mail systems containing specially crafted code are generally the preferred medium for such attacks. In these web applications, enticing visual or text-based messages lure individuals and enable the attacks. In an internet environment where billions of people interact, keeping pace with technological developments is necessary to take timely precautions against such attacks. Recently, artificial intelligence technologies have made a name for themselves in the field of internet security. In this study, more than 11 thousand websites were analyzed with machine learning methods, and the websites carrying out phishing attacks were identified. The dataset consists of 30 web parameters and is open access. Using machine learning methods, 30 features were examined for each website, and the websites were classified into those that carry out phishing attacks and those that do not. As a result, the best test accuracy, 96.53%, was achieved with the Random Forest method.
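
The classification step summarized in the abstract can be illustrated with a minimal sketch. It is not the author's implementation and does not reproduce the reported 96.53%. It trains a heavily simplified random forest (bootstrap samples, a random feature subset per tree, depth-1 trees, majority vote) on synthetic data and reports accuracy on a held-out test split. The feature encoding (-1, 0, 1), the labeling rule, the split sizes, and all function names are assumptions made for illustration only; only the count of 30 features comes from the abstract.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def fit_stump(X, y, feature_ids):
    """Fit a depth-1 tree: the (feature, threshold) split with lowest weighted Gini."""
    majority = Counter(y).most_common(1)[0][0]
    best = None  # (score, feature, threshold, left_label, right_label)
    for f in feature_ids:
        # Skip the largest value so the right-hand side is never empty.
        for t in sorted({row[f] for row in X})[:-1]:
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t,
                        Counter(left).most_common(1)[0][0],
                        Counter(right).most_common(1)[0][0])
    if best is None:  # every candidate feature is constant in this sample
        return (0, float("inf"), majority, majority)
    return best[1:]


def fit_forest(X, y, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    k = max(1, int(d ** 0.5))  # features considered per tree
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        feats = rng.sample(range(d), k)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Majority vote over all trees in the forest."""
    votes = Counter(left if row[f] <= t else right for f, t, left, right in forest)
    return votes.most_common(1)[0][0]


def accuracy(forest, X, y):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(forest, r) == lab for r, lab in zip(X, y)) / len(y)


def make_synthetic(n_rows, n_features=30, seed=1):
    """Synthetic stand-in data (assumption): features in {-1, 0, 1}, labels in {-1, 1}."""
    rng = random.Random(seed)
    X = [[rng.choice((-1, 0, 1)) for _ in range(n_features)] for _ in range(n_rows)]
    y = [1 if sum(row[:5]) > 0 else -1 for row in X]  # hypothetical labeling rule
    return X, y


X, y = make_synthetic(600)
X_train, y_train = X[:450], y[:450]  # hypothetical 75/25 train/test split
X_test, y_test = X[450:], y[450:]
forest = fit_forest(X_train, y_train, n_trees=50)
test_acc = accuracy(forest, X_test, y_test)
print(f"test accuracy: {test_acc:.4f}")
```

Depth-1 trees keep the sketch short; a practical random forest grows deeper trees and is normally taken from an established library rather than written by hand. Evaluating on rows that were not used for training mirrors the "test accuracy" wording of the abstract.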

There are 23 citations in total.

Details

Primary Language Turkish
Subjects Engineering
Journal Section Research Article
Authors

Mesut Toğaçar 0000-0002-8264-3899

Publication Date December 31, 2021
Submission Date August 27, 2021
Acceptance Date October 15, 2021
Published in Issue Year 2021 Volume: 10 Issue: 4

Cite

IEEE M. Toğaçar, “Web Sitelerinde Gerçekleştirilen Oltalama Saldırılarının Yapay Zekâ Yaklaşımı ile Tespiti”, Bitlis Eren Üniversitesi Fen Bilimleri Dergisi, vol. 10, no. 4, pp. 1603–1614, 2021, doi: 10.17798/bitlisfen.988001.
