A Genetic Feature Selection Approach for Detecting Web Application Attacks Using Machine Learning Methods
Year 2021,
Volume: 14 Issue: 2, 109 - 119, 22.12.2021
Hüseyin Ahmetoğlu, Resul Daş
Abstract
Applications on the Internet harbor a number of security concerns that stem from how they are coded. Weaknesses or vulnerabilities allow criminals to gain direct and public access to databases in order to steal sensitive data. This study proposes an approach based on heuristic feature selection and machine learning so that hybrid intrusion detection systems can detect web application attacks more easily and more accurately. Web application attack samples and normal flow samples from the CIC-IDS2017 and CSE-CIC-IDS2018 datasets were combined after a series of data preprocessing stages to form a new dataset. Mean squared error and feature count were optimized using a Genetic Algorithm and Logistic Regression, and the results were tested with five different machine learning algorithms. The results show that classification performance remained at the 99% level even though the number of features was reduced by 85%.
Genetic Feature Selection Approach in Detection of Web Application Attacks Using Machine Learning Methods
Abstract
Applications on the Internet carry a number of security risks that originate in their code. Weaknesses or vulnerabilities allow criminals to gain direct and public access to databases and steal sensitive data. This study proposes an approach based on heuristic feature selection and machine learning for easier and more accurate detection of web application attacks with hybrid intrusion detection systems. Web application attack samples and benign flow samples from the CIC-IDS2017 and CSE-CIC-IDS2018 datasets were combined after a series of data preprocessing stages, and a new dataset was created. A Genetic Algorithm with Logistic Regression was used to optimize mean squared error and feature count, and the results were tested with five different machine learning algorithms. The classification success rate remained at the 99% level even though the number of features was reduced by 85%.
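The abstract pairs a Genetic Algorithm with Logistic Regression to trade off mean squared error against feature count. A minimal pure-Python sketch of that kind of wrapper is given below. It is not the authors' implementation: the synthetic data, the weighted-sum cost with `alpha`, the GA operators, and every function name and parameter value are assumptions chosen for illustration.

```python
import math
import random


def make_data(n=120, n_features=12, n_informative=3, seed=0):
    """Synthetic stand-in for labelled flow records (an assumption); only
    the first n_informative columns depend on the class label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        shift = 2.0 * label - 1.0  # -1 for benign, +1 for attack
        X.append([rng.gauss(shift if j < n_informative else 0.0, 1.0)
                  for j in range(n_features)])
        y.append(label)
    return X, y


def predict_proba(w, row):
    """Logistic model output; w holds one weight per feature plus a bias."""
    z = w[-1] + sum(wi * xi for wi, xi in zip(w, row))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, epochs=15, lr=0.1):
    """Logistic regression fitted with plain stochastic gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for row, target in zip(X, y):
            g = predict_proba(w, row) - target
            for j, xj in enumerate(row):
                w[j] -= lr * g * xj
            w[-1] -= lr * g
    return w


def fitness(mask, X_tr, y_tr, X_va, y_va, alpha=0.9):
    """Cost to minimize: weighted sum of validation MSE and the fraction
    of features kept (the weighting scheme is an assumption)."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 1.0  # an empty subset gets the upper bound of the cost
    tr = [[r[j] for j in cols] for r in X_tr]
    va = [[r[j] for j in cols] for r in X_va]
    w = train_logreg(tr, y_tr)
    mse = sum((t - predict_proba(w, r)) ** 2 for r, t in zip(va, y_va)) / len(va)
    return alpha * mse + (1.0 - alpha) * len(cols) / len(mask)


def genetic_select(X_tr, y_tr, X_va, y_va, pop_size=10, generations=6,
                   p_mut=0.1, seed=1):
    """Elitist GA over bit masks: one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    n = len(X_tr[0])
    cache = {}

    def cost(mask):
        key = tuple(mask)
        if key not in cache:  # avoid retraining for a mask seen before
            cache[key] = fitness(mask, X_tr, y_tr, X_va, y_va)
        return cache[key]

    # Start from the full feature set plus random masks.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        elite = sorted(pop, key=cost)[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
            children.append([bit ^ 1 if rng.random() < p_mut else bit
                             for bit in child])
        pop = elite + children
    return min(pop, key=cost)


X, y = make_data()
X_tr, y_tr, X_va, y_va = X[:80], y[:80], X[80:], y[80:]
best = genetic_select(X_tr, y_tr, X_va, y_va)
print("selected columns:", [j for j, bit in enumerate(best) if bit])
```

Because the initial population contains the all-features mask and the best half of each generation is carried over unchanged, the returned subset never scores worse than the full feature set under this particular cost. A real experiment would replace `make_data` with the preprocessed flow records and would probably use a library classifier instead of the hand-written one.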