Comparative Analysis of Globalisation Techniques for Medical Document Classification

Bekir Parlak; Salih Berkan Aydemir

doi:10.55195/jscai.1216800

Research Article

Year 2023, Volume: 4 Issue: 1, 7 - 14, 25.06.2023

Abstract

Medical document classification is an important topic in text mining, and globalisation techniques play a major role in text classification. This study presents a detailed analysis of two datasets, one with English and one with Turkish content, built from the abstracts of Turkish medical articles; the two datasets contain the Turkish and English abstracts of the same articles. The aim is to observe how successful local feature selection methods affect classification performance on these two equivalent datasets when different globalisation techniques are applied. The feature selection methods used are CHI2, MI, OR, and WLLR. The globalisation techniques are SUM, AVG, and MAX. The classifiers are MNB, DT, and SVM.
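
As an illustration of what globalisation means here, the following minimal Python sketch collapses per-class (local) feature scores into a single global score per term using SUM, AVG, and MAX. It is not the authors' implementation: the function names `globalise` and `top_k`, the toy terms, the scores, and the class priors are all hypothetical, and AVG is assumed to be the class-prior-weighted average that is common in the feature selection literature (the abstract does not define it).

```python
from typing import Dict, List


def globalise(local_scores: Dict[str, List[float]],
              class_priors: List[float],
              how: str) -> Dict[str, float]:
    """Collapse per-class (local) scores into one global score per term.

    local_scores maps a term to its score for each class, e.g. as produced
    by a local feature selection method such as CHI2 or OR.
    """
    if how == "SUM":
        # Add the local scores over all classes.
        return {t: sum(s) for t, s in local_scores.items()}
    if how == "AVG":
        # Assumption: average weighted by class prior P(c_k).
        return {t: sum(p * x for p, x in zip(class_priors, s))
                for t, s in local_scores.items()}
    if how == "MAX":
        # Keep the largest local score over all classes.
        return {t: max(s) for t, s in local_scores.items()}
    raise ValueError(f"unknown globalisation technique: {how}")


def top_k(global_scores: Dict[str, float], k: int) -> List[str]:
    """Return the k terms with the highest global score."""
    return sorted(global_scores, key=global_scores.get, reverse=True)[:k]


# Hypothetical local scores for three terms over three classes.
local = {
    "fever": [4.0, 1.0, 1.0],
    "dose": [2.5, 2.5, 2.5],
    "tumor": [0.0, 0.0, 5.0],
}
priors = [0.6, 0.2, 0.2]  # hypothetical class priors, summing to 1

for technique in ("SUM", "AVG", "MAX"):
    print(technique, top_k(globalise(local, priors, technique), 2))
```

On this toy input the three techniques select different top-2 term sets, which is the kind of effect on the downstream classifier that the study compares.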

Keywords

Medical documents, Text Classification, Feature selection, Globalisation techniques



There are 23 citations in total.

Details

Primary Language	English
Subjects	Computer Software
Journal Section	Research Articles
Authors	Bekir Parlak (ORCID 0000-0001-8919-6481); Salih Berkan Aydemir (ORCID 0000-0003-0069-3479)
Early Pub Date	June 30, 2023
Publication Date	June 25, 2023
Submission Date	December 9, 2022
Published in Issue	Year 2023 Volume: 4 Issue: 1

Cite

APA	Parlak, B., & Aydemir, S. B. (2023). Comparative Analysis of Globalisation Techniques for Medical Document Classification. Journal of Soft Computing and Artificial Intelligence, 4(1), 7-14. https://doi.org/10.55195/jscai.1216800
AMA	Parlak B, Aydemir SB. Comparative Analysis of Globalisation Techniques for Medical Document Classification. JSCAI. June 2023;4(1):7-14. doi:10.55195/jscai.1216800
Chicago	Parlak, Bekir, and Salih Berkan Aydemir. “Comparative Analysis of Globalisation Techniques for Medical Document Classification”. Journal of Soft Computing and Artificial Intelligence 4, no. 1 (June 2023): 7-14. https://doi.org/10.55195/jscai.1216800.
EndNote	Parlak B, Aydemir SB (June 1, 2023) Comparative Analysis of Globalisation Techniques for Medical Document Classification. Journal of Soft Computing and Artificial Intelligence 4 1 7–14.
IEEE	B. Parlak and S. B. Aydemir, “Comparative Analysis of Globalisation Techniques for Medical Document Classification”, JSCAI, vol. 4, no. 1, pp. 7–14, 2023, doi: 10.55195/jscai.1216800.
ISNAD	Parlak, Bekir - Aydemir, Salih Berkan. “Comparative Analysis of Globalisation Techniques for Medical Document Classification”. Journal of Soft Computing and Artificial Intelligence 4/1 (June 2023), 7-14. https://doi.org/10.55195/jscai.1216800.
JAMA	Parlak B, Aydemir SB. Comparative Analysis of Globalisation Techniques for Medical Document Classification. JSCAI. 2023;4:7–14.
MLA	Parlak, Bekir and Salih Berkan Aydemir. “Comparative Analysis of Globalisation Techniques for Medical Document Classification”. Journal of Soft Computing and Artificial Intelligence, vol. 4, no. 1, 2023, pp. 7-14, doi:10.55195/jscai.1216800.
Vancouver	Parlak B, Aydemir SB. Comparative Analysis of Globalisation Techniques for Medical Document Classification. JSCAI. 2023;4(1):7-14.

This work is licensed under a Creative Commons Attribution 4.0 International License.