Hipotiroidi Hastalığı Teşhisinde Sınıflandırma Algoritmalarının Kullanımı

Göksu Akgül; Ali Akın Çelik; Zeliha Ergül Aydın; Zehra Kamışlı Öztürk

doi:10.17671/gazibtd.710728

Research Article

Use of Classification Algorithms in Diagnosis of Hypothyroidism

Year 2020, Volume: 13 Issue: 3, 255 - 268, 31.07.2020

Cited By: 6

Abstract

Disease diagnosis is one of the most important problems in medicine. A disease becomes harder to diagnose when it has several types and shares symptoms with other diseases. For these reasons, hypothyroidism, one of the types of thyroid disease, is often diagnosed late, which lowers patients' quality of life. The purpose of this article is to propose a data mining-based system that increases the rate of correct hypothyroidism diagnoses by using the questions asked to patients during the diagnostic process and the results of the tests applied. The other aim is to reduce the complications that may arise from interventional tests used indirectly for diagnosis. To these ends, a data set from the UCI machine learning repository, consisting of 3163 samples of which 151 are hypothyroid and the rest are not, was used to predict whether new samples are hypothyroid. To deal with the imbalanced class distribution in the data, different sampling techniques were applied to the data set, and models for diagnosing hypothyroidism were built with Logistic Regression, K Nearest Neighbor, and Support Vector Machine classifiers. In this respect, the study demonstrates the effect of sampling methods on the diagnosis of hypothyroidism. Among the developed models, the Logistic Regression classifier trained on the data set to which oversampling techniques were applied gave the highest performance. The best results obtained with this classifier are 97.8% for accuracy, 82.26% for F-score, 93.2% for area under the curve, and 81.8% for the Matthews correlation coefficient.
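The reported metrics can be illustrated with a minimal sketch that computes accuracy, F-score, and the Matthews correlation coefficient from confusion-matrix counts. The function name `binary_metrics` and the all-negative baseline are illustrative assumptions, not part of the study; only the class counts (151 hypothyroid out of 3163) come from the abstract. Area under the curve is omitted because it needs per-sample scores rather than counts.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, F1, MCC) from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # F1 is the harmonic mean of precision and recall.
    # It is set to 0 here when there are no true or predicted positives.
    f1_den = 2 * tp + fp + fn
    f1 = 2 * tp / f1_den if f1_den else 0.0
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return accuracy, f1, mcc


# Class counts stated in the abstract.
N_TOTAL = 3163
N_HYPO = 151
N_NEG = N_TOTAL - N_HYPO

# Hypothetical baseline (an assumption for illustration):
# label every sample as "not hypothyroid".
baseline = binary_metrics(tp=0, fp=0, fn=N_HYPO, tn=N_NEG)
print("accuracy=%.3f f1=%.3f mcc=%.3f" % baseline)
```

Under these counts the all-negative baseline already reaches about 95% accuracy while its F-score and MCC are both zero, which suggests why the abstract reports F-score and MCC alongside accuracy for an imbalanced data set.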

Keywords

Disease diagnosis, Hypothyroidism, Data Mining, Logistic Regression, K Nearest Neighbor, Support Vector Machine

Abstract

Disease diagnosis is one of the most important problems encountered in medicine. The existence of different types of a given disease, and of symptoms shared with other diseases, makes diagnosis harder. For these reasons, hypothyroidism, one of the types of thyroid disease, is a disease whose diagnosis is delayed and which lowers patients' quality of life. The aim of this study is to propose a data mining-based system that will increase the rate of correct diagnosis of hypothyroidism by using the questions asked to patients during the diagnostic process and the results of the tests applied. The other aim is to reduce the complications that may arise from interventional tests used indirectly for diagnosis. In line with these aims, whether new samples are hypothyroid was predicted using a data set from the UCI machine learning database that consists of 3163 samples in total, 151 of them hypothyroid and the rest not. To eliminate the imbalanced distribution in the data set, different sampling techniques were applied to it, and models to diagnose hypothyroidism were built with Logistic Regression, K Nearest Neighbor, and Support Vector Machine classifiers. In this respect, the study has shown the effect of sampling methods on the diagnosis of hypothyroidism. Among the developed models, the highest performance was given by the Logistic Regression classifier trained with the data set to which oversampling techniques were applied. The best results obtained with this classifier are 97.8% for accuracy, 82.26% for F-score, 93.2% for area under the curve, and 81.8% for the Matthews correlation coefficient.
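As an illustration of the kind of sampling step the abstract refers to, the following is a minimal sketch of random oversampling, the simplest oversampling variant. The abstract does not state which oversampling techniques were used, so this choice, the function name `random_oversample`, and the placeholder feature values are all assumptions; only the class counts come from the abstract.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Rows belonging to this class in the original data.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Sample with replacement to fill the gap to the largest class.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Toy stand-in with the class counts from the abstract
# (1 = hypothyroid, 0 = not); the feature values are placeholders.
labels = [1] * 151 + [0] * 3012
rows = [[float(i)] for i in range(len(labels))]

bal_rows, bal_labels = random_oversample(rows, labels)
print(Counter(bal_labels))
```

In practice, oversampling is usually applied to the training split only, so that duplicated minority samples do not leak into the data used for evaluation.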

Keywords

Disease diagnosis, Hypothyroidism, Data Mining, Logistic Regression, K Nearest Neighbor, Support Vector Machines

There are 38 citations in total.

Details

Primary Language	Turkish
Subjects	Computer Software
Journal Section	Articles
Authors	Göksu Akgül, Ali Akın Çelik, Zeliha Ergül Aydın, Zehra Kamışlı Öztürk
Publication Date	July 31, 2020
Submission Date	March 28, 2020
Published in Issue	Year 2020 Volume: 13 Issue: 3

Cite

APA	Akgül, G., Çelik, A. A., Ergül Aydın, Z., & Kamışlı Öztürk, Z. (2020). Hipotiroidi Hastalığı Teşhisinde Sınıflandırma Algoritmalarının Kullanımı. Bilişim Teknolojileri Dergisi, 13(3), 255-268. https://doi.org/10.17671/gazibtd.710728