Gastrointestinal (GI) diseases are a major health problem affecting the human digestive system. Many studies have therefore explored the automatic classification of GI diseases to reduce the burden on clinicians and to improve patient outcomes in both diagnosis and treatment. Deep learning approaches based on convolutional neural networks (CNNs) and Vision Transformers (ViTs) have become a popular research area for the automatic detection of diseases from medical images. This study evaluated the classification performance of thirteen CNN models and two ViT architectures on endoscopic images, and examined the impact of transfer learning parameters on classification performance. The two ViT models achieved classification accuracies of 91.25% and 90.50%. In contrast, the DenseNet201 architecture with optimized transfer learning parameters achieved an accuracy of 93.13%, recall of 93.17%, precision of 93.13%, and an F1 score of 93.11%, the best result among all models evaluated. These results indicate that a well-optimized CNN model achieved better classification performance than the ViT models on this task.
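For readers unfamiliar with the four reported metrics, the following is a minimal sketch of how accuracy, precision, recall, and F1 score can be computed from a multi-class confusion matrix. It assumes macro averaging (an unweighted mean over classes); the abstract does not state which averaging scheme the study used, and the function name `macro_metrics` and the toy matrix are illustrative only, not taken from the study.

```python
def macro_metrics(confusion):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j. Macro averaging is an assumption
    here; the study's averaging scheme is not stated in the abstract.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Hypothetical two-class confusion matrix, not data from the study.
    toy = [[8, 2],
           [1, 9]]
    for name, value in macro_metrics(toy).items():
        print(f"{name}: {value:.4f}")
```

Under macro averaging, recall can differ slightly from accuracy when the classes do not contain the same number of test images, whereas support-weighted recall always equals accuracy.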
Medical Image Classification, Convolutional Neural Networks, Vision Transformers, Fine Tuning, Transfer Learning, Gastrointestinal Diseases
Primary Language | English |
---|---|
Subjects | Computer Software, Software Engineering (Other) |
Journal Section | Bilgisayar Mühendisliği / Computer Engineering |
Authors | |
Early Pub Date | August 27, 2024 |
Publication Date | September 1, 2024 |
Submission Date | June 15, 2024 |
Acceptance Date | July 21, 2024 |
Published in Issue | Year 2024 Volume: 14 Issue: 3 |