A New Feature Extraction Method for Image Classification: ADD-TDA Perceptual Hash Function Based Convolutional Neural Network (ADD-TDA-ESA)

Fatih Özyurt; Engin Avcı

Research Article

Year 2019, Volume: 12 Issue: 1, 30 - 38, 01.06.2019

Abstract

This study proposes a method based on a Convolutional Neural Network (CNN) that reduces image classification time while keeping classification performance at an acceptable level. The hybrid model is named ADD-TDA-ESA, where ADD, TDA, and ESA are the Turkish abbreviations for the Discrete Wavelet Transform (DWT), Singular Value Decomposition (SVD), and CNN. In this model, a perceptual hash function based on DWT and SVD is used together with the CNN to reduce classification time. The most important property of perceptual hash functions is that they capture the salient features of images. In this method, the ADD-TDA perceptual hash function is first applied to obtain the salient features of the images. The resulting 32x32 images, composed of these salient features, are then fed to the CNN for feature extraction, and the extracted features are passed to a Support Vector Machine (SVM) for classification. The ADD-TDA-ESA method was applied to images from the Caltech-101 database. Experimental results show that the proposed method achieves an accuracy of 95.8%. In addition, the running time fell from 241.21 seconds with the classical method to 83.08 seconds with the proposed one. These results indicate that ADD-TDA-ESA runs considerably faster than the classical CNN (roughly 2.9 times, by the reported timings) while keeping image classification accuracy high.
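The abstract does not give the details of the perceptual hash stage, so the following is only a minimal sketch of one way a DWT-SVD reduction to a 32x32 image could look. Everything specific here is an assumption made for illustration: the Haar wavelet, the single decomposition level, keeping only the low-frequency (LL) subband, the 64x64 input size, the rank of 8, and the function names `haar_ll` and `dwt_svd_summary`. The CNN feature extraction and SVM classification stages are not shown.

```python
import numpy as np


def haar_ll(image: np.ndarray) -> np.ndarray:
    """One-level 2-D Haar DWT, returning only the low-frequency (LL) subband.

    For the orthonormal Haar wavelet, the LL coefficient of each 2x2 block
    is the sum of its four pixels divided by 2.
    """
    rows, cols = image.shape
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    # Group pixels into 2x2 blocks: blocks[i, a, j, b] == image[2*i + a, 2*j + b]
    blocks = image.reshape(rows // 2, 2, cols // 2, 2)
    return blocks.sum(axis=(1, 3)) / 2.0


def dwt_svd_summary(image: np.ndarray, rank: int = 8) -> np.ndarray:
    """Halve the image with a Haar DWT, then keep the top singular components.

    The rank-limited reconstruction retains the dominant structure of the
    LL subband and discards fine detail (an assumed stand-in for "salient
    features"; the paper's actual hash may differ).
    """
    ll = haar_ll(image.astype(np.float64))
    u, s, vt = np.linalg.svd(ll, full_matrices=False)
    k = min(rank, s.size)
    return (u[:, :k] * s[:k]) @ vt[:k, :]


# Hypothetical 64x64 grayscale input; a real pipeline would load and resize
# a Caltech-101 image instead of using random pixels.
rng = np.random.default_rng(0)
demo_image = rng.random((64, 64))
summary = dwt_svd_summary(demo_image, rank=8)
print(summary.shape)
```

Under these assumptions, a 64x64 input yields a 32x32 array, which matches the input size the abstract reports for the CNN. A smaller input means fewer operations per convolutional layer, which is consistent with the reported drop in running time.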

Keywords

Deep Learning, Convolutional Neural Networks, Classification, Perceptual Hash Function, Discrete Wavelet Transform, Singular Value Decomposition

References

[1]. Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International journal of computer vision, 60(2), 91-110.
[2]. Jain, A. K., Ratha, N. K., & Lakshmanan, S. (1997). Object detection using Gabor filters. Pattern recognition, 30(2), 295-309.
[3]. Ojala, T., Pietikainen, M., & Maenpaa, T. (2002). Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on pattern analysis and machine intelligence, 24(7), 971-987.
[4]. J. Virmani, V. Kumar, N. Kalra, N. Khandelwal, SVM-based characterization of liver ultrasound images using wavelet packet texture descriptors, J. Digit. Imaging 26 (3) (2012) 530–543.
[5]. N.K. Jitendra Virmani, Vinod Kumar Naveen Kalra, Prediction of liver cirrhosis based on multiresolution texture descriptors from B-mode ultrasound, Int. J. Converg. Comput. 1 (2013) 1–19.
[6]. U.R. Acharya, H. Fujita, S. Bhat, U. Raghavendra, A. Gudigar, F. Molinari, A. Vijayananthan, K. Hoong Ng, Decision support system for fatty liver disease using GIST descriptors extracted from ultrasound images, Inf. Fusion 29 (2016) 32–39.
[7]. Hinton, G. E., Osindero, S., & Teh, Y. W. (2006). A fast learning algorithm for deep belief nets. Neural computation, 18(7), 1527-1554.
[8] Hinton, G. E., & Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786), 504-507.
[8]. LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444.
[9]. Bengio, Y. (2009). Learning deep architectures for AI. Foundations and trends® in Machine Learning, 2(1), 1-127.
[10]. Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural networks, 61, 85-117.
[11]. Guo, Y., Liu, Y., Oerlemans, A., Lao, S., Wu, S., & Lew, M. S. (2016). Deep learning for visual understanding: A review. Neurocomputing, 187, 27-48.
[12]. Fukushima, K., & Miyake, S. (1982). Neocognitron: A self-organizing neural network model for a mechanism of visual pattern recognition. In Competition and cooperation in neural nets (pp. 267-285). Springer Berlin Heidelberg.
A. Ng, Sparse autoencoder, CS294A Lecture Notes, vol. 72, 2011
[13]. Salakhutdinov, R., & Hinton, G. (2009, April). Deep boltzmann machines. In Artificial Intelligence and Statistics (pp. 448-455).
[14]. Sutskever, I., Hinton, G. E., & Taylor, G. W. (2009). The recurrent temporal restricted boltzmann machine. In Advances in Neural Information Processing Systems (pp. 1601-1608).
[15]. Bengio, Y., Courville, A., & Vincent, P. (2013). Representation learning: A review and new perspectives. IEEE transactions on pattern analysis and machine intelligence, 35(8), 1798-1828.
[16]. LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.
[17]. LeCun, Y., Huang, F. J., & Bottou, L. (2004). Learning methods for generic object recognition with invariance to pose and lighting. In Computer Vision and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on (Vol. 2, pp. II-104). IEEE.
[18]. Mallat, S. (2012). Group invariant scattering. Communications on Pure and Applied Mathematics, 65(10), 1331-1398.
[19]. Bruna, J., & Mallat, S. (2013). Invariant scattering convolution networks. IEEE transactions on pattern analysis and machine intelligence, 35(8), 1872-1886.
[20]. Zeng, R., Wu, J., Senhadji, L., & Shu, H. (2015, April). Tensor object classification via multilinear discriminant analysis network. In Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on (pp. 1971-1975). IEEE.
[21]. Li, S. Z., Yu, B., Wu, W., Su, S. Z., & Ji, R. R. (2015). Feature learning based on SAE–PCA network for human gesture recognition in RGBD images. Neurocomputing, 151, 565-573.
[22]. Chan, T. H., Jia, K., Gao, S., Lu, J., Zeng, Z., & Ma, Y. (2015). PCANet: A simple deep learning baseline for image classification?. IEEE Transactions on Image Processing, 24(12), 5017-5032.
[23]. Feng, Z., Jin, L., Tao, D., & Huang, S. (2015). DLANet: A manifold-learning-based discriminative feature learning network for scene classification. Neurocomputing, 157, 11-21.
[24]. Gan, Y., Yang, T., & He, C. (2014, October). A deep graph embedding network model for face recognition. In Signal Processing (ICSP), 2014 12th International Conference on (pp. 1268-1271). IEEE.
[25]. Qin, H., Li, X., Liang, J., Peng, Y., & Zhang, C. (2016). DeepFish: Accurate underwater live fish recognition with a deep architecture. Neurocomputing, 187, 49-58.
[26]. Lei, Z., Yi, D., & Li, S. Z. (2016). Learning stacked image descriptor for face recognition. IEEE Transactions on Circuits and Systems for Video Technology, 26(9), 1685-1696.
[27]. Zhao, Y., Wang, R., Wang, W., & Gao, W. (2016). Multilevel modified finite radon transform network for image upsampling. IEEE Transactions on Circuits and Systems for Video Technology, 26(12), 2189-2199.
[28]. Zeng, R., Wu, J., Shao, Z., Chen, Y., Chen, B., Senhadji, L., & Shu, H. (2016). Color image classification via quaternion principal component analysis network. Neurocomputing, 216, 416-428.
[29]. Tang, Z., Zhang, X., Dai, X., Yang, J., & Wu, T. (2013). Robust image hash function using local color features. AEU-International Journal of Electronics and Communications, 67(8), 717-722.
[30]. Qin, C., Chang, C. C., & Tsou, P. L. (2013). Robust image hashing using non-uniform sampling in discrete Fourier domain. Digital Signal Processing, 23(2), 578-585.
[31]. Tang, Z., Zhang, X., & Zhang, S. (2014). Robust perceptual image hashing based on ring partition and NMF. IEEE Transactions on knowledge and data engineering, 26(3), 711-724.
[32]. Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems (pp. 1097-1105).
[33]. Basheer, I. A., & Hajmeer, M. (2000). Artificial neural networks: fundamentals, computing, design, and application. Journal of microbiological methods, 43(1), 3-31.
[34]. Gao, X., Li, W., Loomes, M., & Wang, L. (2017). A fused deep learning architecture for viewpoint classification of echocardiography. Information Fusion, 36, 103-113.
[35]. Fei-Fei, L., Fergus, R., & Perona, P. (2007). Learning generative visual models from few training examples: An incremental bayesian approach tested on 101 object categories. Computer vision and Image understanding, 106(1), 59-70.

There are 36 citations in total.

Details

Primary Language	Turkish
Subjects	Engineering
Journal Section	Articles (Research)
Authors	Fatih Özyurt (ORCID: 0000-0002-8154-6691), Engin Avcı
Publication Date	June 1, 2019
Published in Issue	Year 2019 Volume: 12 Issue: 1

Cite

APA	Özyurt, F., & Avcı, E. (2019). İmge Sınıflandırması için Yeni Öznitelik Çıkarım Yöntemi: Add-Tda Algısal Özet Fonksiyonu Tabanlı Evrişimsel Sinir Ağ (Add-Tda-Esa). Türkiye Bilişim Vakfı Bilgisayar Bilimleri Ve Mühendisliği Dergisi, 12(1), 30-38.
AMA	Özyurt F, Avcı E. İmge Sınıflandırması için Yeni Öznitelik Çıkarım Yöntemi: Add-Tda Algısal Özet Fonksiyonu Tabanlı Evrişimsel Sinir Ağ (Add-Tda-Esa). TBV-BBMD. June 2019;12(1):30-38.
Chicago	Özyurt, Fatih, and Engin Avcı. “İmge Sınıflandırması için Yeni Öznitelik Çıkarım Yöntemi: Add-Tda Algısal Özet Fonksiyonu Tabanlı Evrişimsel Sinir Ağ (Add-Tda-Esa)”. Türkiye Bilişim Vakfı Bilgisayar Bilimleri Ve Mühendisliği Dergisi 12, no. 1 (June 2019): 30-38.
EndNote	Özyurt F, Avcı E (June 1, 2019) İmge Sınıflandırması için Yeni Öznitelik Çıkarım Yöntemi: Add-Tda Algısal Özet Fonksiyonu Tabanlı Evrişimsel Sinir Ağ (Add-Tda-Esa). Türkiye Bilişim Vakfı Bilgisayar Bilimleri ve Mühendisliği Dergisi 12 1 30–38.
IEEE	F. Özyurt and E. Avcı, “İmge Sınıflandırması için Yeni Öznitelik Çıkarım Yöntemi: Add-Tda Algısal Özet Fonksiyonu Tabanlı Evrişimsel Sinir Ağ (Add-Tda-Esa)”, TBV-BBMD, vol. 12, no. 1, pp. 30–38, 2019.
ISNAD	Özyurt, Fatih - Avcı, Engin. “İmge Sınıflandırması için Yeni Öznitelik Çıkarım Yöntemi: Add-Tda Algısal Özet Fonksiyonu Tabanlı Evrişimsel Sinir Ağ (Add-Tda-Esa)”. Türkiye Bilişim Vakfı Bilgisayar Bilimleri ve Mühendisliği Dergisi 12/1 (June 2019), 30-38.
JAMA	Özyurt F, Avcı E. İmge Sınıflandırması için Yeni Öznitelik Çıkarım Yöntemi: Add-Tda Algısal Özet Fonksiyonu Tabanlı Evrişimsel Sinir Ağ (Add-Tda-Esa). TBV-BBMD. 2019;12:30–38.
MLA	Özyurt, Fatih and Engin Avcı. “İmge Sınıflandırması için Yeni Öznitelik Çıkarım Yöntemi: Add-Tda Algısal Özet Fonksiyonu Tabanlı Evrişimsel Sinir Ağ (Add-Tda-Esa)”. Türkiye Bilişim Vakfı Bilgisayar Bilimleri Ve Mühendisliği Dergisi, vol. 12, no. 1, 2019, pp. 30-38.
Vancouver	Özyurt F, Avcı E. İmge Sınıflandırması için Yeni Öznitelik Çıkarım Yöntemi: Add-Tda Algısal Özet Fonksiyonu Tabanlı Evrişimsel Sinir Ağ (Add-Tda-Esa). TBV-BBMD. 2019;12(1):30-8.

Article Acceptance

The acceptance process for articles submitted to the journal consists of the following stages:

1. Each submitted article is first sent to at least two referees.

2. Referees are appointed by the journal editors. The journal's referee pool contains approximately 200 referees, classified by area of interest, and each referee is sent articles in his or her area. Referees are selected so that no conflict of interest arises.

3. Author names are concealed in the articles sent to the referees.

4. Referees are told how to evaluate an article and are asked to fill in the evaluation form shown below.

5. Articles that receive a positive opinion from two referees undergo a similarity check by the editors. The similarity rate is expected to be below 25%.

6. A paper that has passed all stages is reviewed by the editor for language and presentation, and the necessary corrections and improvements are made. If necessary, the authors are notified.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.