A New Feature Extraction Method for Image Classification: A Convolutional Neural Network Based on the DWT-SVD Perceptual Hash Function (DWT-SVD-CNN)
Year 2019,
Volume: 12 Issue: 1, 30 - 38, 01.06.2019
Fatih Özyurt, Engin Avcı
Abstract
This study proposes a method, built on a Convolutional Neural Network (CNN),
that reduces image classification time while keeping classification
performance at an acceptable level. The hybrid model, named DWT-SVD-CNN
(ADD-TDA-ESA in the original Turkish), combines a CNN with a perceptual hash
function based on the Discrete Wavelet Transform and Singular Value
Decomposition (DWT-SVD) in order to reduce classification time. The most
important property of perceptual hash functions is that they capture the
salient features of an image. In this method, the DWT-SVD perceptual hash
function is first applied to obtain the salient features of each image. The
resulting 32x32 images, composed of these salient features, are then given to
the CNN as input; the features the CNN extracts are passed to a Support Vector
Machine for classification. The DWT-SVD-CNN method was applied to images from
the Caltech-101 database. Experimental results show that the proposed method
reaches an accuracy of 95.8%. The method also reduces the running time from
241.21 seconds for the classical method to 83.08 seconds. These results
indicate that DWT-SVD-CNN keeps image classification accuracy high while
running roughly 2.9 times faster than a classical CNN in this experiment.
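The abstract does not spell out how the DWT-SVD perceptual hash turns an image into a 32x32 input, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a square grayscale image whose side is 32 times a power of two, uses a hand-written one-level Haar transform as the wavelet, and keeps the 8 largest singular values. The function names `haar_dwt2_ll`, `svd_truncate`, and `dwt_svd_reduce`, the Haar choice, and the rank are all hypothetical. The CNN feature extractor and the Support Vector Machine that follow in the pipeline are left out.

```python
import numpy as np


def haar_dwt2_ll(image: np.ndarray) -> np.ndarray:
    """Return the approximation (LL) sub-band of a one-level 2-D Haar DWT.

    With the orthonormal Haar filter, each LL coefficient is the sum of a
    2x2 block divided by 2. Height and width must be even.
    """
    h, w = image.shape
    if h % 2 or w % 2:
        raise ValueError("image height and width must be even")
    # Split the image into non-overlapping 2x2 blocks and sum each block.
    blocks = image.astype(np.float64).reshape(h // 2, 2, w // 2, 2)
    return blocks.sum(axis=(1, 3)) / 2.0


def svd_truncate(matrix: np.ndarray, rank: int) -> np.ndarray:
    """Rebuild `matrix` from its `rank` largest singular values."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


def dwt_svd_reduce(image: np.ndarray, size: int = 32, rank: int = 8) -> np.ndarray:
    """Shrink a square grayscale image to size x size (assumed procedure).

    Repeated Haar LL steps halve the image until it reaches `size`; a
    rank-`rank` SVD approximation then keeps only the dominant structure.
    """
    out = image.astype(np.float64)
    while out.shape[0] > size:
        out = haar_dwt2_ll(out)
    if out.shape != (size, size):
        raise ValueError("image must be square with side size * 2**k")
    return svd_truncate(out, rank)


# Random stand-in for a grayscale image; real data would come from Caltech-101.
rng = np.random.default_rng(0)
image = rng.random((128, 128))
reduced = dwt_svd_reduce(image)  # 128 -> 64 -> 32, then rank-8 approximation
print(reduced.shape)
```

Under these assumptions, a 128x128 input reaches the CNN as 1,024 values instead of 16,384. That reduction in input size is the kind of saving the abstract credits for the shorter running time.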