Research Article

Classification of Animals with Different Deep Learning Models

Year 2018, Volume: 7, Issue: 1, 9-16, 20.04.2018

Abstract

The purpose of this study is to classify 14 different animals using different deep learning models. Deep learning, a subfield of artificial intelligence, has seen widespread use in recent years, particularly in advanced image processing, speech recognition, and natural language processing. One of the main reasons for its broad adoption in image analysis is that it performs feature extraction on the image itself and yields highly accurate results, learning by building representations of each image at multiple levels. Unlike other machine learning methods, no expert is needed to extract features from the images. The Convolutional Neural Network (CNN), the basic architecture of deep learning models, consists of several layer types: the convolution layer, the ReLU layer, the pooling layer, and the fully connected layer. Deep learning models are designed using different numbers of these layers. In this study, the AlexNet and VggNet models were used to classify 14 animals: horse, camel, cow, goat, sheep, wolf, dog, cat, deer, pig, bear, leopard, elephant, and kangaroo. The animals most likely to be encountered while driving on a road were selected, since this work is intended as a preliminary study for the control of autonomous vehicle driving. Color (RGB) images of the animals were collected from the internet; to increase data diversity, images were also taken from existing data sets. A total of 150 images were collected for each animal, split into 125 training and 25 test images. Two data sets were created, with image dimensions of 224x224 and 227x227 respectively. As a result of the study, the animals were classified with 91.2% accuracy by VggNet and 67.65% by AlexNet. The higher error rate of AlexNet is due to the small number of layers in the network and its large parameter values. For example, the convolution layer in the AlexNet architecture uses an 11x11 filter with a stride of 4, which causes information loss when passing data to the next layer. In contrast, VggNet uses a 3x3 filter with a stride of 1, so no such loss occurs in the transfer to the next layer.
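The effect of the filter size and stride values discussed above can be checked with the standard convolution output-size formula. The sketch below (plain Python, no deep learning framework assumed; the function name `conv_out` is illustrative) computes the spatial size of the first convolution layer's output for both architectures, using the input sizes from the abstract (227x227 for AlexNet, 224x224 for VggNet) and, as an assumption, padding 1 for VggNet's 3x3 convolutions.

```python
import math

def conv_out(size, kernel, stride, padding=0):
    """Spatial output size of a convolution: floor((n - k + 2p) / s) + 1."""
    return math.floor((size - kernel + 2 * padding) / stride) + 1

# AlexNet's first convolution: 227x227 input, 11x11 filter, stride 4
alexnet_out = conv_out(227, kernel=11, stride=4)            # -> 55

# VggNet's first convolution: 224x224 input, 3x3 filter, stride 1, padding 1
vggnet_out = conv_out(224, kernel=3, stride=1, padding=1)   # -> 224

print(alexnet_out, vggnet_out)
```

The large filter and stride shrink AlexNet's 227x227 input to a 55x55 feature map after a single layer, while VggNet's 3x3/stride-1 convolution preserves the full 224x224 resolution, which is consistent with the information-loss argument made above.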

References

  • Girshick, R., Donahue, J., Darrell, T., Malik, J., 2014. Rich feature hierarchies for accurate object detection and semantic segmentation. 2014 Ieee Conference on Computer Vision and Pattern Recognition (Cvpr), 580-587.
  • Graves, A., Mohamed, A.-r., Hinton, G., 2013. Speech recognition with deep recurrent neural networks, Acoustics, speech and signal processing (icassp), 2013 ieee international conference on. IEEE, pp. 6645-6649.
  • He, K.M., Zhang, X.Y., Ren, S.Q., Sun, J., 2016. Deep Residual Learning for Image Recognition. 2016 Ieee Conference on Computer Vision and Pattern Recognition (Cvpr), 770-778.
  • Hermann, K.M., Kocisky, T., Grefenstette, E., Espeholt, L., Kay, W., Suleyman, M., Blunsom, P., 2015. Teaching machines to read and comprehend, Advances in Neural Information Processing Systems, pp. 1693-1701.
  • Heuritech, 2018. https://blog.heuritech.com/2016/02/29/a-brief-report-of-the-heuritech-deep-learning-meetup-5/.
  • Hinton, G., Deng, L., Yu, D., Dahl, G.E., Mohamed, A.-r., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T.N., 2012. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Processing Magazine 29, 82-97.
  • Jarrett, K., Kavukcuoglu, K., LeCun, Y., 2009. What is the best multi-stage architecture for object recognition?, Computer Vision, 2009 IEEE 12th International Conference on. IEEE, pp. 2146-2153.
  • Jozefowicz, R., Vinyals, O., Schuster, M., Shazeer, N., Wu, Y., 2016. Exploring the limits of language modeling. arXiv preprint arXiv:1602.02410.
  • Krizhevsky, A., Sutskever, I., Hinton, G., 2012. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems (NIPS 2012), pp. 1097-1105.
  • Lample, G., Ballesteros, M., Subramanian, S., Kawakami, K., Dyer, C., 2016. Neural architectures for named entity recognition. arXiv preprint arXiv:1603.01360.
  • Amodei, D., Ananthanarayanan, S., Anubhai, R., Bai, J., Battenberg, E., Case, C., Casper, J., Catanzaro, B., Cheng, Q., Chen, G., 2016. Deep speech 2: End-to-end speech recognition in english and mandarin, International Conference on Machine Learning, pp. 173-182.
  • Le, Q.V., 2013. Building high-level features using large scale unsupervised learning, Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, pp. 8595-8598.
  • LeCun, Y., Boser, B.E., Denker, J.S., Henderson, D., Howard, R.E., Hubbard, W.E., Jackel, L.D., 1990. Handwritten digit recognition with a back-propagation network, Advances in neural information processing systems, pp. 396-404.
  • Lecun, Y., Bottou, L., Bengio, Y., Haffner, P., 1998. Gradient-based learning applied to document recognition. Proceedings of the IEEE 86, 2278–2324.
  • LeCun, Y., Huang, F.J., Bottou, L., 2004. Learning methods for generic object recognition with invariance to pose and lighting, Computer Vision and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on. IEEE, pp. II-104.
  • Lenz, I., Lee, H., Saxena, A., 2015. Deep learning for detecting robotic grasps. The International Journal of Robotics Research 34, 705-724.
  • Levine, S., Pastor, P., Krizhevsky, A., Ibarz, J., Quillen, D., 2016. Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection. The International Journal of Robotics Research, 0278364917710318.
  • Lin, Y., Lv, F., Zhu, S., Yang, M., Cour, T., Yu, K., Cao, L., Huang, T., 2011. Large-scale image classification: fast feature extraction and svm training, Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, pp. 1689-1696.
  • Long, J., Shelhamer, E., Darrell, T., 2015. Fully convolutional networks for semantic segmentation, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431-3440.
  • Luong, M.-T., Pham, H., Manning, C.D., 2015. Effective approaches to attention-based neural machine translation. arXiv preprint arXiv:1508.04025.
  • Redmon, J., Divvala, S., Girshick, R., Farhadi, A., 2016. You only look once: Unified, real-time object detection, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779-788.
  • Bahdanau, D., Chorowski, J., Serdyuk, D., Brakel, P., Bengio, Y., 2016. End-to-end attention-based large vocabulary speech recognition, Acoustics, Speech and Signal Processing (ICASSP), 2016 IEEE International Conference on. IEEE, pp. 4945-4949.
  • Ren, S., He, K., Girshick, R., Sun, J., 2015. Faster R-CNN: Towards real-time object detection with region proposal networks, Advances in neural information processing systems, pp. 91-99.
  • Ren, S.Q., He, K.M., Girshick, R., Sun, J., 2017. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. Ieee T Pattern Anal 39, 1137-1149.
  • Rohrbach, M., Stark, M., Schiele, B., 2011. Evaluating knowledge transfer and zero-shot learning in a large-scale setting, Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, pp. 1641-1648.
  • Sánchez, J., Perronnin, F., 2011. High-dimensional signature compression for large-scale image classification, Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, pp. 1665-1672.
  • Simonyan, K., Zisserman, A., 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
  • Szegedy, C., Liu, W., Jia, Y.Q., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A., 2015. Going Deeper with Convolutions. Proc Cvpr Ieee, 1-9.
  • Zeiler, M.D., Fergus, R., 2014. Visualizing and Understanding Convolutional Networks. Computer Vision - Eccv 2014, Pt I 8689, 818-833.
  • Bengio, S., Weston, J., Grangier, D., 2010. Label embedding trees for large multi-class tasks, Advances in Neural Information Processing Systems, pp. 163-171.
  • Bengio, Y., Courville, A., Vincent, P., 2013. Representation learning: A review and new perspectives. IEEE transactions on pattern analysis and machine intelligence 35, 1798-1828.
  • Coates, A., Ng, A., Lee, H., 2011. An analysis of single-layer networks in unsupervised feature learning, Proceedings of the fourteenth international conference on artificial intelligence and statistics, pp. 215-223.
  • Deng, J., Berg, A.C., Li, K., Fei-Fei, L., 2010. What does classifying more than 10,000 image categories tell us?, European conference on computer vision. Springer, pp. 71-84.
  • Deng, J., Satheesh, S., Berg, A.C., Li, F., 2011. Fast and balanced: Efficient label tree learning for large scale object recognition, Advances in Neural Information Processing Systems, pp. 567-575.
  • Deshpande, A., 2018. https://adeshpande3.github.io/adeshpande3.github.io/The-9-Deep-Learning-Papers-You-Need-To-Know-About.html.
  • devblogs.nvidia.com, 2016. Deep Learning for Julia.
  • Girshick, R., 2015. Fast R-CNN. Ieee I Conf Comp Vis, 1440-1448.

Details

Primary Language: English
Section: Articles
Authors

Özkan İnik

Bulent Turan

Publication Date: April 20, 2018
Published in Issue: Year 2018, Volume: 7, Issue: 1

How to Cite

APA İnik, Ö., & Turan, B. (2018). Classification of Animals with Different Deep Learning Models. Journal of New Results in Science, 7(1), 9-16.
AMA İnik Ö, Turan B. Classification of Animals with Different Deep Learning Models. JNRS. April 2018;7(1):9-16.
Chicago İnik, Özkan, and Bulent Turan. “Classification of Animals With Different Deep Learning Models”. Journal of New Results in Science 7, no. 1 (April 2018): 9-16.
EndNote İnik Ö, Turan B (April 1, 2018) Classification of Animals with Different Deep Learning Models. Journal of New Results in Science 7 1 9–16.
IEEE Ö. İnik and B. Turan, “Classification of Animals with Different Deep Learning Models”, JNRS, vol. 7, no. 1, pp. 9–16, 2018.
ISNAD İnik, Özkan - Turan, Bulent. “Classification of Animals With Different Deep Learning Models”. Journal of New Results in Science 7/1 (April 2018), 9-16.
JAMA İnik Ö, Turan B. Classification of Animals with Different Deep Learning Models. JNRS. 2018;7:9–16.
MLA İnik, Özkan and Bulent Turan. “Classification of Animals With Different Deep Learning Models”. Journal of New Results in Science, vol. 7, no. 1, 2018, pp. 9-16.
Vancouver İnik Ö, Turan B. Classification of Animals with Different Deep Learning Models. JNRS. 2018;7(1):9-16.


Indexed in: TR Dizin, EBSCO, Electronic Journals Library (EZB), DOAJ, WorldCat, Academindex, SOBİAD, Scilit

As of 2021, JNRS is licensed under a Creative Commons Attribution-NonCommercial 4.0 International Licence (CC BY-NC).