Investigation of the Optimal Number of Neurons in One-Hidden-Layer Artificial Neural Networks
Year 2022, Volume: 17, Issue: 2, 303–325, 25.11.2022
Tayfun Ünal, Ünver Çiftçi, Nurkut Nuray Urgan
Abstract
In this paper, the optimal number of neurons in one-hidden-layer artificial neural networks is investigated through theoretical and statistical analysis. Determining the optimal number of neurons requires finding the global minimum of the cost function; however, since training an artificial neural network is a non-convex problem, optimization algorithms have difficulty locating a global minimum. In this study, an augmented cost function is proposed to find the global minimum and, with it, the optimal number of neurons. It is shown that the network model that attains the global minimum of the augmented cost function also yields the optimal number of neurons. The augmented cost function is tested on the XOR and circle datasets, achieving 99% accuracy on the XOR dataset and 97% on the circle dataset, and the optimal number of neurons is determined for each dataset.
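The abstract does not give the exact form of the augmented cost function, so the sketch below only illustrates the general idea under stated assumptions: an oversized one-hidden-layer network is trained on XOR with a cost equal to the squared error plus a penalty term (here assumed to be an L1 penalty on each hidden neuron's output weight), and neurons whose output weights the penalty drives to zero are discarded, leaving an estimate of the optimal hidden size. The width `H`, penalty strength `lam`, learning rate `lr`, and pruning threshold are illustrative choices, not values from the paper.

```python
# Minimal sketch, not the authors' exact method: train an oversized
# one-hidden-layer network on XOR with an augmented cost (squared error
# plus an assumed L1 penalty on the output weights), then count the
# hidden neurons that survive the penalty.
import numpy as np

rng = np.random.default_rng(0)

# XOR dataset: 4 points in {0,1}^2 labeled x1 XOR x2
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

H, lam, lr = 8, 1e-3, 0.5   # oversized width, penalty strength, step size (assumed)
W1 = rng.normal(0.0, 1.0, (2, H)); b1 = np.zeros(H)
W2 = rng.normal(0.0, 1.0, (H, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(20000):
    A1 = sigmoid(X @ W1 + b1)        # hidden activations, shape (4, H)
    Yhat = sigmoid(A1 @ W2 + b2)     # network output, shape (4, 1)
    # augmented cost: mean((Yhat - y)^2) + lam * sum(|W2|)
    dY = 2.0 * (Yhat - y) * Yhat * (1.0 - Yhat) / len(X)
    dW2 = A1.T @ dY + lam * np.sign(W2)   # penalty gradient enters here
    dZ1 = (dY @ W2.T) * A1 * (1.0 - A1)
    W2 -= lr * dW2;         b2 -= lr * dY.sum(axis=0)
    W1 -= lr * (X.T @ dZ1); b1 -= lr * dZ1.sum(axis=0)

# Neurons whose output weight collapsed to ~0 contribute nothing;
# the survivors give an estimate of the optimal number of neurons.
active = np.abs(W2).ravel() > 1e-2
pred = (sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2) > 0.5).astype(int).ravel()
print("predictions:", pred)                      # expect [0 1 1 0]
print("estimated optimal neuron count:", int(active.sum()))
```

On these four points the penalty typically leaves only two or three active neurons, consistent with the classical observation that XOR is representable with two hidden units in a one-hidden-layer network.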
References
- I. Goodfellow, Y. Bengio and A. Courville, Deep Learning. MIT Press, Cambridge, 2016.
- M. Şahan and Y. Okur, “Akdeniz bölgesine ait meteorolojik veriler kullanılarak yapay sinir ağları yardımıyla güneş enerjisinin tahmini,” Süleyman Demirel Üniversitesi Fen Edeb. Fakültesi Fen Derg., 11 (1), 61–71, 2016.
- M. Şahan, “Yapay sinir ağları ve angström-prescott denklemleri kullanılarak Gaziantep, Antakya ve Kahramanmaraş için global güneş radyasyonu tahmini,” Süleyman Demirel Üniversitesi Fen Edeb. Fakültesi Fen Derg., 16 (2), 368–384, 2021.
- A. Zhang, Z. C. Lipton, M. Li and A. J. Smola, “Dive into deep learning,” arXiv, 2020.
- C. F. Higham and D. J. Higham, “Deep learning: an introduction for applied mathematicians,” SIAM Rev., 61 (4), 860–891, 2019.
- R. Vidal, J. Bruna, R. Giryes and S. Soatto, “Mathematics of deep learning,” arXiv, 2017.
- K. P. Murphy, Probabilistic Machine Learning: An Introduction. MIT Press, Cambridge, 2022.
- C. M. Bishop, Pattern Recognition and Machine Learning. Springer, New York, 2006.
- F. Chollet, Deep Learning with Python. 2nd Ed., Manning Publications, 2021.
- A. Géron, Hands-On Machine Learning with Scikit-Learn and TensorFlow. O'Reilly, USA, 2017.
- B. D. Haeffele and R. Vidal, “Global optimality in neural network training,” CVPR, 2 (3), 4390–4398, 2017.
- R. Setiono, “A penalty-function approach for pruning feedforward neural networks,” Neural Comput., 9 (1), 185–204, 1997.
- Z. Zhang and J. Qiao, “A node pruning algorithm for feedforward neural network based on neural complexity,” Proc. 2010 Int. Conf. Intell. Control Inf. Process., 1, 406–410, 2010.
- M. M. Bejani and M. Ghatee, “A systematic review on overfitting control in shallow and deep neural networks,” Artif. Intell. Rev., 54 (8), 6391–6438, 2021.
- X. Wu, P. Rózycki and B. M. Wilamowski, “A hybrid constructive algorithm for single-layer feedforward networks learning,” IEEE Trans. Neural Netw. Learn. Syst., 26, 1659–1668, 2015.
- J. Qiao, F. Li, H. Han and W. Li, “Constructive algorithm for fully connected cascade feedforward neural networks,” Neurocomputing, 182, 154–164, 2016.
- Y. Bengio, N. L. Roux, P. Vincent, O. Delalleau and P. Marcotte, “Convex neural networks,” Adv. Neural Inf. Process. Syst., 123–130, 2005.
- C. L. P. Chen and Z. Liu, “Broad learning system: an effective and efficient incremental learning system without the need for deep architecture,” IEEE Trans. Neural Netw. Learn. Syst., 29 (1), 10–24, 2018.
- W. J. Puma-Villanueva, E. P. dos Santos and F. J. Von Zuben, “A constructive algorithm to synthesize arbitrarily connected feedforward neural networks,” Neurocomputing, 75 (1), 14–32, 2012.
- J. L. Subirats, L. Franco and J. M. Jerez, “C-Mantec: a novel constructive neural network algorithm incorporating competition between neurons,” Neural Netw., 26, 130–140, 2012.
- G. M. Augasta and T. Kathirvalavakumar, “A novel pruning algorithm for optimizing feedforward neural network of classification problems,” Neural Process. Lett., 34, 241–258, 2011.
- P. Molchanov, A. Mallya, S. Tyree, I. Frosio and J. Kautz, “Importance estimation for neural network pruning,” CVPR, 11256–11264, 2019.
- G. Castellano, A. M. Fanelli and M. Pelillo, “An iterative pruning algorithm for feedforward neural networks,” IEEE Trans. Neural Netw., 8, 519–531, 1997.
- Q. Chang, J. Wang, H. Zhang, L. Shi, J. Wang and N. R. Pal, “Structure optimization of neural networks with l1 regularization on gates,” Comput. Intell., 196–203, 2019.
- J. F. Qiao, Y. Zhang and H. G. Han, “Fast unit pruning algorithm for feedforward neural network design,” Appl. Math. Comput., 205, 622–627, 2008.
- H. Z. Alemu, W. Wu and J. Zhao, “Feedforward neural networks with a hidden layer regularization method,” Symmetry (Basel), 10, 2018.
- A. Bondarenko, A. Borisov and L. Aleksejeva, “Neurons vs weights pruning in artificial neural networks,” Vide. Tehnol. Resur. - Environ. Technol. Resour., 3, 22–28, 2015.
- R. Reed, “Pruning algorithms - a survey,” IEEE Trans. Neural Netw., 4 (5), 740–747, 1993.
- B. Hassibi, D. G. Stork and G. J. Wolff, “Optimal brain surgeon and general network pruning,” IEEE International Conference on Neural Networks, 1, 293–299, 1993.
- X. Xie, H. Zhang, J. Wang, Q. Chang, J. Wang and N. R. Pal, “Learning optimized structure of neural networks by hidden node pruning with l1 regularization,” IEEE Trans. Cybern., 50 (3), 1333–1346, 2020.
- O. Aran, O. T. Yildiz and E. Alpaydin, “An incremental framework based on cross-validation for estimating the architecture of a multilayer perceptron,” Int. J. Pattern Recognit. Artif. Intell., 23 (2), 159–190, 2009.
- H. G. Han, S. Zhang and J. F. Qiao, “An adaptive growing and pruning algorithm for designing recurrent neural network,” Neurocomputing, 242, 51–62, 2017.
- R. Zemouri, N. Omri, F. Fnaiech, N. Zerhouni and N. Fnaiech, “A new growing pruning deep learning neural network algorithm (GP-DLNN),” Neural Comput. Appl., 32, 18143–18159, 2020.
- A. Gordon, E. Eban, O. Nachum, B. Chen, H. Wu, T.-J. Yang and E. Choi, “MorphNet: fast & simple resource-constrained structure learning of deep networks,” CVPR, 1586–1595, 2018.
- K. Kawaguchi and L. P. Kaelbling, “Elimination of all bad local minima in deep learning,” arXiv, 2019.
- M. Anthony and P. L. Bartlett, Neural Network Learning: Theoretical Foundations. Cambridge University Press, Cambridge, 2009.
- R. Reed and R. J. Marks II, Neural Smithing: Supervised Learning in Feedforward Artificial Neural Networks. MIT Press, Cambridge, 2016.
- E. Alpaydın, Introduction to Machine Learning. 3rd Ed., MIT Press, Cambridge, 2014.