Problems in analyzing data sets with an undersized sample arise mainly from the singular covariance structure. As a solution, non-singular Hybrid Covariance Estimators (HCEs) have been proposed in the literature. Multivariate statistical techniques that use HCEs continue to be developed, and one of these is the Hybrid Regression Model (HRM). Because HCEs remove the rank deficiency of the covariance matrix, HRM analysis can estimate a regression coefficient for every variable. However, determining the best predictors in the regression model remains one of the biggest problems for researchers, since the number of variables grows and knowledge about the model is insufficient. Numerical optimization techniques and strategies are therefore required to explore such a large solution space, in which the number of candidate predictor subsets can reach millions. In this paper, we introduce a new, alternative approach to variable selection for undersized sample data that uses the Genetic Algorithm (GA) with the Information Complexity Criterion (ICOMP) as the fitness function in HRM analysis. To demonstrate the ability of the proposed method, we carried out a Monte Carlo simulation study with correlated, undersized data sets and compared our method with Elastic Net (EN) modeling. According to the results, the proposed method can be recommended as an alternative approach to variable selection in undersized sample data.
Keywords: Genetic Algorithm, Hybrid Regression Model, Information Complexity, Variable Selection, Undersized Sample Problem
Primary Language | English |
---|---|
Journal Section | TJST |
Authors | |
Publication Date | March 3, 2020 |
Submission Date | December 11, 2019 |
Published in Issue | Year 2020 Volume: 15 Issue: 1 |