Abstract:
In this era of big data and computational innovation, advanced
analysis and modelling strategies are essential in data science for
understanding the individual activities that occur within very complex
behavioral, socio-economic and ecological systems. However, the scales
at which models can be developed, and the subsequent problems they
can inform, are often limited by the difficulty of effectively
understanding data that mimic interactions at the finest spatial, temporal,
or organizational resolutions. Linear regression analysis is one
of the most widely used methods for investigating relationships between
variables. Multicollinearity is one of the major problems in regression
analysis. Its effects can be mitigated by using appropriate regularized
regression methods. This study aims to measure the robustness
of regularized regression models, such as ridge and Lasso type models,
designed for high-dimensional data with multicollinearity
problems. Empirical results show that the Lasso and ridge models have lower
residual sum of squares values. The findings also demonstrate improved
accuracy of the estimated parameters for the best model.
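
As an illustration of how regularization acts under multicollinearity, the following is a minimal sketch, not the study's actual procedure or data. It assumes a synthetic two-predictor dataset with nearly identical predictors, no intercept, and arbitrarily chosen penalty values; the helper names (`ridge_fit`, `lasso_fit`, `rss`) are hypothetical and use only the Python standard library.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rss(x1, x2, y, b):
    """Residual sum of squares for the model y ~ b[0]*x1 + b[1]*x2."""
    return sum((yi - b[0] * a - b[1] * c) ** 2 for a, c, yi in zip(x1, x2, y))

def ridge_fit(x1, x2, y, lam):
    """Closed-form ridge estimate: solve (X'X + lam*I) b = X'y.

    With lam = 0 this reduces to ordinary least squares (OLS).
    """
    a11 = dot(x1, x1) + lam
    a22 = dot(x2, x2) + lam
    a12 = dot(x1, x2)
    g1, g2 = dot(x1, y), dot(x2, y)
    det = a11 * a22 - a12 * a12
    return [(a22 * g1 - a12 * g2) / det, (a11 * g2 - a12 * g1) / det]

def soft_threshold(z, t):
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_fit(x1, x2, y, lam, sweeps=2000):
    """Cyclic coordinate descent for (1/(2n))*RSS + lam*(|b1| + |b2|)."""
    n = len(y)
    cols = [x1, x2]
    b = [0.0, 0.0]
    for _ in range(sweeps):
        for j in (0, 1):
            k = 1 - j
            # Partial residual: remove the other predictor's contribution.
            partial = [yi - b[k] * xk for yi, xk in zip(y, cols[k])]
            rho = dot(cols[j], partial) / n
            b[j] = soft_threshold(rho, lam) / (dot(cols[j], cols[j]) / n)
    return b

# Assumed synthetic data: x2 is x1 plus a little noise, so the two
# predictors are almost perfectly correlated.
random.seed(0)
n = 50
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [a + 0.1 * random.gauss(0, 1) for a in x1]
y = [a + c + random.gauss(0, 1) for a, c in zip(x1, x2)]

# Penalty values are illustrative; the ridge penalty is on the scale of
# sums of squares, the Lasso penalty on the scale of means.
b_ols = ridge_fit(x1, x2, y, lam=0.0)
b_ridge = ridge_fit(x1, x2, y, lam=5.0)
b_lasso = lasso_fit(x1, x2, y, lam=0.1)

for name, b in (("OLS", b_ols), ("ridge", b_ridge), ("Lasso", b_lasso)):
    print(name, [round(v, 3) for v in b], "RSS:", round(rss(x1, x2, y, b), 3))
```

In this sketch, both penalized fits shrink the coefficient vector relative to OLS, which is how they stabilize estimates when predictors are nearly collinear. On the fitting sample itself, OLS attains the smallest RSS by construction, so the sketch compares coefficient size rather than attempting to reproduce the study's RSS results.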