I have 10 independent variables. How do I identify the important variables for my linear regression model?
Question
Is the p-value the only way to identify the important variables?
Machine Learning
Answers ( 3 )
The p-value is usually the first thing to check when judging whether a variable is significant, but it is not the only way. You can also check each variable's correlation with the target variable, or add features one at a time and watch whether the adjusted R² increases or decreases. If it increases, the added variable is contributing to the model; if it decreases, it is not.
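The adjusted R² check can be sketched in a few lines of plain Python. This is only an illustration: the R² values, sample size, and feature counts below are made-up numbers, not results from any real model.

```python
def adjusted_r2(r2, n_samples, n_features):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n_samples - n_features - 1 <= 0:
        raise ValueError("need n_samples > n_features + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# Hypothetical scenario: a model with 3 features on 50 rows,
# then the same model with a 4th feature added.
base = adjusted_r2(0.800, n_samples=50, n_features=3)

# Case A: the 4th feature raises plain R^2 noticeably.
useful = adjusted_r2(0.830, n_samples=50, n_features=4)

# Case B: the 4th feature barely moves plain R^2.
useless = adjusted_r2(0.801, n_samples=50, n_features=4)

print(f"base={base:.4f} useful={useful:.4f} useless={useless:.4f}")
```

Plain R² never goes down when a feature is added, which is why the comparison is made on the adjusted value: in case B the penalty for the extra feature outweighs the tiny gain in fit.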
We can use the p-value, but feature selection techniques such as RFECV (recursive feature elimination with cross-validation) or lasso regression can also be used. Scatter plots and a correlation heatmap are helpful as well. PCA is another option, but with only 10 independent variables it is not really needed, and it produces combined components rather than selecting among the original variables.
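As a rough sketch of the correlation check (the numbers a heatmap would display), the snippet below ranks features by the absolute Pearson correlation with the target. The feature names and data are invented for illustration. RFECV and lasso would normally come from a library such as scikit-learn and are not shown here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Toy data (hypothetical): x1 drives y, x2 is loosely related, x3 is not.
features = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [2, 1, 4, 3, 6, 5],
    "x3": [3, 5, 2, 6, 1, 4],
}
y = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1]

# Rank features from strongest to weakest absolute correlation with y.
ranking = sorted(features, key=lambda name: abs(pearson(features[name], y)), reverse=True)

for name in ranking:
    print(name, round(pearson(features[name], y), 3))
```

Correlation looks at one variable at a time, so it can miss variables that only matter in combination with others; treat it as a first screen rather than a final answer.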
1) Apart from checking individual p-values, you can also use the step() function in R for
feature selection. The step function builds many different models internally using different
combinations of predictor variables and chooses the model with the lowest AIC.
2) You can use a variable importance plot from decision trees, which shows the relative
importance of each feature with respect to the most important one.
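To make the AIC idea in point 1 concrete, here is a minimal sketch of greedy forward selection by AIC in plain Python. It is not R's step() itself, only a simplified illustration of the same principle; the helper names (solve, ols_aic, forward_select) and the synthetic data are assumptions made for this example. AIC is computed as n*log(RSS/n) + 2k, with k the number of fitted coefficients.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def ols_aic(columns, y):
    """Fit y ~ intercept + columns by least squares and return its AIC."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])  # number of coefficients, intercept included
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)] for a in range(k)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(XtX, Xty)
    rss = sum((y[i] - sum(bj * xj for bj, xj in zip(beta, X[i]))) ** 2 for i in range(n))
    return n * math.log(rss / n) + 2 * k


def forward_select(features, y):
    """Add, one at a time, the feature that lowers AIC most; stop when none does."""
    selected = []
    best_aic = ols_aic([], y)  # intercept-only model
    while True:
        candidates = [f for f in features if f not in selected]
        if not candidates:
            break
        scores = {f: ols_aic([features[s] for s in selected + [f]], y) for f in candidates}
        best = min(scores, key=scores.get)
        if scores[best] >= best_aic:
            break
        selected.append(best)
        best_aic = scores[best]
    return selected, best_aic


# Synthetic data (hypothetical): y depends on x1 and x2; x3 is unrelated.
features = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    "x2": [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8],
    "x3": [2, 7, 1, 8, 2, 8, 1, 8, 2, 8, 4, 5],
}
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.1, -0.2, 0.1]
y = [3 + 2 * a - 1.5 * b + e for a, b, e in zip(features["x1"], features["x2"], noise)]

selected, aic = forward_select(features, y)
print(selected, round(aic, 2))
```

Whether x3 ends up in the final model depends only on whether adding it lowers AIC on this particular sample. In practice you would use R's step() or an equivalent library routine rather than hand-rolled linear algebra.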