Model with high R^2 and low performance in Linear and Logistic Regression
Question
Is it possible to have a Linear or Logistic Regression model with a high R^2 but low performance or usefulness in your project?
Machine Learning
Answer ( 1 )
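R-squared is the proportion of the variation in the dependent variable that the model explains. The value in your statistical output is an estimate of the population value, computed from your sample. As with other estimates in inferential statistics, you want your R-squared estimate to be close to the population value. Note that for logistic regression the reported figure is usually a pseudo-R-squared (for example McFadden's), which is not directly comparable to the R-squared of a linear regression.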
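For reference, the usual textbook definitions for linear regression can be written as below. The symbols are assumptions introduced here for illustration and do not come from the original answer: y_i are the observed values, \hat{y}_i the fitted values, \bar{y} the sample mean, n the number of observations and p the number of predictors.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
R^2_{\mathrm{adj}} = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}
```

The adjusted version penalizes the number of predictors, which is one common way to reduce the optimism of the plain sample R-squared.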
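It is possible to have a high R-squared but low performance because:
1. R-squared is a biased estimate: the sample value tends to overstate the population value, especially with few observations or many predictors.
2. The model may be overfit: it fits noise in the training sample, so R-squared is high on that sample while predictions on new data are poor.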
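3. A variable x may be correlated with the output variable in your sample without that relationship holding on new data or being useful for the goal of your project.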
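The first two points can be illustrated with a small simulation. This is only a sketch under stated assumptions, not part of the original answer: the predictors and the target are pure random noise, the sample sizes (30 training rows, 25 predictors) are chosen arbitrarily to exaggerate the effect, and the helper names `r_squared`, `adjusted_r_squared` and `add_intercept` are made up for this example. Ordinary least squares is fitted with NumPy only.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Plain R^2: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Adjusted R^2, which penalizes the number of predictors."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


def add_intercept(X):
    """Prepend a column of ones so the fit includes an intercept."""
    return np.column_stack([np.ones(len(X)), X])


# Assumption for illustration: y is unrelated to X, so any "fit" is noise.
rng = np.random.default_rng(0)
n_train, n_test, n_predictors = 30, 1000, 25

X_train = rng.normal(size=(n_train, n_predictors))
y_train = rng.normal(size=n_train)
X_test = rng.normal(size=(n_test, n_predictors))
y_test = rng.normal(size=n_test)

# Ordinary least squares on the small training sample.
coef, *_ = np.linalg.lstsq(add_intercept(X_train), y_train, rcond=None)

train_r2 = r_squared(y_train, add_intercept(X_train) @ coef)
test_r2 = r_squared(y_test, add_intercept(X_test) @ coef)
adj_r2 = adjusted_r_squared(train_r2, n_train, n_predictors)

print(f"train R^2:    {train_r2:.3f}")
print(f"adjusted R^2: {adj_r2:.3f}")
print(f"test R^2:     {test_r2:.3f}")
```

With almost as many predictors as observations, the training R-squared comes out high even though there is nothing to learn, while the R-squared on fresh data drops sharply. Comparing the training value against a held-out value (or against the adjusted R-squared) is one simple way to spot this situation.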