PhonePe Interview Question | Model
Question
What happens to our linear regression model if the column z in the data is a sum of columns x and y and some random noise?
Machine Learning
Answers ( 2 )
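If column z in the data is the sum of columns x and y plus some random noise, and all three are used as predictors, then there is multicollinearity in our data: z is almost a linear combination of x and y. The noise keeps the collinearity from being perfect, so the model can still be fitted, but the smaller the noise, the stronger the collinearity. Since z carries little information beyond x and y, including it usually changes the R2 score only slightly, and the adjusted R2 is never higher than R2, because it only penalizes R2 for the number of predictors. What suffers is the individual coefficients for x, y and z, which become unstable and get large standard errors.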
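This is a problem because the coefficients of a linear regression are only well determined when the predictor variables are not strongly correlated with each other. If the degree of correlation is high, it can cause problems when fitting the model and interpreting the results.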
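To make the effect on the coefficients concrete, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the sample size, the noise scale of 0.1, the response (named `target`, since y is already a predictor in the question) and the helper `ols_std_errors` are hypothetical and not part of the original question.

```python
import numpy as np

def ols_std_errors(X, target):
    """Fit OLS with an intercept; return coefficients and their standard errors."""
    A = np.column_stack([np.ones(len(X)), X])           # design matrix, intercept first
    beta, *_ = np.linalg.lstsq(A, target, rcond=None)   # least-squares coefficients
    resid = target - A @ beta
    dof = A.shape[0] - A.shape[1]                        # residual degrees of freedom
    sigma2 = resid @ resid / dof                         # residual variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)                # coefficient covariance matrix
    return beta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = rng.normal(size=n)
z = x + y + rng.normal(scale=0.1, size=n)                # z = x + y + small noise

# Hypothetical response that depends on x and y only (assumed for this demo).
target = 2.0 * x + 3.0 * y + rng.normal(size=n)

_, se_without_z = ols_std_errors(np.column_stack([x, y]), target)
_, se_with_z = ols_std_errors(np.column_stack([x, y, z]), target)

print("SE of the x coefficient without z:", se_without_z[1])
print("SE of the x coefficient with z:   ", se_with_z[1])
```

With z included, the standard error of the x coefficient should come out several times larger than without it, even though the data and the response are the same.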
A key goal of regression analysis is to isolate the relationship between each independent variable and the dependent variable. The interpretation of a regression coefficient is that it represents the mean change in the dependent variable for each 1 unit change in an independent variable when you hold all of the other independent variables constant. That last portion is crucial for our discussion about multicollinearity.
The idea is that you can change the value of one independent variable and not the others. However, when independent variables are correlated, it indicates that changes in one variable are associated with shifts in another variable. The stronger the correlation, the more difficult it is to change one variable without changing another. It becomes difficult for the model to estimate the relationship between each independent variable and the dependent variable independently because the independent variables tend to change in unison.
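One common way to measure how much the predictors "change in unison" is the variance inflation factor (VIF): regress each predictor on the others and compute 1 / (1 - R2). The sketch below is only an illustration on simulated data; the `vif` helper, the sample size and the noise scale are assumptions, and the usual cut-off of about 10 is a rule of thumb rather than a strict rule.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X: 1 / (1 - R2_j)."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        # Regress column j on all the other columns (plus an intercept).
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
        resid = X[:, j] - others @ coef
        centered = X[:, j] - X[:, j].mean()
        r2 = 1.0 - (resid @ resid) / (centered @ centered)
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(42)
n = 1000
x = rng.normal(size=n)
y = rng.normal(size=n)
z = x + y + rng.normal(scale=0.1, size=n)   # z = x + y + small noise

vif_xy = vif(np.column_stack([x, y]))        # x and y alone
vif_xyz = vif(np.column_stack([x, y, z]))    # x, y and z together

print("VIF for x, y:   ", vif_xy)
print("VIF for x, y, z:", vif_xyz)
```

For x and y alone the values should be close to 1, while adding z should push all three far above the usual threshold. Dropping z (or one of x and y) brings them back down.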