What’s the trade-off between bias and variance?
Answers (3)
The bias-variance decomposition expresses the expected prediction error of any learning algorithm as the sum of three terms: the (squared) bias, the variance, and an irreducible error due to noise in the underlying dataset.
In general, if you make the model more complex and add more variables, you reduce bias but increase variance. To minimize the total error, you have to trade bias off against variance; neither high bias nor high variance is desirable.
High-bias, low-variance algorithms train models that are consistent but inaccurate on average.
High-variance, low-bias algorithms train models that are accurate on average but inconsistent.
Bias reflects the assumptions the model makes about the data. High bias means the model is too simplistic: a simple approach is used where the data demands a more complex function to fit. This is called underfitting.
High variance means the model fits the training data very well but fails to perform well on test data: the model uses a complicated function where a simpler approach is required. This is called overfitting.
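Underfitting and overfitting can be made concrete by varying model complexity and comparing training and test error. A minimal sketch, where the target function, noise level, and the three polynomial degrees are illustrative assumptions:

```python
import numpy as np

# Sketch: under- vs over-fitting shown with polynomial degree.
# Target function, noise level, and degrees are illustrative choices.
rng = np.random.default_rng(1)
f = lambda x: np.sin(2 * np.pi * x)
x_train = np.sort(rng.uniform(0, 1, 30))
y_train = f(x_train) + rng.normal(0, 0.2, 30)
x_test = np.sort(rng.uniform(0, 1, 30))
y_test = f(x_test) + rng.normal(0, 0.2, 30)

def train_test_mse(degree):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_mse, test_mse

# degree 1 underfits (high bias), degree 15 overfits (high variance)
errs = {d: train_test_mse(d) for d in (1, 4, 15)}
for d, (tr, te) in errs.items():
    print(f"degree {d:2d}: train MSE {tr:.3f}, test MSE {te:.3f}")
```

The underfit (degree-1) model has high error on both sets, while the overfit (degree-15) model drives training error down without the test error following.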
The bias-variance trade-off lies at the heart of every machine learning algorithm.
The aim of any machine learning algorithm is to have both low variance and low bias.
What often happens is that we train the model in such a way that it learns too much from the data and also picks up the noise in the data.
In such cases, the model may have fantastic accuracy on the training data, but the moment
it is exposed to test data, i.e. data it has not encountered before, it does not give
good predictions. Such a model is said to have high variance.
To tackle this issue, we need to introduce a little bit of bias into the model: our accuracy
may suffer on the training set, but the model will give good predictions overall on data it has not
encountered before.
To give an example, consider Linear Regression: hardly any relationships in the
real world are perfectly linear, but by assuming linearity we introduce some bias and, in return, get a
model that gives consistently good predictions on unseen data.
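The idea of deliberately introducing bias is exactly what regularization does. Below is a minimal NumPy sketch of ridge regression, which shrinks coefficients toward zero; the penalty strength `lam` and the correlated-feature setup are illustrative assumptions, not part of the original answer.

```python
import numpy as np

# Sketch: ridge regression trades a little bias (shrinkage toward zero)
# for a large reduction in coefficient variance. lam is an assumed value.
rng = np.random.default_rng(2)
n, d = 40, 10
w_true = rng.normal(0, 1, d)

def ridge_fit(X, y, lam):
    # closed-form ridge solution: (X^T X + lam I)^{-1} X^T y
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def coef_variance(lam, runs=200):
    ws = np.empty((runs, d))
    for i in range(runs):
        # strongly correlated features make plain least squares unstable
        X = rng.normal(0, 1, (n, 1)) + 0.1 * rng.normal(0, 1, (n, d))
        y = X @ w_true + rng.normal(0, 0.5, n)
        ws[i] = ridge_fit(X, y, lam)
    return ws.var(axis=0).mean()   # avg variance of the fitted coefficients

v_ols = coef_variance(lam=0.0)     # unbiased but unstable (high variance)
v_ridge = coef_variance(lam=5.0)   # biased but far more stable
print("OLS coefficient variance:  ", v_ols)
print("Ridge coefficient variance:", v_ridge)
```

With `lam=0` this reduces to ordinary least squares; the penalized fit is biased toward zero but far less sensitive to which training sample was drawn.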
To get a predictive model with high accuracy on unseen data, we have to build a model that has neither high bias nor high variance, which means we have to trade one off against the other. If we increase the complexity of the algorithm, it overfits the training data and doesn't generalize well to an unseen dataset: it has high variance. If, on the other hand, we build a model so simple that it gives only a rough average fit, it is of no use for prediction, with low accuracy on both train and test data: this is known as a high-bias model. So to get a good predictive model that can be used for making predictions on real-world problems, we have to trade off bias and variance.