Story of Bias, Variance, Bias-Variance Trade-Off

Why do we predict?
We predict in order to anticipate future trends from the sample data we have. Whenever we build a model, we are trying to derive a formula from that sample, and the aim is for the formula to hold not just for the sample but for every case we have not yet seen.
Mathematicians and statisticians all across the globe try to build models that can answer such future questions as accurately as possible.

Thus we create a model, and this model is bound to have some error. Why? Because no single formula can cover every possible combination of inputs. The difference between the actual and the predicted value is called the prediction error.

Bias – It is the difference between the average prediction of the model and the actual values. A model with HIGH bias is too simple: it makes strong assumptions about the data, so its predictions are far from the actual values on both the training and the test data set (under-fitting)

Examples of low-bias machine learning algorithms include: Decision Trees, k-Nearest Neighbors and SVM.

Examples of high-bias machine learning algorithms include: Linear Regression, Linear Discriminant Analysis and Logistic Regression

Variance – Variance measures how much the model’s predictions change when it is trained on a different sample of the data. A model with high variance fits its training data set so closely that it tries to pass through every point, noise included, which results in high training accuracy but low test accuracy (over-fitting)

Examples of low-variance machine learning algorithms include: Linear Regression, Linear Discriminant Analysis and Logistic Regression.

Examples of high-variance machine learning algorithms include: Decision Trees, k-Nearest Neighbors and Support Vector Machines.

A simple model with high bias (left) and a complicated model with high variance (right)

As you can see, the line on the right tries to cover all the points, so it creates a complicated model which is very accurate on the training data set.

Let’s see what an under-fitting, an over-fitting, and a good model look like

As you can see, high variance occurs in a model that tries to create a complicated formula from the training data set.
A high-bias model is very generic. In other words, it just produces some rough average without really following the data.
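
To make the contrast concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the data comes from a made-up sine curve with noise, the "generic" high-bias model simply predicts the average of the training targets, and the high-variance model is a 1-nearest-neighbour predictor (k-NN is listed above among the high-variance algorithms). The function names are hypothetical.

```python
import math
import random


def make_data(n, rng, noise_sd=0.3):
    """Draw n points from a made-up curve y = sin(2*pi*x) plus Gaussian noise."""
    xs = [rng.random() for _ in range(n)]
    ys = [math.sin(2 * math.pi * x) + rng.gauss(0, noise_sd) for x in xs]
    return xs, ys


def mean_model(train_x, train_y):
    """High-bias model: ignores x and always predicts the average training y."""
    avg = sum(train_y) / len(train_y)
    return lambda x: avg


def one_nn_model(train_x, train_y):
    """High-variance model: copies the y of the single closest training point."""
    pairs = list(zip(train_x, train_y))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared error of a model on the points (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)
train_x, train_y = make_data(30, rng)
test_x, test_y = make_data(200, rng)

for name, fit in [("mean (high bias)", mean_model),
                  ("1-NN (high variance)", one_nn_model)]:
    model = fit(train_x, train_y)
    print(name,
          "| train MSE:", round(mse(model, train_x, train_y), 3),
          "| test MSE:", round(mse(model, test_x, test_y), 3))
```

The 1-NN model reproduces every training point exactly, so its training error is zero while its test error is not. The mean model has a similar, clearly non-zero error on both sets.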

If you want to understand the mathematics behind these errors, the formula is given below

Total Error = (Bias)^2 + Variance + Irreducible Error

The above formula has 3 terms: the first term is the squared bias, the second is the variance, and the third is the irreducible error.
No matter what, you can’t remove the irreducible error. It is a measure of the noise in the data, and you can’t have a noiseless data set.

When you have a very limited data set and fit a simple model to it, there is a high chance of under-fitting (High Bias and Low Variance)
When you have very noisy data, a complicated model may end up fitting the noise, which results in over-fitting on the training data set (High Variance and Low Bias)

What is the bias-variance trade-off?
The bias-variance trade-off means balancing bias against variance so that the overall error (formula above) is as small as possible

Error = Reducible Error+Irreducible Error
Reducible error = (Bias)^2 + Variance
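
Written out in full for a single input x, the two lines above take the following standard form. The notation is an assumption added here and is not used in the original post: f is the true function, \hat{f} is the model fitted on a random training set, the observed value is y = f(x) + \varepsilon with noise of mean zero and variance \sigma^2, and the expectation is taken over training sets and noise.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{Bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{Variance}}
  + \underbrace{\sigma^2}_{\text{Irreducible error}}
```

The first two terms together are the reducible error; the last term does not depend on the model at all.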

Estimated Mean Squared Error (picture from Quora)

Let’s try to simplify the formulas for Bias and Variance
Bias = E[Estimated target] - Target
Variance of estimates = E[(Estimated target - E[Estimated target])^2]
Here E[...] denotes the average over models fitted on different training data sets. The variance error measures how much our estimate of the target function would differ if new training data were used.
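
These two quantities can be estimated by simulation: refit the model on many fresh training sets and look at its predictions at one fixed input. The sketch below does this in plain Python under stated assumptions: the "true" target function is a made-up sine curve, the input x0 = 0.25 is arbitrary, and the two models (a mean predictor and a 1-nearest-neighbour predictor) are hypothetical stand-ins for a simple and a flexible model.

```python
import math
import random


def true_f(x):
    # Assumed "true" target function, invented for this illustration
    return math.sin(2 * math.pi * x)


def sample_training_set(n, rng, noise_sd=0.3):
    """Draw a fresh training set of n noisy points from true_f."""
    xs = [rng.random() for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0, noise_sd) for x in xs]
    return xs, ys


def predict_mean(xs, ys, x0):
    # Simple model: the average of all training targets, whatever x0 is
    return sum(ys) / len(ys)


def predict_1nn(xs, ys, x0):
    # Flexible model: the target of the training point nearest to x0
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x0))
    return ys[nearest]


def bias_and_variance(predict, x0, n_sets=500, n_points=30, seed=0):
    """Refit on many fresh training sets and summarise the predictions at x0."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_sets):
        xs, ys = sample_training_set(n_points, rng)
        preds.append(predict(xs, ys, x0))
    avg_pred = sum(preds) / len(preds)
    bias = avg_pred - true_f(x0)  # E[estimated target] - target
    variance = sum((p - avg_pred) ** 2 for p in preds) / len(preds)
    return bias, variance


for name, predict in [("mean", predict_mean), ("1-NN", predict_1nn)]:
    bias, variance = bias_and_variance(predict, x0=0.25)
    print(f"{name}: bias^2 = {bias ** 2:.3f}, variance = {variance:.3f}")
```

In this setup the mean predictor should show the larger squared bias and the 1-NN predictor the larger variance, which is the trade-off described in this post.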

To keep all the error terms non-negative, we use the squared bias, the variance (which is itself an average of squared values) and the irreducible error, which is the variance of the noise

The bias-variance trade-off is the property of a set of predictive models whereby models with a lower bias in parameter estimation have a higher variance of the parameter estimates across samples, and vice versa.

How do we actually manage the bias-variance trade-off?
There are multiple methods for handling the bias-variance trade-off
-Separate training and testing dataset
-Cross-Validation
-Good Performance metrics
-Fitting model parameters
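
As one way to combine the methods listed above, the sketch below uses 5-fold cross-validation with mean squared error as the metric to pick the parameter k of a k-nearest-neighbour regressor. It is an illustration under assumptions, not a recipe from this post: the data is a made-up noisy sine curve, the candidate values of k are arbitrary, and the function names are hypothetical. A small k gives a flexible, high-variance model; a large k gives a smoother, high-bias one.

```python
import math
import random


def make_data(n, seed=0, noise_sd=0.3):
    """Made-up data: points on a sine curve plus Gaussian noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random()
        data.append((x, math.sin(2 * math.pi * x) + rng.gauss(0, noise_sd)))
    return data


def knn_predict(train, x0, k):
    """Average the targets of the k training points closest to x0."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x0))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def cross_val_mse(data, k, n_folds=5):
    """Mean squared error of k-NN, averaged over n_folds held-out folds."""
    fold_errors = []
    for fold in range(n_folds):
        # Fold membership is decided by the position of each point in the list
        test = [p for i, p in enumerate(data) if i % n_folds == fold]
        train = [p for i, p in enumerate(data) if i % n_folds != fold]
        errors = [(knn_predict(train, x, k) - y) ** 2 for x, y in test]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / n_folds


data = make_data(100)
candidates = [1, 3, 5, 9, 15, 40]
scores = {k: cross_val_mse(data, k) for k in candidates}
best_k = min(scores, key=scores.get)

print("CV error per k:", {k: round(v, 3) for k, v in scores.items()})
print("chosen k:", best_k)
```

Because each fold is scored on points the model did not see while fitting, the chosen k reflects test behaviour rather than training accuracy.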

Keep Learning 🙂
The Data Monk

About TheDataMonkNewbie

I am the Co-Founder of The Data Monk. I have a total of 4+ years of analytics experience with 3+ years at Mu Sigma and 1 year at OYO. I am an active trader and a logically sarcastic idiot :)
