Walmart Labs Interview Question | Missing Values
Question
What will you do if removing missing values from a dataset causes bias? Is there any other alternative for the same?
Statistics
Answers ( 3 )
Yes. We can fill the missing values with the mean, median, or mode, or use KNN imputation.
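As a rough sketch of these fills, assuming pandas and scikit-learn are available: the toy DataFrame and its column names are made up for illustration, and `n_neighbors=2` is an arbitrary choice.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

# Toy data (made up for illustration); NaN / None mark missing values.
df = pd.DataFrame({
    "age":    [25.0, 30.0, np.nan, 40.0, 35.0],
    "income": [40.0, 50.0, 60.0, np.nan, 70.0],
    "city":   ["A", "B", "A", None, "A"],
})

# Mean and median imputation for numeric columns.
age_mean = df["age"].fillna(df["age"].mean())
income_median = df["income"].fillna(df["income"].median())

# Mode imputation suits categorical columns.
# mode() can return several values on ties, so take the first.
city_mode = df["city"].fillna(df["city"].mode()[0])

# KNN imputation: fill each gap with the average of that column over the
# k nearest rows, where distance uses only the features both rows have.
knn = KNNImputer(n_neighbors=2)
numeric_knn = pd.DataFrame(
    knn.fit_transform(df[["age", "income"]]),
    columns=["age", "income"],
)
```

Mean and median fills shrink the variance of the column, so they are usually a baseline rather than a final answer.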
We can also treat the column with missing values as the target variable and train a machine learning model on the other features to predict the missing entries.
We can start by inspecting why the values are missing.
Values can be missing completely at random (MCAR), missing at random (MAR), or missing not at random (MNAR).
When values are missing completely at random, removing them generally won't create any bias.
In the other cases, depending on the scenario, we can impute them using the mean, median, KNN imputation, etc.
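A small simulation can make the difference concrete. This is only a sketch on synthetic data: the `income` variable, the 30% missing rate, and the logistic missingness rule are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic variable (assumed for illustration).
income = rng.normal(loc=50.0, scale=10.0, size=100_000)

# Missing completely at random: every value has the same 30% chance
# of being missing, regardless of anything else.
mcar_missing = rng.random(income.size) < 0.3

# Missing not at random: the chance of being missing grows with the
# value itself, so high incomes are dropped more often.
p_missing = 1.0 / (1.0 + np.exp(-(income - 50.0) / 5.0))
mnar_missing = rng.random(income.size) < p_missing

true_mean = income.mean()
mcar_mean = income[~mcar_missing].mean()  # stays close to the true mean
mnar_mean = income[~mnar_missing].mean()  # pulled well below the true mean

print(f"true: {true_mean:.2f}  MCAR: {mcar_mean:.2f}  MNAR: {mnar_mean:.2f}")
```

Dropping rows leaves the estimate roughly unchanged in the first case and biases it downward in the second, which is why the reason for missingness should be checked before deleting anything.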
Two common ways of handling missing values:
1. Imputation with the mean, median, or mode
2. Predicting the missing values by using predictive modeling
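The second option can be sketched as follows, assuming NumPy only. The data is synthetic, and ordinary least squares stands in for whatever model fits the problem; any regressor or classifier could take its place.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): y depends linearly on two features.
X = rng.normal(size=(200, 2))
y = 3.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Hide about 20% of y to simulate missing entries.
missing = rng.random(200) < 0.2
y_obs = np.where(missing, np.nan, y)

# Fit ordinary least squares on the rows where the target is observed.
A = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
coef, *_ = np.linalg.lstsq(A[~missing], y_obs[~missing], rcond=None)

# Predict the missing entries from the other features and fill them in.
y_filled = y_obs.copy()
y_filled[missing] = A[missing] @ coef
```

This works well only when the other features actually carry information about the missing column; here that holds by construction.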