JIO Interview Question | Gender
Question
If you’re attempting to predict a customer’s gender, and you only have 100 data points, what problems could arise?
Answers ( 2 )
– Since the dataset is small, the results may not generalize to the wider customer base.
– If the dataset is imbalanced, we won't be able to predict the minority class accurately.
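To make the imbalance point concrete, here is a minimal Python sketch of a "model" that always predicts the majority class. The 90/10 split and the labels "F" and "M" are made-up illustrative values, not data from the question, and `majority_baseline` is a hypothetical helper name.

```python
from collections import Counter

def majority_baseline(labels):
    """Score a trivial model that always predicts the most common label.

    Returns (majority_label, accuracy, recall_per_class).
    """
    counts = Counter(labels)
    majority_label, majority_count = counts.most_common(1)[0]
    accuracy = majority_count / len(labels)
    # The majority class is always predicted, so its recall is 1.0;
    # every other class is never predicted, so its recall is 0.0.
    recall = {label: (1.0 if label == majority_label else 0.0) for label in counts}
    return majority_label, accuracy, recall

# Assumed example: 100 customers with a 90/10 gender split (illustrative only).
labels = ["F"] * 90 + ["M"] * 10
majority_label, accuracy, recall = majority_baseline(labels)
print(majority_label, accuracy, recall)
```

The overall accuracy looks high even though the minority class is never predicted correctly, which is why per-class metrics such as recall are usually checked alongside accuracy on imbalanced data.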
Admins/moderators can add further points; any feedback or suggestions are appreciated.
1. Unbalanced data – for example, one gender might make up 90% of the data. A model could then reach about 90% accuracy by always predicting the majority gender while misclassifying nearly every minority-class customer.
2. Incomplete data – some users do not want to state their gender, so some of those 100 data points may have null labels, leaving even fewer usable examples.
3. Insufficient size – a sample of only 100 is unlikely to be enough to build a robust model. Variance may be high (overfitting), because a model fitted closely to so few points tends to capture noise rather than real patterns.
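The sample-size problem also affects evaluation, not just training. As a rough sketch, the Python snippet below computes a Wilson score interval for accuracy measured on a held-out set. The numbers are assumptions for illustration: 20 of the 100 points held out for testing with 16 predicted correctly, compared with a hypothetical 2,000-point test set at the same 80% accuracy. `wilson_interval` is a helper written for this example, not a library function.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

# Assumed scenario: 16 of 20 held-out customers classified correctly.
small_low, small_high = wilson_interval(16, 20)
# Same observed accuracy on a hypothetical test set 100 times larger.
large_low, large_high = wilson_interval(1600, 2000)

print(f"n=20:   {small_low:.2f} to {small_high:.2f}")
print(f"n=2000: {large_low:.2f} to {large_high:.2f}")
```

With so few test points the interval is wide, so an observed 80% accuracy says little about how the model would perform on new customers. Cross-validation can reuse the data more efficiently, but it does not remove the underlying uncertainty.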