Book My Show Interview Question | Dimensionality
Question
What Are Some Methods of Reducing Dimensionality?
(Hint: describe each method, along with some practical examples)
Machine Learning
Answer ( 1 )
1) Missing values ratio – Columns whose share of missing values exceeds a chosen
threshold are removed, as they carry little usable information.
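2) Low variance in columns – A column with very low variance is nearly constant, so it
provides hardly any useful information for predicting the target variable. Variance
depends on scale, so compare columns only after putting them on a comparable scale.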
3) Highly correlated features – If 2 features are highly correlated, one of them
can be removed as it adds little information beyond what the other already provides.
4) Variable Importance plot – When building decision trees (or tree ensembles) you can
get a variable importance plot which shows the importance of every feature relative to
the strongest predictor. You can then select the important features from this
plot by setting a threshold.
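5) Principal Component Analysis – This technique projects the data onto a smaller set of
uncorrelated components, each a linear combination of the original features. Highly
correlated features end up largely in the same component, so keeping only the components
that explain most of the variance reduces the dimensionality.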
6) Backward selection – We train the model on all n features. Then, at each step, we
remove the one feature whose removal leads to the smallest increase in the error rate,
and repeat until the error rises noticeably or the desired number of features remains.
7) Forward Selection – This is the reverse of backward selection.
We start with 1 feature and keep adding features, one at a time, each time choosing the
feature that improves performance the most, and stop when no addition helps.
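
As a practical example, methods 1 to 3 can be sketched in a few lines of plain Python. This is only a minimal illustration, not a standard implementation: the function name `filter_features`, the three thresholds and the toy data are all made up for the example, and in a real project you would more likely do the same thing with pandas or scikit-learn.

```python
import math

def missing_ratio(values):
    """Fraction of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)

def variance(values):
    """Population variance of the non-missing entries (0.0 if none)."""
    present = [v for v in values if v is not None]
    if not present:
        return 0.0
    mean = sum(present) / len(present)
    return sum((v - mean) ** 2 for v in present) / len(present)

def pearson(xs, ys):
    """Pearson correlation over rows where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        return 0.0
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x, _ in pairs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for _, y in pairs))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)

def filter_features(columns, max_missing=0.5, min_variance=1e-8, max_corr=0.95):
    """Hypothetical helper: apply the three filters in order.

    The thresholds are illustrative assumptions, not recommended defaults.
    """
    kept = []
    for name, values in columns.items():
        if missing_ratio(values) > max_missing:   # method 1
            continue
        if variance(values) < min_variance:       # method 2
            continue
        # method 3: drop a column that duplicates one we already kept
        if any(abs(pearson(values, columns[k])) > max_corr for k in kept):
            continue
        kept.append(name)
    return kept

# Toy data, invented for this example.
columns = {
    "age": [25, 32, 47, 51, 38, 29],
    "age_months": [300, 384, 564, 612, 456, 348],   # exactly 12 * age
    "constant": [1, 1, 1, 1, 1, 1],                  # zero variance
    "mostly_missing": [None, None, None, 4.0, None, None],
    "income": [30, 52, 41, 75, 38, 60],
}

kept = filter_features(columns)
print(kept)  # ['age', 'income']
```

Here `mostly_missing` is removed by the missing-values check, `constant` by the variance check, and `age_months` because it is perfectly correlated with `age`. Which of two correlated columns survives depends only on column order in this sketch; in practice you would keep whichever is easier to interpret or has fewer missing values.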