Decision trees
Question
What is pruning in decision trees?
Machine Learning
Answer ( 1 )
A decision tree is grown greedily: at each node it picks the predictor and cut point
that give the largest reduction in RSS (Residual Sum of Squares) for regression trees,
or the largest reduction in classification error rate (alternatives are the Gini index
and entropy) for classification trees, and then repeats the process on each resulting branch.
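To make the RSS criterion concrete, here is a minimal sketch of how one split on one numeric predictor could be chosen. The function names `rss` and `best_split` and the toy data are made up for illustration, and real libraries do this far more efficiently.

```python
def rss(values):
    """Residual sum of squares of `values` around their own mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, rss_reduction) for the best single split on x."""
    parent = rss(y)
    best_threshold, best_gain = None, 0.0
    xs = sorted(set(x))
    # Candidate thresholds: midpoints between consecutive distinct values.
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        gain = parent - (rss(left) + rss(right))
        if gain > best_gain:
            best_threshold, best_gain = t, gain
    return best_threshold, best_gain


# Toy data (hypothetical): two clearly separated groups.
x = [1, 2, 3, 10, 11, 12]
y = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
threshold, gain = best_split(x, y)
print(threshold, gain)
```

On this toy data the chosen threshold falls between the two groups (at 6.5), because that split removes almost all of the variation in `y`.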
If no stopping criterion is imposed, the tree keeps growing until it covers
every case in the training set. Such a tree overfits: it performs well on the
training data but usually performs worse on unseen test data.
To tackle this, we can impose a stopping criterion while building the tree, such as a
minimum reduction in RSS per split, a maximum number of splits, or a maximum depth.
The tree then stays small and is better able to generalize beyond the training set.
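The stopping criteria above can be sketched as checks inside a recursive tree builder. This is only an illustration on a single predictor; the parameter names `max_depth`, `min_samples` and `min_gain` are assumptions chosen for this sketch, not the API of any particular library.

```python
def rss(values):
    """Residual sum of squares of `values` around their own mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def build(x, y, depth=0, max_depth=2, min_samples=2, min_gain=0.5):
    """Grow a one-predictor regression tree, stopping early when a criterion fires."""
    leaf = {"value": sum(y) / len(y)}
    # Stopping criteria 1 and 2: depth limit reached, or too few samples to split.
    if depth >= max_depth or len(y) < min_samples:
        return leaf
    best = None
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [(a, b) for a, b in zip(x, y) if a <= t]
        right = [(a, b) for a, b in zip(x, y) if a > t]
        gain = rss(y) - rss([b for _, b in left]) - rss([b for _, b in right])
        if best is None or gain > best[0]:
            best = (gain, t, left, right)
    # Stopping criterion 3: the best split does not reduce RSS enough.
    if best is None or best[0] < min_gain:
        return leaf
    _, t, left, right = best
    return {
        "threshold": t,
        "left": build(*zip(*left), depth + 1, max_depth, min_samples, min_gain),
        "right": build(*zip(*right), depth + 1, max_depth, min_samples, min_gain),
    }


def count_leaves(node):
    """Count the terminal nodes of a tree built by `build`."""
    if "value" in node:
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])


# Toy data (hypothetical): two clearly separated groups with small noise.
x = [1, 2, 3, 10, 11, 12]
y = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
stopped_early = build(x, y, max_depth=5, min_gain=0.5)
fully_grown = build(x, y, max_depth=10, min_gain=0.0)
print(count_leaves(stopped_early), count_leaves(fully_grown))
```

With the stricter `min_gain`, the tree stops after the one split that separates the two groups (2 leaves). With `min_gain=0.0` it keeps splitting on noise until every training point has its own leaf (6 leaves).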
Another approach is to let the tree grow to its full size and then remove the branches that add
little value (for example, branches that do not improve accuracy on held-out data). This is called pruning.
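One simple way to decide which branches "don't provide much value" is reduced-error pruning: grow the full tree, then, working bottom-up, collapse any subtree into a leaf if doing so does not increase the error on a separate validation set. The sketch below assumes that technique (cost-complexity pruning is another common choice), and all names and data in it are illustrative.

```python
def rss(values):
    """Residual sum of squares of `values` around their own mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def grow(points):
    """Grow a full one-predictor regression tree from (x, y) pairs."""
    ys = [y for _, y in points]
    node = {"value": sum(ys) / len(ys)}  # mean is kept at every node for pruning
    xs = sorted({x for x, _ in points})
    best = None
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [p for p in points if p[0] <= t]
        right = [p for p in points if p[0] > t]
        cost = rss([y for _, y in left]) + rss([y for _, y in right])
        if best is None or cost < best[0]:
            best = (cost, t, left, right)
    # Split whenever it lowers the training RSS at all (no early stopping).
    if best is not None and best[0] < rss(ys):
        _, t, left, right = best
        node["threshold"] = t
        node["left"], node["right"] = grow(left), grow(right)
    return node


def predict(node, x):
    """Follow the splits down to a leaf and return its mean."""
    while "threshold" in node:
        node = node["left"] if x <= node["threshold"] else node["right"]
    return node["value"]


def prune(node, val_points):
    """Reduced-error pruning using the validation points that reach `node`."""
    if "threshold" not in node:
        return node
    t = node["threshold"]
    node["left"] = prune(node["left"], [p for p in val_points if p[0] <= t])
    node["right"] = prune(node["right"], [p for p in val_points if p[0] > t])
    subtree_err = sum((y - predict(node, x)) ** 2 for x, y in val_points)
    leaf_err = sum((y - node["value"]) ** 2 for x, y in val_points)
    # Collapse the subtree if a single leaf does at least as well.
    if leaf_err <= subtree_err:
        return {"value": node["value"]}
    return node


def count_leaves(node):
    if "threshold" not in node:
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])


# Toy data (hypothetical): training noise that the validation set does not share.
train = [(1, 1.0), (2, 1.2), (3, 0.8), (10, 5.0), (11, 5.2), (12, 4.8)]
validation = [(1.5, 1.0), (2.5, 1.0), (10.5, 5.0), (11.5, 5.0)]

tree = grow(train)
leaves_before = count_leaves(tree)
pruned = prune(tree, validation)
leaves_after = count_leaves(pruned)
print(leaves_before, leaves_after)
```

Here the fully grown tree has one leaf per training point, and pruning keeps only the split between the two groups, since the deeper splits were fitting noise that the validation data does not confirm.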