Answer (1)

  1. A decision tree starts by splitting on the most important predictor and then
    keeps splitting on the predictors that give the greatest reduction in RSS
    (Residual Sum of Squares) for regression trees, or the greatest reduction in the
    classification error rate (alternatives are the Gini index and entropy) for classification trees.

    If no stopping criterion is imposed, the tree will keep growing until it
    covers every case in the training set. This causes the tree to overfit: it performs
    well on the training data but poorly on test data.

    To tackle this issue, we impose a stopping criterion when building the tree, for
    example requiring a minimum reduction in RSS for each split, a maximum number of splits, or a maximum depth.
    This keeps the tree from growing too large, so it generalizes better to unseen data.
    Another method is to let the tree grow to full size and then remove the branches that don't provide
    much value. This is called pruning.
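    The two approaches above can be sketched with scikit-learn (a minimal, illustrative
    example; the synthetic dataset and the parameter values are assumptions, not tuned
    recommendations):

    ```python
    # Illustrative sketch: early stopping vs. pruning for a regression tree.
    # Dataset and hyperparameter values are made up for demonstration only.
    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(0)
    X = rng.uniform(0, 10, size=(200, 1))
    y = np.sin(X).ravel() + rng.normal(0, 0.3, size=200)

    # Unrestricted tree: keeps splitting until leaves are (nearly) pure -> overfits.
    full = DecisionTreeRegressor(random_state=0).fit(X, y)

    # Stopping criteria ("pre-pruning"): cap the depth and require a minimum
    # impurity (RSS) reduction before a split is accepted.
    stopped = DecisionTreeRegressor(
        max_depth=4, min_impurity_decrease=0.01, random_state=0
    ).fit(X, y)

    # Pruning ("post-pruning"): grow the full tree, then cut back branches whose
    # contribution does not justify their complexity (cost-complexity pruning).
    pruned = DecisionTreeRegressor(ccp_alpha=0.01, random_state=0).fit(X, y)

    print(full.get_n_leaves(), stopped.get_n_leaves(), pruned.get_n_leaves())
    ```

    Both the stopped and the pruned tree end up with far fewer leaves than the
    unrestricted tree, which is what allows them to generalize better.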
