AnswerTree®

How decision tree results are different in AnswerTree 3.1/3.0

AnswerTree 3.1/3.0 applies consistent rules for handling ties between predictors when determining splits. Earlier versions of AnswerTree did not always follow consistent rules in case of a tie. The most visible result is that some trees developed in previous versions of AnswerTree will look different if rebuilt in AnswerTree 3.1/3.0.

Because these tie-breaking rules apply only in more extreme situations and are often arbitrary decisions, the results in any version are equally valid. For example, in C&RT, the order in which the predictors were added to the model (an arbitrary rule) determines the split. The difference is that you can now understand and explain how a split is made in these situations. In such cases it may also be worth using AnswerTree's interactive tree-building features to choose the predictor yourself, since you may be more interested in including one attribute than another, equally significant predictor.

The algorithms use different rules to determine which predictor to use. For example, with a continuous target, CHAID chooses the predictor whose split gives the most significantly different groups of the target. For a categorical target, CHAID selects the predictor whose measure of independence with the target is smallest. (See Appendix E in the AnswerTree User's Guide for details.) If there is a "tie" between two predictors, AnswerTree must choose one predictor over the other. In 3.1/3.0 the rules are always consistent:

For CHAID & Exhaustive CHAID:

  1. The split rule with the lower p-value is listed first
  2. In case of a tie, the rule with the higher F (or Chi-Square) is listed first
  3. In case of a tie, the rule with the lower DF is listed first
  4. In case of a tie, the rule whose associated predictor variable was added to the model first is listed first
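
The four CHAID rules above amount to a single lexicographic sort. The following is a minimal sketch of that ordering, not AnswerTree's actual implementation: the `ChaidSplit` record, its field names, and the `chaid_rank` function are hypothetical, and ties are detected by exact equality of the stored values, which is an assumption.

```python
from typing import NamedTuple


class ChaidSplit(NamedTuple):
    """Hypothetical record for one candidate split rule."""
    predictor: str
    p_value: float
    statistic: float   # F or Chi-Square, depending on the target type
    df: int
    order_added: int   # position at which the predictor was added to the model


def chaid_rank(splits):
    """Order candidate splits by the four rules listed above.

    Sort key: lower p-value, then higher statistic, then lower DF,
    then earlier position in the model.
    """
    return sorted(
        splits,
        key=lambda s: (s.p_value, -s.statistic, s.df, s.order_added),
    )


# Illustrative values only; they do not come from any real data set.
chaid_candidates = [
    ChaidSplit("age", 0.01, 12.0, 2, 0),
    ChaidSplit("income", 0.01, 12.0, 1, 1),
    ChaidSplit("region", 0.01, 15.0, 3, 2),
    ChaidSplit("gender", 0.05, 20.0, 1, 3),
]
chaid_order = [s.predictor for s in chaid_rank(chaid_candidates)]
print(chaid_order)
```

Here `region` wins the three-way p-value tie on the higher statistic, and `income` precedes `age` because its DF is lower.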

For QUEST:

  1. The split rule with the lower p-value is listed first
  2. In case of a tie,
    1. Order all the tied Chi-Square rules among themselves, with the larger value listed first.
    2. If there are ties in the item above, order by DF, with the smaller DF listed first.
    3. Order all the tied F rules (F or Levene's F) among themselves, with the larger value listed first.
    4. If there are ties in the item above, order by DF1+DF2, with the larger sum listed first. If there is still a tie, order by DF1, with the smaller value listed first. (Note: sorting by DF1+DF2 puts a predictor with fewer missing values before a predictor with more missing values.)
    5. To set the ordering between these two groups (the Chi-Square group and the F group), list first the predictor that was added to the model first
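
The QUEST rules above can be sketched as a two-level sort. This is an illustration under stated assumptions rather than AnswerTree's actual code: the `QuestSplit` record and `quest_rank` function are hypothetical, p-value ties are detected by exact equality, and rule 5 is read as "the group containing the earliest-added predictor is listed first", which is one possible interpretation of the text.

```python
from itertools import groupby
from typing import NamedTuple


class QuestSplit(NamedTuple):
    """Hypothetical record for one candidate split rule."""
    predictor: str
    p_value: float
    test: str          # "chi2" or "F" ("F" also covers Levene's F)
    statistic: float
    df1: int           # the DF for Chi-Square; DF1 for F
    df2: int           # DF2 for F; unused (0) for Chi-Square
    order_added: int   # position at which the predictor was added to the model


def _within_group_key(s):
    if s.test == "chi2":
        # Larger Chi-Square first, then smaller DF.
        return (-s.statistic, s.df1)
    # Larger F first, then larger DF1+DF2, then smaller DF1.
    return (-s.statistic, -(s.df1 + s.df2), s.df1)


def quest_rank(splits):
    ranked = []
    by_p = sorted(splits, key=lambda s: s.p_value)  # rule 1: lower p-value first
    for _, tied in groupby(by_p, key=lambda s: s.p_value):
        tied = list(tied)
        groups = []
        for test in ("chi2", "F"):
            members = sorted(
                (s for s in tied if s.test == test), key=_within_group_key
            )
            if members:
                groups.append(members)
        # Assumed reading of rule 5: the group holding the earliest-added
        # predictor is listed first.
        groups.sort(key=lambda g: min(s.order_added for s in g))
        for g in groups:
            ranked.extend(g)
    return ranked


# Illustrative values only; they do not come from any real data set.
quest_candidates = [
    QuestSplit("a", 0.01, "F", 5.0, 2, 100, 0),
    QuestSplit("b", 0.01, "F", 5.0, 2, 90, 1),
    QuestSplit("c", 0.01, "chi2", 8.0, 3, 0, 2),
    QuestSplit("d", 0.01, "chi2", 8.0, 2, 0, 3),
    QuestSplit("e", 0.001, "chi2", 1.0, 1, 0, 4),
]
quest_order = [s.predictor for s in quest_rank(quest_candidates)]
print(quest_order)
```

In this example `e` comes first on its lower p-value. Among the remaining four tied rules, the F group precedes the Chi-Square group because it contains the earliest-added predictor, `a` precedes `b` on the larger DF1+DF2, and `d` precedes `c` on the smaller DF.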

C&RT has not changed. The rules are as follows:

  1. The split rule with the higher improvement is listed first
  2. In case of a tie, the rule whose associated predictor variable was added to the model first is listed first
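
The two C&RT rules above can be sketched in the same way. The `CrtSplit` record and `crt_rank` function are hypothetical names chosen for illustration, and detecting ties by exact equality of the improvement values is an assumption.

```python
from typing import NamedTuple


class CrtSplit(NamedTuple):
    """Hypothetical record for one candidate split rule."""
    predictor: str
    improvement: float
    order_added: int   # position at which the predictor was added to the model


def crt_rank(splits):
    """Higher improvement first; ties go to the predictor added earlier."""
    return sorted(splits, key=lambda s: (-s.improvement, s.order_added))


# Illustrative values only; they do not come from any real data set.
crt_candidates = [
    CrtSplit("income", 0.12, 1),
    CrtSplit("age", 0.12, 0),
    CrtSplit("region", 0.20, 2),
]
crt_order = [s.predictor for s in crt_rank(crt_candidates)]
print(crt_order)
```

Because `age` and `income` have the same improvement, the order in which they were added to the model decides between them, which is why changing that order can change the resulting tree.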
