ID3-SD: an algorithm for learning characteristic decision trees by controlling the degree of generalization

Davidsson, Paul (1995) In LU-CS-TR
Abstract
Decision trees constructed by ID3-like algorithms suffer from an inability to detect instances of categories not present in the set of training examples, i.e., they are discriminative representations.

Instead, such instances are assigned to one of the classes actually present in the training set, resulting in undesired misclassifications. In this report, two methods of reducing this problem by learning characteristic representations are presented. The central idea behind both methods is to augment each leaf of the decision tree with a subtree containing additional information concerning each feature's values in that leaf. This is accomplished by computing two limits (lower and upper) for every feature from the training instances belonging to the leaf. A subtree is then constructed from these limits that tests every feature; if the value is below the lower limit or above the upper limit for some feature, the instance will be rejected, i.e., regarded as belonging to a novel class. This subtree is then appended to the leaf. The first method presented corresponds to creating a maximum specific description, whereas the second is a novel method called ID3-SD that makes use of the information about the statistical distribution of the feature values that can be extracted from the training examples.

An important property of the novel method is that the gap between the limits (i.e., the degree of generalization) can be controlled. The two methods are then evaluated empirically in three different domains: the classic Iris database, a wine database, and a novel database from a coin-sorting machine. It is concluded that the dynamical properties of the ID3-SD method make it preferable in many applications. Finally, we argue that this method is in fact general in that, in principle, it can be applied to any empirical learning algorithm, i.e., it is not restricted to decision tree algorithms.
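The leaf-augmentation idea described above can be illustrated with a short sketch. This is a minimal illustration, not the paper's exact formulation: it assumes numeric feature vectors and uses mean ± k standard deviations per feature as the lower/upper limits (the statistical-distribution idea behind ID3-SD, with k as the hypothetical knob controlling the degree of generalization; replacing the limits with the per-feature min/max would give the maximum-specific-description variant). The function and parameter names are invented for this example.

```python
import statistics


def leaf_limits(instances, k=2.0):
    """Compute per-feature (lower, upper) limits from the training
    instances that reached a leaf, using mean +/- k standard deviations.
    Smaller k narrows the gap between the limits (less generalization);
    larger k widens it. Hypothetical helper, not the paper's exact code."""
    limits = []
    for feature_values in zip(*instances):  # iterate over feature columns
        mean = statistics.mean(feature_values)
        sd = statistics.stdev(feature_values) if len(feature_values) > 1 else 0.0
        limits.append((mean - k * sd, mean + k * sd))
    return limits


def classify_at_leaf(instance, limits, leaf_class):
    """The reject test appended as a subtree to the leaf: if any feature
    value falls outside its [lower, upper] interval, the instance is
    rejected, i.e., regarded as belonging to a novel class."""
    for value, (lo, hi) in zip(instance, limits):
        if value < lo or value > hi:
            return "novel"
    return leaf_class
```

An instance close to the leaf's training data is accepted with the leaf's class, while one with an out-of-range feature value is rejected instead of being forced into an existing class.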
author: Davidsson, Paul
publishing date: 1995
type: Book/Report
publication status: published
keywords: decision trees, algorithms
in: LU-CS-TR
publisher: Department of Computer Science, Lund University
language: English
LU publication?: no
id: 38817192-2eef-47fd-ae8a-b085327f9975 (old id 526552)
alternative location: http://fileadmin.cs.lth.se/ai/psfiles/TR-95-145.pdf
date added to LUP: 2007-09-06 12:24:00
date last changed: 2016-06-29 09:17:29
@misc{38817192-2eef-47fd-ae8a-b085327f9975,
  abstract     = {Decision trees constructed by ID3-like algorithms suffer from an inability to detect instances of categories not present in the set of training examples, i.e., they are discriminative representations.<br/><br/>
Instead, such instances are assigned to one of the classes actually present in the training set, resulting in undesired misclassifications. In this report, two methods of reducing this problem by learning characteristic representations are presented. The central idea behind both methods is to augment each leaf of the decision tree with a subtree containing additional information concerning each feature's values in that leaf. This is accomplished by computing two limits (lower and upper) for every feature from the training instances belonging to the leaf. A subtree is then constructed from these limits that tests every feature; if the value is below the lower limit or above the upper limit for some feature, the instance will be rejected, i.e., regarded as belonging to a novel class. This subtree is then appended to the leaf. The first method presented corresponds to creating a maximum specific description, whereas the second is a novel method called ID3-SD that makes use of the information about the statistical distribution of the feature values that can be extracted from the training examples.<br/><br/>
An important property of the novel method is that the gap between the limits (i.e., the degree of generalization) can be controlled. The two methods are then evaluated empirically in three different domains: the classic Iris database, a wine database, and a novel database from a coin-sorting machine. It is concluded that the dynamical properties of the ID3-SD method make it preferable in many applications. Finally, we argue that this method is in fact general in that, in principle, it can be applied to any empirical learning algorithm, i.e., it is not restricted to decision tree algorithms.},
  author       = {Davidsson, Paul},
  keywords     = {decision trees, algorithms},
  language     = {eng},
  publisher    = {Department of Computer Science, Lund University},
  series       = {LU-CS-TR},
  title        = {ID3-SD: an algorithm for learning characteristic decision trees by controlling the degree of generalization},
  year         = {1995},
}