
Lund University Publications


Learning characteristic decision trees

Davidsson, Paul (1995)
Abstract
Decision trees constructed by ID3-like algorithms suffer from an inability to detect instances of categories not present in the set of training examples, i.e., they are discriminative representations. Instead, such instances are assigned to one of the classes actually present in the training set, resulting in undesired misclassifications. Two methods of reducing this problem by learning characteristic representations are presented. The central idea behind both methods is to augment each leaf of the decision tree with a subtree containing additional information about each feature's values in that leaf. This is done by computing two limits (lower and upper) for every feature from the training instances belonging to the leaf. A subtree that tests every feature is then constructed from these limits; if the value is below the lower limit or above the upper limit for some feature, the instance is rejected, i.e., regarded as belonging to a novel class. This subtree is then appended to the leaf. The first method corresponds to creating a maximum specific description, whereas the second is a novel method that makes use of the information about the statistical distribution of the feature values that can be extracted from the training examples. An important property of the novel method is that the degree of generalization can be controlled. The methods are evaluated empirically in two different domains, the Iris classification problem and a novel coin classification problem. It is concluded that the dynamic properties of the second method make it preferable in most applications. Finally, we argue that this method is very general in that it can, in principle, be applied to any empirical learning algorithm.
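To make the leaf-augmentation idea concrete, the following minimal Python sketch illustrates it under stated assumptions: it is not the paper's implementation, and the use of scikit-learn, the function names, and the mean ± k·σ form of the statistical limits are assumptions. Rather than literally appending subtrees to the leaves, the sketch stores the per-leaf, per-feature limits in a dictionary and applies the same out-of-range tests at prediction time, rejecting an instance as novel when any feature value falls outside its leaf's limits. The "maximum_specific" option corresponds to the first method (observed min/max), and the "statistical" option to the second, where k controls the degree of generalization.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

NOVEL = -1  # label returned for rejected instances (assumed convention, not from the paper)


def fit_leaf_limits(tree, X, k=2.0, method="statistical"):
    """Compute per-leaf, per-feature (lower, upper) limits from the training
    instances that reach each leaf.

    method="maximum_specific": observed min/max per feature (first method).
    method="statistical":      mean +/- k * std per feature, so k controls the
                               degree of generalization (second method).
    """
    leaf_ids = tree.apply(X)  # leaf index reached by each training instance
    limits = {}
    for leaf in np.unique(leaf_ids):
        X_leaf = X[leaf_ids == leaf]
        if method == "maximum_specific":
            lower, upper = X_leaf.min(axis=0), X_leaf.max(axis=0)
        else:
            mu, sigma = X_leaf.mean(axis=0), X_leaf.std(axis=0)
            lower, upper = mu - k * sigma, mu + k * sigma
        limits[leaf] = (lower, upper)
    return limits


def predict_with_rejection(tree, limits, X):
    """Classify as usual, but return NOVEL for an instance whose feature
    values fall outside the limits stored for the leaf it reaches."""
    leaf_ids = tree.apply(X)
    preds = tree.predict(X)
    out = []
    for leaf, x, p in zip(leaf_ids, X, preds):
        lower, upper = limits[leaf]
        out.append(NOVEL if np.any(x < lower) or np.any(x > upper) else p)
    return np.array(out)


# Usage on the Iris data (one of the two evaluation domains mentioned above):
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(random_state=0).fit(X, y)
lims = fit_leaf_limits(clf, X, k=2.0)
print(predict_with_rejection(clf, lims, X[:5]))         # in-range values -> ordinary class labels
print(predict_with_rejection(clf, lims, X[:5] + 10.0))  # far out of range -> NOVEL (-1)

A smaller k tightens the limits (less generalization, more rejections), while a larger k widens them; the min/max variant has no such knob, which is the dynamic property the abstract attributes to the second method.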
Please use this url to cite or link to this publication:
author: Davidsson, Paul
publishing date: 1995
type: Chapter in Book/Report/Conference proceeding
publication status: published
subject:
keywords: decision trees, algorithms
host publication: AI, 1995: Proceedings of the Eighth Australian Joint Conference on Artificial Intelligence, Canberra, Australia, 13-17 November 1995
language: English
LU publication?: no
id: 06f4965b-06c6-4644-80b1-6130e20dd56a (old id 526555)
alternative location: http://fileadmin.cs.lth.se/ai/psfiles/AI-95.pdf
date added to LUP: 2016-04-04 13:28:57
date last changed: 2018-11-21 21:14:17
@inproceedings{06f4965b-06c6-4644-80b1-6130e20dd56a,
  abstract     = {{Decision trees constructed by ID3-like algorithms suffer from an inability to detect instances of categories not present in the set of training examples, i.e., they are discriminative representations. Instead, such instances are assigned to one of the classes actually present in the training set, resulting in undesired misclassifications. Two methods of reducing this problem by learning characteristic representations are presented. The central idea behind both methods is to augment each leaf of the decision tree with a subtree containing additional information about each feature's values in that leaf. This is done by computing two limits (lower and upper) for every feature from the training instances belonging to the leaf. A subtree that tests every feature is then constructed from these limits; if the value is below the lower limit or above the upper limit for some feature, the instance is rejected, i.e., regarded as belonging to a novel class. This subtree is then appended to the leaf. The first method corresponds to creating a maximum specific description, whereas the second is a novel method that makes use of the information about the statistical distribution of the feature values that can be extracted from the training examples. An important property of the novel method is that the degree of generalization can be controlled. The methods are evaluated empirically in two different domains, the Iris classification problem and a novel coin classification problem. It is concluded that the dynamic properties of the second method make it preferable in most applications. Finally, we argue that this method is very general in that it can, in principle, be applied to any empirical learning algorithm.}},
  author       = {{Davidsson, Paul}},
  booktitle    = {{AI, 1995 : Proceedings of the Eighth Australian Joint Conference on Artificial Intelligence, Canberra, Australia 13 - 17 November 1995}},
  keywords     = {{decision trees; algorithms}},
  language     = {{eng}},
  title        = {{Learning characteristic decision trees}},
  url          = {{http://fileadmin.cs.lth.se/ai/psfiles/AI-95.pdf}},
  year         = {{1995}},
}