Learning characteristic decision trees
(1995)
- Abstract
- Decision trees constructed by ID3-like algorithms suffer from an inability to detect instances of categories not present in the set of training examples, i.e., they are discriminative representations. Instead, such instances are assigned to one of the classes actually present in the training set, resulting in undesired misclassifications. Two methods of reducing this problem by learning characteristic representations are presented. The central idea behind both methods is to augment each leaf of the decision tree with a subtree containing additional information concerning each feature's values in that leaf. This is done by computing two limits (lower and upper) for every feature from the training instances belonging to the leaf. A subtree is then constructed from these limits that tests every feature; if the value is below the lower limit or above the upper limit for some feature, the instance will be rejected, i.e., regarded as belonging to a novel class. This subtree is then appended to the leaf. The first method presented corresponds to creating a maximum-specific description, whereas the second is a novel method that makes use of information about the statistical distribution of the feature values that can be extracted from the training examples. An important property of the novel method is that the degree of generalization can be controlled. The methods are evaluated empirically in two different domains, the Iris classification problem and a novel coin classification problem. It is concluded that the dynamical properties of the second method make it preferable in most applications. Finally, we argue that this method is very general in that it can, in principle, be applied to any empirical learning algorithm.
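The following is a minimal sketch, not the paper's original code, of the two leaf-augmentation strategies the abstract describes: per-feature acceptance limits computed from the training instances that reach a leaf, with instances falling outside any limit rejected as novel. All names (LeafLimits, fit_minmax, fit_statistical, accepts) and the mean ± k·std estimator for the second method are assumptions for illustration; the paper's exact statistical formulation may differ.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import List, Sequence


@dataclass
class LeafLimits:
    """Per-feature (lower, upper) acceptance limits attached to one leaf."""
    lower: List[float]
    upper: List[float]

    def accepts(self, instance: Sequence[float]) -> bool:
        # Reject (treat as belonging to a novel class) if any feature value
        # falls outside its [lower, upper] interval.
        return all(lo <= x <= hi
                   for x, lo, hi in zip(instance, self.lower, self.upper))


def fit_minmax(leaf_instances: List[Sequence[float]]) -> LeafLimits:
    """Method 1: maximum-specific description -- limits are the observed
    per-feature minimum and maximum among the leaf's training instances."""
    columns = list(zip(*leaf_instances))
    return LeafLimits(lower=[min(c) for c in columns],
                      upper=[max(c) for c in columns])


def fit_statistical(leaf_instances: List[Sequence[float]],
                    k: float = 2.0) -> LeafLimits:
    """Method 2 (hypothetical estimator): limits derived from the feature-value
    distribution, here mean +/- k standard deviations; k controls the degree
    of generalization."""
    columns = list(zip(*leaf_instances))
    lower, upper = [], []
    for c in columns:
        m = mean(c)
        s = stdev(c) if len(c) > 1 else 0.0
        lower.append(m - k * s)
        upper.append(m + k * s)
    return LeafLimits(lower=lower, upper=upper)


# Usage: after building an ordinary decision tree, fit limits from the training
# instances in each leaf and consult them at prediction time.
leaf = [(5.1, 3.5), (4.9, 3.0), (5.0, 3.4)]   # e.g. two Iris features
limits = fit_statistical(leaf, k=2.0)
print(limits.accepts((5.0, 3.2)))             # True  -> keep the leaf's class
print(limits.accepts((9.0, 0.1)))             # False -> reject as novel
```

With fit_minmax the acceptance region never extends beyond the observed training values, whereas fit_statistical widens or narrows it via k, mirroring the abstract's point that the second method's degree of generalization can be controlled.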
Please use this url to cite or link to this publication:
https://lup.lub.lu.se/record/526555
- author
- Davidsson, Paul
- publishing date
- 1995
- type
- Chapter in Book/Report/Conference proceeding
- publication status
- published
- subject
- keywords
- decision trees, algorithms
- host publication
- AI, 1995 : Proceedings of the Eighth Australian Joint Conference on Artificial Intelligence, Canberra, Australia 13 - 17 November 1995
- language
- English
- LU publication?
- no
- id
- 06f4965b-06c6-4644-80b1-6130e20dd56a (old id 526555)
- alternative location
- http://fileadmin.cs.lth.se/ai/psfiles/AI-95.pdf
- date added to LUP
- 2016-04-04 13:28:57
- date last changed
- 2018-11-21 21:14:17
@inproceedings{06f4965b-06c6-4644-80b1-6130e20dd56a,
  abstract  = {{Decision trees constructed by ID3-like algorithms suffer from an inability to detect instances of categories not present in the set of training examples, i.e., they are discriminative representations. Instead, such instances are assigned to one of the classes actually present in the training set, resulting in undesired misclassifications. Two methods of reducing this problem by learning characteristic representations are presented. The central idea behind both methods is to augment each leaf of the decision tree with a subtree containing additional information concerning each feature's values in that leaf. This is done by computing two limits (lower and upper) for every feature from the training instances belonging to the leaf. A subtree is then constructed from these limits that tests every feature; if the value is below the lower limit or above the upper limit for some feature, the instance will be rejected, i.e., regarded as belonging to a novel class. This subtree is then appended to the leaf. The first method presented corresponds to creating a maximum-specific description, whereas the second is a novel method that makes use of information about the statistical distribution of the feature values that can be extracted from the training examples. An important property of the novel method is that the degree of generalization can be controlled. The methods are evaluated empirically in two different domains, the Iris classification problem and a novel coin classification problem. It is concluded that the dynamical properties of the second method make it preferable in most applications. Finally, we argue that this method is very general in that it can, in principle, be applied to any empirical learning algorithm.}},
  author    = {{Davidsson, Paul}},
  booktitle = {{AI, 1995 : Proceedings of the Eighth Australian Joint Conference on Artificial Intelligence, Canberra, Australia 13 - 17 November 1995}},
  keywords  = {{decision trees; algorithms}},
  language  = {{eng}},
  title     = {{Learning characteristic decision trees}},
  url       = {{http://fileadmin.cs.lth.se/ai/psfiles/AI-95.pdf}},
  year      = {{1995}},
}