LUP Student Papers

LUND UNIVERSITY LIBRARIES

Product Classification - A Hierarchical Approach

Karlsson, Mikael LU and Karlstedt, Anton LU (2016) In LU-CS-EX 2016-31 EDA920 20161
Department of Computer Science
Abstract
Consumers and businesses alike increasingly care about the social and environmental impact of consuming a product. In theory, this impact can be computed by classifying a product and mapping the classified product to the corresponding life-cycle assessment research. However, this type of mapping requires an extensive product taxonomy with a large number of categories, which makes classification through machine learning a non-trivial task.
This thesis describes the implementation of a hierarchical product classifier, built with the Python library scikit-learn, that automatically classifies products into a large taxonomy based primarily on their brands and titles.
Using a data set of 3.1 million products spread over 800 categories, we trained a set of hierarchical classifiers using different learning algorithms. Evaluations of these algorithms showed that the best hierarchical classifier reached a hierarchical F1-score (hF1) of 0.85.
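The abstract does not define hF1, so the following is only a minimal sketch under a stated assumption: that hF1 follows the commonly used ancestor-set definition, in which each true and predicted category is expanded to include all of its ancestors in the taxonomy, and precision and recall are computed over the overlap of those sets. The thesis may use a variant. The toy taxonomy `PARENT` and the function names are hypothetical and chosen for illustration only.

```python
def ancestors(label, parent):
    """Return the label together with all of its ancestors.

    `parent` maps each category to its parent; top-level categories map to None.
    """
    path = set()
    while label is not None:
        path.add(label)
        label = parent.get(label)
    return path


def hierarchical_f1(y_true, y_pred, parent):
    """Hierarchical F1 under the assumed ancestor-set definition."""
    overlap = pred_total = true_total = 0
    for t, p in zip(y_true, y_pred):
        a_true = ancestors(t, parent)
        a_pred = ancestors(p, parent)
        overlap += len(a_true & a_pred)   # shared nodes on both paths
        pred_total += len(a_pred)         # denominator of hierarchical precision
        true_total += len(a_true)         # denominator of hierarchical recall
    if overlap == 0:
        return 0.0
    h_precision = overlap / pred_total
    h_recall = overlap / true_total
    return 2 * h_precision * h_recall / (h_precision + h_recall)


# Hypothetical toy taxonomy (child -> parent), not taken from the thesis.
PARENT = {
    "food": None,
    "dairy": "food",
    "milk": "dairy",
    "cheese": "dairy",
    "electronics": None,
    "phones": "electronics",
}

# "cheese" predicted for a "milk" product still shares "dairy" and "food",
# so it is penalized less than a prediction in an unrelated branch.
score = hierarchical_f1(["milk", "phones"], ["cheese", "phones"], PARENT)
print(round(score, 3))
```

In this toy example the near miss (cheese instead of milk) gets partial credit, giving a score of approximately 0.8, whereas a flat F1 would count it as a plain error. This is the motivation for using a hierarchical measure with a large taxonomy.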
author
Karlsson, Mikael LU and Karlstedt, Anton LU
supervisor
organization
alternative title
Classifying products to compute their socio-ecological impact
course
EDA920 20161
year
type
H3 - Professional qualifications (4 Years - )
subject
keywords
product classification, hierarchical classification, text classification, UNSPSC, GTIN, scikit-learn
publication/series
LU-CS-EX 2016-31
report number
LU-CS-EX 2016-31
ISSN
1650-2884
language
English
id
8889613
date added to LUP
2016-08-29 13:30:10
date last changed
2016-08-29 13:30:10
@misc{8889613,
  abstract     = {{The social and environmental impact associated with consuming a product is something that is becoming increasingly important to consumers and businesses alike. This impact can in theory be computed simply by classifying a product and by mapping the classified product to the corresponding life-cycle assessments research. However, this type of mapping requires an extensive product taxonomy with a large plurality of categories, which makes classification through machine learning a non-trivial task.
This thesis describes the implementation of a hierarchical product classifier using the Python library scikit-learn, that makes it possible to automatically classify products based primarily on their brands and titles into a large taxonomy.
Using a data set of 3.1 million products spread over 800 categories, we trained a set of hierarchical classifiers using different learning algorithms. Evaluations of these algorithms showed that the best hierarchical classifier reached a hierarchical F1-score (hF1) of 0.85.}},
  author       = {{Karlsson, Mikael and Karlstedt, Anton}},
  issn         = {{1650-2884}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{LU-CS-EX 2016-31}},
  title        = {{Product Classification - A Hierarchical Approach}},
  year         = {{2016}},
}