Product Classification - A Hierarchical Approach
(2016) In LU-CS-EX 2016-31, EDA920 20161
Department of Computer Science
- Abstract
- The social and environmental impact associated with consuming a product is something that is becoming increasingly important to consumers and businesses alike. This impact can in theory be computed simply by classifying a product and by mapping the classified product to the corresponding life-cycle assessments research. However, this type of mapping requires an extensive product taxonomy with a large plurality of categories, which makes classification through machine learning a non-trivial task.
This thesis describes the implementation of a hierarchical product classifier using the Python library scikit-learn, that makes it possible to automatically classify products based primarily on their brands and titles into a large taxonomy.
Using a data set of 3.1 million products spread over 800 categories, we trained a set of hierarchical classifiers using different learning algorithms. Evaluations of these algorithms showed that the best hierarchical classifier reached a hierarchical F1-score (hF1) of 0.85.
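The abstract reports a hierarchical F1-score but this record does not define it. As an illustration only, the sketch below uses one commonly used definition, in which each true and predicted category is extended with all of its ancestors in the taxonomy before precision and recall are computed over the pooled sets. Whether the thesis uses exactly this definition is an assumption. The toy taxonomy `PARENT` and the function names are hypothetical and do not come from the thesis.

```python
def ancestors_and_self(label, parent):
    """Return the label together with all of its ancestors in the taxonomy."""
    path = set()
    while label is not None:
        path.add(label)
        label = parent.get(label)  # top-level categories map to None
    return path


def hierarchical_f1(y_true, y_pred, parent):
    """Hierarchical F1 over ancestor-augmented label sets (assumed definition)."""
    overlap = pred_total = true_total = 0
    for true_label, pred_label in zip(y_true, y_pred):
        true_set = ancestors_and_self(true_label, parent)
        pred_set = ancestors_and_self(pred_label, parent)
        overlap += len(true_set & pred_set)
        pred_total += len(pred_set)
        true_total += len(true_set)
    if overlap == 0:
        return 0.0
    h_precision = overlap / pred_total
    h_recall = overlap / true_total
    return 2 * h_precision * h_recall / (h_precision + h_recall)


# Hypothetical toy taxonomy, child -> parent (not taken from the thesis).
PARENT = {
    "food": None,
    "dairy": "food",
    "milk": "dairy",
    "cheese": "dairy",
    "electronics": None,
    "phones": "electronics",
}

if __name__ == "__main__":
    # "cheese" predicted for "milk" still shares the ancestors "dairy" and "food".
    score = hierarchical_f1(["milk", "phones"], ["cheese", "phones"], PARENT)
    print(f"hF1 = {score:.2f}")
```

Under this definition, predicting a sibling category earns partial credit. In the toy example, 4 of the 5 augmented labels overlap on both the predicted and the true side, so hierarchical precision and recall are both 4/5. A flat F1-score would count the first prediction as entirely wrong.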
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/8889613
- author
- Karlsson, Mikael (LU) and Karlstedt, Anton (LU)
- supervisor
- organization
- alternative title
- Classifying products to compute their socio-ecological impact
- course
- EDA920 20161
- year
- 2016
- type
- H3 - Professional qualifications (4 Years - )
- subject
- keywords
- product classification, hierarchical classification, text classification, UNSPSC, GTIN, scikit-learn
- publication/series
- LU-CS-EX 2016-31
- report number
- LU-CS-EX 2016-31
- ISSN
- 1650-2884
- language
- English
- id
- 8889613
- date added to LUP
- 2016-08-29 13:30:10
- date last changed
- 2016-08-29 13:30:10
@misc{8889613,
  abstract     = {{The social and environmental impact associated with consuming a product is something that is becoming increasingly important to consumers and businesses alike. This impact can in theory be computed simply by classifying a product and by mapping the classified product to the corresponding life-cycle assessments research. However, this type of mapping requires an extensive product taxonomy with a large plurality of categories, which makes classification through machine learning a non-trivial task. This thesis describes the implementation of a hierarchical product classifier using the Python library scikit-learn, that makes it possible to automatically classify products based primarily on their brands and titles into a large taxonomy. Using a data set of 3.1 million products spread over 800 categories, we trained a set of hierarchical classifiers using different learning algorithms. Evaluations of these algorithms showed that the best hierarchical classifier reached a hierarchical F1-score (hF1) of 0.85.}},
  author       = {{Karlsson, Mikael and Karlstedt, Anton}},
  issn         = {{1650-2884}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{LU-CS-EX 2016-31}},
  title        = {{Product Classification - A Hierarchical Approach}},
  year         = {{2016}},
}