
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Prediction of quote acceptance in a B2B environment using Random Forests and Gradient Boosting Machines

Gunnarsson, Jesper LU and Tyrberg, Jacob LU (2021) In Master's Theses in Mathematical Sciences FMSM01 20211
Mathematical Statistics
Abstract
For a business to be as successful as possible, it needs a sound pricing strategy. A B2B environment allows the business more freedom to tailor each quote to maximize performance. To do this, a proper understanding of how likely a quote is to succeed is crucial. This work employs a statistical approach to predict the probability of acceptance based on historical data. Two model architectures were mainly used to compute the probability of acceptance: Gradient Boosting Machines (GBM) and Random Forests. To improve the models, feature engineering, feature selection, hyperparameter optimization and probability calibration were used. Each step was evaluated to determine its success. Feature engineering, using domain knowledge from sales, significantly improved the results, raising the models’ F1-score by 10 percentage points. The final binary classification results for the two models are similar, both producing an F1-score of approximately 90%. Where the two models differ is in their behaviour when a single explanatory variable, the price of the quote, is altered. GBM produces probabilities that are more aligned with expectations from experts. The results show that direct price optimization is difficult to use, regardless of the model, as the probabilities are not entirely trustworthy. The thesis demonstrates that quote prediction using quantitative methods is possible, but also highlights the many challenges it poses for a company.
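The two evaluation ideas named in the abstract, the F1-score and the behaviour of the predicted probability when only the quote price is altered, can be illustrated with a minimal sketch. It is not the thesis's implementation: `toy_acceptance_probability` is a hypothetical logistic stand-in for a fitted GBM or Random Forest, its parameters are invented, and the check that acceptance should not rise with price is an assumption about what the experts expect, not something the record states.

```python
import math


def f1_score(y_true, y_pred):
    """F1 for the positive class (1 = quote accepted): harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def toy_acceptance_probability(price, reference_price=100.0, sensitivity=0.05):
    """Hypothetical stand-in for a fitted model: a logistic curve that falls as price rises."""
    return 1.0 / (1.0 + math.exp(sensitivity * (price - reference_price)))


def price_sweep(predict, prices):
    """Predicted acceptance probability while the price is the only input that varies."""
    return [(price, predict(price)) for price in prices]


def is_non_increasing(curve):
    """Assumed expert expectation: a higher price never raises the acceptance probability."""
    return all(later[1] <= earlier[1] for earlier, later in zip(curve, curve[1:]))


if __name__ == "__main__":
    # Invented labels, for illustration only.
    print(f1_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
    curve = price_sweep(toy_acceptance_probability, [80, 90, 100, 110, 120])
    print(is_non_increasing(curve))
```

With a real model, `predict` would wrap the classifier's probability output for one quote with every feature except the price held fixed. A sweep that is jagged or rises with price would be one concrete symptom of the untrustworthy probabilities the abstract mentions.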
author
Gunnarsson, Jesper LU and Tyrberg, Jacob LU
supervisor
organization
alternative title
Prediktion av offertacceptans i en B2B miljö med Random Forest och Gradient Boosting Machines
course
FMSM01 20211
year
type
H2 - Master's Degree (Two Years)
subject
keywords
B2B, pricing, Random Forest, Gradient Boosting, calibration, feature engineering
publication/series
Master's Theses in Mathematical Sciences
report number
LUTFMS-3406-2021
ISSN
1404-6342
other publication id
2021:E4
language
English
id
9042046
date added to LUP
2021-04-09 14:15:55
date last changed
2021-06-03 15:27:09
@misc{9042046,
  abstract     = {{For a business to be as successful as possible, it needs a sound pricing strategy. A B2B environment allows the business more freedom to tailor each quote to maximize performance. To do this, a proper understanding of how likely a quote is to succeed is crucial. This work employs a statistical approach to predict the probability of acceptance based on historical data. Two model architectures were mainly used to compute the probability of acceptance: Gradient Boosting Machines (GBM) and Random Forests. To improve the models, feature engineering, feature selection, hyperparameter optimization and probability calibration were used. Each step was evaluated to determine its success. Feature engineering, using domain knowledge from sales, significantly improved the results, raising the models’ F1-score by 10 percentage points. The final binary classification results for the two models are similar, both producing an F1-score of approximately 90%. Where the two models differ is in their behaviour when a single explanatory variable, the price of the quote, is altered. GBM produces probabilities that are more aligned with expectations from experts. The results show that direct price optimization is difficult to use, regardless of the model, as the probabilities are not entirely trustworthy. The thesis demonstrates that quote prediction using quantitative methods is possible, but also highlights the many challenges it poses for a company.}},
  author       = {{Gunnarsson, Jesper and Tyrberg, Jacob}},
  issn         = {{1404-6342}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Master's Theses in Mathematical Sciences}},
  title        = {{Prediction of quote acceptance in a B2B environment using Random Forests and Gradient Boosting Machines}},
  year         = {{2021}},
}