Lund University Publications

LUND UNIVERSITY LIBRARIES

Entrepreneurship in the Age of Big Data: A Researcher's Guide to Data Mining, Inference and Prediction

Witte, Frederik LU and Johnson, Alan R. (2015)
Abstract
The bottleneck of Big Data today is the analysis of large amounts of information, including data mining, inference and prediction. Entrepreneurship researchers who want to take advantage of Big Data need tools and workflows to find important trends and patterns within massive data sets, and understand "what the data tell us."

Here, we apply gradient boosting to big, register-based data, and establish the intensity of individual-level risk factors to predict entrepreneurship entry. We find structural differences between unincorporated and incorporated entry, and test two separate prediction trees: we correctly predict 20.4% of incorporated entries, using only six risk factors. Data mining techniques, like gradient boosting, offer unique opportunities for entrepreneurship researchers to use objective methods and learn from the data, prior to model inference and prediction.
type: Working paper/Preprint
publication status: unpublished
language: English
LU publication: yes
id: d9a502d7-7235-4839-b05d-e22077bdb30f (old id 5046559)
date added to LUP: 2016-04-04 13:36:24
date last changed: 2018-11-21 21:15:04
@misc{d9a502d7-7235-4839-b05d-e22077bdb30f,
  abstract     = {{The bottleneck of Big Data today is the analysis of large amounts of information, including data mining, inference and prediction. Entrepreneurship researchers who want to take advantage of Big Data need tools and workflows to find important trends and patterns within massive data sets, and understand "what the data tell us."
Here, we apply gradient boosting to big, register-based data, and establish the intensity of individual-level risk factors to predict entrepreneurship entry. We find structural differences between unincorporated and incorporated entry, and test two separate prediction trees: we correctly predict 20.4% of incorporated entries, using only six risk factors. Data mining techniques, like gradient boosting, offer unique opportunities for entrepreneurship researchers to use objective methods and learn from the data, prior to model inference and prediction.}},
  author       = {{Witte, Frederik and Johnson, Alan R.}},
  language     = {{eng}},
  note         = {{Working Paper}},
  title        = {{Entrepreneurship in the Age of Big Data: A Researcher's Guide to Data Mining, Inference and Prediction}},
  year         = {{2015}},
}