Entrepreneurship in the Age of Big Data: A Researcher's Guide to Data Mining, Inference and Prediction
(2015)
- Abstract
- The bottleneck of Big Data today is the analysis of large amounts of information, including data mining, inference and prediction. Entrepreneurship researchers who want to take advantage of Big Data, need tools and workflows to find important trends and patterns within massive data sets, and understand "what the data tell us."
Here, we apply gradient boosting to big, register-based data, and establish the intensity of individual-level risk factors to predict entrepreneurship entry. We find structural differences between unincorporated and incorporated entry, and test two separate prediction trees: we correctly predict 20.4% of incorporated entries, using only six risk factors. Data mining techniques, like gradient boosting, offer unique opportunities for entrepreneurship researchers to use objective methods and learn from the data, prior to model inference and prediction.
Please use this URL to cite or link to this publication:
https://lup.lub.lu.se/record/5046559
- author
- Witte, Frederik LU and Johnson, Alan R.
- organization
- publishing date
- 2015
- type
- Working paper/Preprint
- publication status
- unpublished
- subject
- language
- English
- LU publication?
- yes
- id
- d9a502d7-7235-4839-b05d-e22077bdb30f (old id 5046559)
- date added to LUP
- 2016-04-04 13:36:24
- date last changed
- 2018-11-21 21:15:04
@misc{d9a502d7-7235-4839-b05d-e22077bdb30f,
  abstract     = {{The bottleneck of Big Data today is the analysis of large amounts of information, including data mining, inference and prediction. Entrepreneurship researchers who want to take advantage of Big Data, need tools and workflows to find important trends and patterns within massive data sets, and understand "what the data tell us." Here, we apply gradient boosting to big, register-based data, and establish the intensity of individual-level risk factors to predict entrepreneurship entry. We find structural differences between unincorporated and incorporated entry, and test two separate prediction trees: we correctly predict 20.4% of incorporated entries, using only six risk factors. Data mining techniques, like gradient boosting, offer unique opportunities for entrepreneurship researchers to use objective methods and learn from the data, prior to model inference and prediction.}},
  author       = {{Witte, Frederik and Johnson, Alan R.}},
  language     = {{eng}},
  note         = {{Working Paper}},
  title        = {{Entrepreneurship in the Age of Big Data: A Researcher's Guide to Data Mining, Inference and Prediction}},
  year         = {{2015}},
}