LUP Student Papers

LUND UNIVERSITY LIBRARIES

Forecasting Football Corner Odds: Statistical Modelling, Betting Strategies and Assessing Market Efficiency

Pålsson, Gustav LU and Laurens, Marcus LU (2023) In Master's Theses in Mathematical Sciences FMSM01 20231
Mathematical Statistics
Abstract
Statistical modelling could be included in a betting strategy where the value of a bet is assessed by comparing model predictions and market odds. This thesis presents several models based on statistical learning methods for predicting the total number of corners in a football match. Generalised linear regression and decision tree models were developed and their profitability was examined by using historical odds data. The models were trained and tested on recent seasons of the English Premier League. To further test the predictive strength, the models were tested on the German Bundesliga. Since the number of corners in a football match is count data but exhibits overdispersion, negative binomial regression was used to numerically model the number of corners. This approach was accompanied by logistic regression as well as numerical-based and classification-based random forest models. The number of corners could be seen as a classification variable with the classes defined as above or below a certain number of corners, often referred to as the betting line on the over-under odds market.
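
As a rough illustration of how a negative binomial prediction could be turned into an over-under probability, the sketch below computes the probability that the corner count exceeds a betting line. The mean-dispersion parameterisation, the function names and the example numbers are assumptions made for illustration and are not taken from the thesis.

```python
import math


def nb_pmf(k: int, mu: float, alpha: float) -> float:
    """Negative binomial pmf with mean mu and variance mu + alpha * mu**2.

    This mean-dispersion form is an assumed parameterisation; the thesis
    may use a different one.
    """
    r = 1.0 / alpha          # size parameter
    p = r / (r + mu)         # success probability
    log_pmf = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(p) + k * math.log(1.0 - p)
    )
    return math.exp(log_pmf)


def prob_over(line: float, mu: float, alpha: float) -> float:
    """Probability that the total number of corners is strictly above the line."""
    max_not_over = math.floor(line)
    return 1.0 - sum(nb_pmf(k, mu, alpha) for k in range(max_not_over + 1))


# Hypothetical prediction: 10.4 expected corners with mild overdispersion.
p_over = prob_over(line=10.5, mu=10.4, alpha=0.02)
fair_over_odds = 1.0 / p_over   # predicted decimal odds for the over outcome
```

The same probability can serve as the classification target described above: the over class corresponds to `prob_over`, the under class to its complement.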

The explanatory variables used to develop the models were match-by-match statistics from the Premier League, processed by creating averages of different lengths and supplemented by variables representing team capabilities and self-created variables representing current form and motivation. Backward stepwise selection and elastic net were used to select variables to include in the generalised linear regression models. The combinations of model approaches and methods resulted in fifteen possible models, which were assessed using statistical evaluation measures. Level stakes and the Kelly criterion were applied as betting strategies on the best-performing models for each method. Furthermore, the over-under betting market for corners was examined in order to identify potential asymmetries in the offered odds implying an inefficient market.
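
The averaging of match-by-match statistics over different lengths might look roughly like the sketch below, which attaches each team's mean corner count over its previous n matches. The field names, window lengths and the choice to leave a feature empty until enough matches exist are assumptions for illustration, not the thesis's actual preprocessing.

```python
from collections import defaultdict, deque


def add_rolling_corner_averages(matches, windows=(3, 5)):
    """Attach each team's average corners over its previous n matches.

    `matches` must be in chronological order. Each match is a dict with the
    hypothetical keys 'home', 'away', 'home_corners' and 'away_corners'.
    """
    history = defaultdict(lambda: deque(maxlen=max(windows)))
    rows = []
    for match in matches:
        row = dict(match)
        for side in ("home", "away"):
            past = list(history[match[side]])
            for n in windows:
                recent = past[-n:]
                # Leave the feature empty until the team has n earlier matches.
                row[f"{side}_avg_corners_{n}"] = (
                    sum(recent) / n if len(recent) == n else None
                )
        rows.append(row)
        # Update the history only after building the features, so that a
        # match never contributes to its own explanatory variables.
        history[match["home"]].append(match["home_corners"])
        history[match["away"]].append(match["away_corners"])
    return rows
```

Updating the history after the features are built keeps the explanatory variables free of information from the match being predicted.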

The results indicated that the best-performing models from each method were all profitable when tested on new data from the Premier League, despite having a low degree of explanatory power. In contrast, explanatory power and profitability decreased significantly when the Premier League-based models were tested on the Bundesliga without retraining, and the majority of the models became unprofitable. The analysis of the over-under market suggested that the under odds offered in the Premier League matches were generally undervalued, while this undervaluation was not statistically significant for the Bundesliga.
Popular Abstract
Predicting Odds and Betting on Football Corners

The large amount of available data originating from sporting events has led to a growing interest in sport-related quantitative analysis in recent years, especially in football. The accessible statistics could be used by bettors to make predictions for outcomes related to certain match events, instead of evaluating a betting offer on subjective grounds.

By using statistical learning methods, prediction models could be developed to calculate the probability of the outcome of football match statistics, which could be inversely expressed as odds. The odds offered on the betting market, reflecting the bookmaker's probability estimate for a certain outcome, could then be compared with the predicted odds. Assuming that the model's predicted probability is correct, there is value in the bet if the market odds are larger than the predicted odds. Smaller predicted odds imply a higher probability for the assessed outcome. Today, betting companies offer a wide range of betting opportunities. One of the most common offers is for the bettor to form an opinion on whether a certain match statistic will be over or under a given number, referred to as the line. Two strategies used in betting are either to place a wager of constant size when there is value in the bet, called level stakes, or to adjust the wager depending on the difference between the predicted odds and the market odds, called the Kelly criterion.
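
The value check and the two staking strategies can be sketched as follows, assuming decimal odds and the standard full Kelly formula. The function names and the example numbers are illustrative assumptions; the thesis may apply a different variant, such as a fractional Kelly stake.

```python
def has_value(p_model: float, market_odds: float) -> bool:
    """A bet has value when the market odds exceed the predicted odds 1 / p."""
    return market_odds > 1.0 / p_model


def level_stake(p_model: float, market_odds: float, unit: float = 1.0) -> float:
    """Level stakes: wager a constant unit whenever the bet has value."""
    return unit if has_value(p_model, market_odds) else 0.0


def kelly_fraction(p_model: float, market_odds: float) -> float:
    """Full Kelly fraction of the bankroll for decimal odds.

    f = (p * o - 1) / (o - 1), floored at zero when the bet has no value.
    """
    net_odds = market_odds - 1.0
    fraction = (p_model * market_odds - 1.0) / net_odds
    return max(fraction, 0.0)


# Hypothetical example: the model gives the over outcome a 55% probability
# (predicted odds of about 1.82) and the market offers 2.00.
stake_level = level_stake(0.55, 2.00)      # one unit, since 2.00 > 1 / 0.55
stake_kelly = kelly_fraction(0.55, 2.00)   # about 10% of the bankroll
```

The Kelly stake grows with the gap between the market odds and the predicted odds, whereas the level stake only reacts to whether a gap exists.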

With publicly available match-by-match statistics over several seasons from the English Premier League used as explanatory variables, this thesis aimed to accurately predict the total number of corners in a football match and evaluate the potential profitability of the developed models by using historical odds from the over-under corner odds market. To account for differences between leagues, the prediction models were tested on recent seasons of the German Bundesliga. The number of corners in football matches could vary a lot depending on which teams are facing each other in the match, as well as the teams' statistics and ability to both attack and defend. Therefore, historical data of the teams together with supplementary variables could form a foundation for explaining how many corners there may be in a particular match.

Previous studies aiming to predict outcomes of football matches, such as match results or the number of goals, indicate that it is difficult to find suitable models with a sufficient degree of explanation. The corner count could be modelled either by predicting the expected number of corners or by forecasting whether the number will be above or below the line. Regression and machine learning methods were utilised for the modelling. The over-under corner odds market was also assessed to find signs of mispricing and inefficiency.

It turned out that the generated models were profitable when backtested on the same league as they were trained on, but the explanatory power of the models was low. When the models were tested on another league, the degree of explanation decreased and the majority of the models became unprofitable.

The statistical test assessing potential market inefficiency concluded that there existed an asymmetry in the over-under corner odds market in the Premier League, while the phenomenon was not statistically significant for the Bundesliga. In both leagues, this asymmetry could be used to obtain profit. The largest profits were made using the Kelly criterion, as well as when the Kelly criterion was combined with taking advantage of the asymmetry in the over-under corner odds market. Based on statistical evaluation measures, the developed models were not clearly better than a naive model betting consistently on the under-odds offer. However, the results suggest the models can improve profitability despite having a low explanatory power when trained and used on the same league.
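
A naive benchmark that always bets on the under offer could be backtested along the lines of the sketch below, which computes the return on investment under level stakes. The data layout and the numbers in the example are made up for illustration and do not come from the thesis.

```python
def level_stakes_roi(bets, stake: float = 1.0) -> float:
    """Return profit divided by total amount staked.

    `bets` is an iterable of (decimal_odds, won) pairs.
    """
    staked = 0.0
    profit = 0.0
    for odds, won in bets:
        staked += stake
        profit += (stake * (odds - 1.0)) if won else -stake
    return profit / staked if staked else 0.0


def naive_under_bets(matches):
    """Bet on the under outcome in every match.

    `matches` is an iterable of (line, under_odds, total_corners) tuples;
    the under bet wins when the total is below the line.
    """
    return [(under_odds, total < line) for line, under_odds, total in matches]


# Illustrative, invented matches: (line, under odds, observed corners).
example_matches = [(10.5, 1.90, 9), (10.5, 2.00, 12), (10.5, 1.80, 10)]
naive_roi = level_stakes_roi(naive_under_bets(example_matches))
```

Running a model's selected bets through `level_stakes_roi` on the same matches gives a like-for-like comparison against this benchmark.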
author: Pålsson, Gustav LU and Laurens, Marcus LU
course: FMSM01 20231
year: 2023
type: H2 - Master's Degree (Two Years)
publication/series: Master's Theses in Mathematical Sciences
report number: LUTFMS-3476-2023
ISSN: 1404-6342
other publication id: 2023:E33
language: English
id: 9127007
date added to LUP: 2023-06-21 10:07:37
date last changed: 2023-07-03 13:40:00
@misc{9127007,
  abstract     = {{Statistical modelling could be included in a betting strategy where the value of a bet is assessed by comparing model predictions and market odds. This thesis presents several models based on statistical learning methods for predicting the total number of corners in a football match. Generalised linear regression and decision tree models were developed and their profitability was examined by using historical odds data. The models were trained and tested on recent seasons of the English Premier League. To further test the predictive strength, the models were tested on the German Bundesliga. Since the number of corners in a football match is count data but exhibits overdispersion, negative binomial regression was used to numerically model the number of corners. This approach was accompanied by logistic regression as well as numerical-based and classification-based random forest models. The number of corners could be seen as a classification variable with the classes defined as above or below a certain number of corners, often referred to as the betting line on the over-under odds market.

The explanatory variables used to develop the models were match-by-match statistics from the Premier League, processed by creating averages of different lengths and supplemented by variables representing team capabilities and self-created variables representing current form and motivation. Backward stepwise selection and elastic net were used to select variables to include in the generalised linear regression models. The combinations of model approaches and methods resulted in fifteen possible models, which were assessed using statistical evaluation measures. Level stakes and the Kelly criterion were applied as betting strategies on the best-performing models for each method. Furthermore, the over-under betting market for corners was examined in order to identify potential asymmetries in the offered odds implying an inefficient market.

The results indicated that the best-performing models from each method were all profitable when tested on new data from the Premier League, despite having a low degree of explanatory power. In contrast, explanatory power and profitability decreased significantly when the Premier League-based models were tested on the Bundesliga without retraining, and the majority of the models became unprofitable. The analysis of the over-under market suggested that the under odds offered in the Premier League matches were generally undervalued, while this undervaluation was not statistically significant for the Bundesliga.}},
  author       = {{Pålsson, Gustav and Laurens, Marcus}},
  issn         = {{1404-6342}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Master's Theses in Mathematical Sciences}},
  title        = {{Forecasting Football Corner Odds: Statistical Modelling, Betting Strategies and Assessing Market Efficiency}},
  year         = {{2023}},
}