Analys av försäljningsdata från Google Adwords med EM-algoritmen

Tobjörk, David; Bertilsson, Erik

Tobjörk, David and Bertilsson, Erik (2016) STAH11 20152
Department of Statistics

Abstract: The purpose of this paper is to find a model that describes the distribution of Google Adwords order data and an appropriate method for estimating the expected value of an order. We do this on behalf of the marketing agency Precis Digital, which provided data for one of its customers. We first fit several widely used probability distributions by maximum likelihood and investigate how well they describe the empirical data. We assume that the data do not derive from a single probability distribution, so we go on to fit a mixture of normal distributions with the EM algorithm and assess its fit. We compare the single distributions and the mixture distributions using the Bayesian information criterion. The mixture model fits the data quite well, and better than the single distributions. We also evaluate how well mixture models describe data with few observations by applying the EM algorithm to individual campaigns, and conclude that mixture distributions describe campaigns with few observations well. However, the estimated expected value of an order does not differ much from the estimate given by the sample mean. Another interesting result is that subpopulations of customers probably exist, since a mixture model fits the data better. These subpopulations probably represent different customer groups with distinct buying behavior. Further research could investigate them in more depth, which could help optimize campaigns.
Abstract (Swedish, translated): The aim of the thesis is to find a model that describes the distribution of order data from Google Adwords well, and to find a suitable way to estimate the expected value of an order. We do this on behalf of the marketing agency Precis Digital, from which we obtain data for one of its customers. To begin with, we fit a number of probability distributions using the maximum likelihood method and examine how well they describe the empirical data. We see signs that the data do not originate from a single probability distribution. We therefore go on to examine how well a mixture of several normal distributions describes the data. We evaluate the single distributions and the mixture distributions with the Bayesian information criterion. It turns out that mixture distributions describe the data better than a single distribution does. We test how well mixture distributions describe smaller data sets by applying the EM algorithm to individual campaigns, and find that mixture distributions are a good way to describe campaigns with fewer observations. However, the estimated expected value of an order does not differ much from an estimate based on the arithmetic mean. Another interesting result is that subpopulations of customers probably exist, since a mixture distribution describes the data better. It would be interesting to examine these subpopulations more closely in further studies, as they probably represent different customer groups with different buying behaviors. Examining these subpopulations makes it possible to tailor campaigns better.
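
The workflow the abstract describes (fit a mixture of normal distributions with EM, compare models with the Bayesian information criterion, estimate the expected order value) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the quantile-based initialisation, the fixed iteration count, the variance floor, and the simulated order values are all assumptions made for the example.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_log_likelihood(data, weights, means, sigmas):
    """Log-likelihood of the data under a normal mixture."""
    return sum(
        math.log(sum(w * normal_pdf(x, m, s)
                     for w, m, s in zip(weights, means, sigmas)))
        for x in data
    )


def fit_normal_mixture(data, k, n_iter=200):
    """Fit a k-component 1-D normal mixture with EM (hypothetical helper).

    Returns (weights, means, sigmas, log_likelihood).
    """
    n = len(data)
    ordered = sorted(data)
    mean_all = sum(data) / n
    sd_all = math.sqrt(sum((x - mean_all) ** 2 for x in data) / n)
    # Assumed initialisation: equal weights, means at evenly spaced quantiles.
    weights = [1.0 / k] * k
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    sigmas = [sd_all] * k
    sigma_floor = 1e-3 * sd_all  # assumed guard against collapsing components

    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, s)
                    for w, m, s in zip(weights, means, sigmas)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: responsibility-weighted updates of weight, mean and spread.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, data)) / nj
            sigmas[j] = max(math.sqrt(var), sigma_floor)

    log_lik = mixture_log_likelihood(data, weights, means, sigmas)
    return weights, means, sigmas, log_lik


def bic(log_lik, k, n):
    """BIC with 3k - 1 free parameters (k means, k spreads, k - 1 weights)."""
    return (3 * k - 1) * math.log(n) - 2.0 * log_lik


def expected_order_value(weights, means):
    """Mean of the mixture: the weighted sum of the component means."""
    return sum(w * m for w, m in zip(weights, means))


if __name__ == "__main__":
    # Simulated order values, for illustration only (not the thesis data).
    rng = random.Random(42)
    orders = ([rng.gauss(100, 15) for _ in range(300)]
              + [rng.gauss(400, 50) for _ in range(200)])
    for k in (1, 2, 3):
        w, m, s, ll = fit_normal_mixture(orders, k)
        print(k, round(bic(ll, k, len(orders)), 1),
              round(expected_order_value(w, m), 2))
    print("sample mean:", round(sum(orders) / len(orders), 2))
```

A lower BIC indicates the preferred model. In this sketch the mixture mean equals the sample mean after every M-step, because the responsibilities for each observation sum to one. That is a property of this particular update for normal mixtures; whether it accounts for the small difference reported in the abstract depends on modelling details the abstract does not give.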

Please use this url to cite or link to this publication: http://lup.lub.lu.se/student-papers/record/8884972

author

Tobjörk, David and Bertilsson, Erik

supervisor

Peter Gustafsson

organization

Department of Statistics

course

STAH11 20152

year

2016

type

M2 - Bachelor Degree

subject

Mathematics and Statistics

keywords

Adwords, Expectation maximization, Marketing, Subpopulations, Mixture distributions

language

Swedish

id

8884972

date added to LUP

2016-06-27 12:08:34

date last changed

2016-06-27 12:08:34

@misc{8884972,
  abstract     = {{The purpose of this paper is to find a model that describes the distribution of Google Adwords order data and an appropriate method for estimating the expected value of an order. We do this on behalf of the marketing agency Precis Digital, which provided data for one of its customers. We first fit several widely used probability distributions by maximum likelihood and investigate how well they describe the empirical data. We assume that the data do not derive from a single probability distribution, so we go on to fit a mixture of normal distributions with the EM algorithm and assess its fit. We compare the single distributions and the mixture distributions using the Bayesian information criterion. The mixture model fits the data quite well, and better than the single distributions. We also evaluate how well mixture models describe data with few observations by applying the EM algorithm to individual campaigns, and conclude that mixture distributions describe campaigns with few observations well. However, the estimated expected value of an order does not differ much from the estimate given by the sample mean. Another interesting result is that subpopulations of customers probably exist, since a mixture model fits the data better. These subpopulations probably represent different customer groups with distinct buying behavior. Further research could investigate them in more depth, which could help optimize campaigns.}},
  author       = {{Tobjörk, David and Bertilsson, Erik}},
  language     = {{swe}},
  note         = {{Student Paper}},
  title        = {{Analys av försäljningsdata från Google Adwords med EM-algoritmen}},
  year         = {{2016}},
}

LUP Student Papers

LUND UNIVERSITY LIBRARIES
