Reinforcement Learning for Real Time Bidding
(2019) In LU-CS-EX 2019-10, EDAM05 20191, Department of Computer Science
- Abstract
- When an internet user opens a web page containing an advertising slot, how is it determined which ad is shown? Today, the most common software-based approach to trading advertising slots is real time bidding: as soon as the user begins to load the web page, an auction for the slot is held in real time, and the highest bidder gets to display their advertisement of choice. Auction bidding is performed by different demand side platforms (DSPs). Emerse AB, where this master's thesis work was carried out, owns and operates such a DSP. Each bidder (Emerse and competing DSPs) has a limited advertising budget, and strives to spend it in a manner that maximizes the value of the advertisement slots bought. In this thesis, we formalize this problem by modelling the bidding process as a Markov decision process. To find the optimal auction bid, two different solution methods are proposed: value iteration and actor–critic policy gradients. The effectiveness of the value iteration Markov decision process approach (versus other common baseline methods) is demonstrated on real-world auction data.
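The abstract's value iteration approach can be illustrated with a minimal sketch. The thesis's actual state and action spaces, reward model, and win-probability estimates are not given in this record, so everything below is an assumption: a toy finite-horizon MDP where the state is the remaining budget, the action is the bid for the current auction, and the (assumed) probability of winning grows with the bid.

```python
import numpy as np

# Toy budget-constrained bidding MDP (illustrative; the thesis's real model
# is not specified in this record).
# State: remaining budget b in {0, ..., B}. Action: bid a in {0, ..., b}.
# Assumed reward: 1 per impression won. Assumed win probability: a / (a + 2).
B = 10  # total budget units
T = 5   # number of auctions (horizon)

def win_prob(a):
    return a / (a + 2)  # assumed monotone win-probability curve

# Backward value iteration over the horizon:
# V[t][b] = max_a  p(a) * (1 + V[t+1][b-a]) + (1 - p(a)) * V[t+1][b]
V = np.zeros((T + 1, B + 1))
policy = np.zeros((T, B + 1), dtype=int)
for t in range(T - 1, -1, -1):
    for b in range(B + 1):
        best_val, best_a = -1.0, 0
        for a in range(b + 1):
            p = win_prob(a)
            val = p * (1.0 + V[t + 1][b - a]) + (1 - p) * V[t + 1][b]
            if val > best_val:
                best_val, best_a = val, a
        V[t][b] = best_val
        policy[t][b] = best_a

print(V[0][B])       # expected impressions won starting with the full budget
print(policy[0][B])  # optimal first bid under this toy model
```

A real DSP would replace the hand-picked `win_prob` with a win-rate model estimated from auction logs, which is where the real-world data mentioned in the abstract comes in.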
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/8994653
- author
- Smith, Erik
- supervisor
- organization
- course
- EDAM05 20191
- year
- 2019
- type
- H2 - Master's Degree (Two Years)
- subject
- keywords
- Reinforcement learning, Markov decision process, value iteration, policy gradient, real time bidding
- publication/series
- LU-CS-EX 2019-10
- ISSN
- 1650-2884
- language
- English
- id
- 8994653
- date added to LUP
- 2019-09-16 11:54:11
- date last changed
- 2019-09-20 09:08:49
@misc{8994653,
  abstract = {{When an internet user opens a web page containing an advertising slot, how is it determined which ad is shown? Today, the most common software-based approach to trading advertising slots is real time bidding: as soon as the user begins to load the web page, an auction for the slot is held in real time, and the highest bidder gets to display their advertisement of choice. Auction bidding is performed by different demand side platforms (DSPs). Emerse AB, where this master's thesis work was carried out, owns and operates such a DSP. Each bidder (Emerse and competing DSPs) has a limited advertising budget, and strives to spend it in a manner that maximizes the value of the advertisement slots bought. In this thesis, we formalize this problem by modelling the bidding process as a Markov decision process. To find the optimal auction bid, two different solution methods are proposed: value iteration and actor–critic policy gradients. The effectiveness of the value iteration Markov decision process approach (versus other common baseline methods) is demonstrated on real-world auction data.}},
  author   = {{Smith, Erik}},
  issn     = {{1650-2884}},
  language = {{eng}},
  note     = {{Student Paper}},
  series   = {{LU-CS-EX 2019-10}},
  title    = {{Reinforcement Learning for Real Time Bidding}},
  year     = {{2019}},
}