Improving Cloud Service Resilience using Brownout-Aware Load-Balancing

Klein, Cristian; Papadopoulos, Alessandro Vittorio; Dellkrantz, Manfred; Dürango, Jonas; Maggio, Martina; Årzén, Karl-Erik; Hernàndez-Rodriguez, Francisco; Elmroth, Erik

(2014) 33rd IEEE International Symposium on Reliable Distributed Systems, p. 31-40

Abstract: We focus on improving resilience of cloud services (e.g., e-commerce website), when correlated or cascading failures lead to computing capacity shortage. We study how to extend the classical cloud service architecture composed of a load-balancer and replicas with a recently proposed self-adaptive paradigm called brownout. Such services are able to reduce their capacity requirements by degrading user experience (e.g., disabling recommendations).

Combining resilience with the brownout paradigm is to date an open practical problem. The issue is to ensure that replica self-adaptivity would not confuse the load-balancing algorithm, overloading replicas that are already struggling with capacity shortage. For example, load-balancing strategies based on response times are not able to decide which replicas should be selected, since the response times are already controlled by the brownout paradigm.

In this paper we propose two novel brownout-aware load-balancing algorithms. To test their practical applicability, we extended the popular lighttpd web server and load-balancer, thus obtaining a production-ready implementation. Experimental evaluation shows that the approach enables cloud services to remain responsive despite cascading failures. Moreover, when compared to Shortest Queue First (SQF), believed to be near-optimal in the non-adaptive case, our algorithms improve user experience by 5%, with high statistical significance, while preserving response time predictability.
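For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of Shortest Queue First (SQF): each new request goes to the replica with the fewest outstanding requests. It is an illustration only, not the paper's lighttpd implementation and not one of its brownout-aware algorithms; the names `Replica`, `pick_sqf`, `dispatch`, and `complete` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical replica record kept by the load-balancer."""
    name: str
    queue_length: int = 0  # requests forwarded but not yet answered


def pick_sqf(replicas):
    """Return the replica with the fewest outstanding requests.

    On ties, min() returns the first such replica in list order.
    """
    return min(replicas, key=lambda r: r.queue_length)


def dispatch(replicas):
    """Choose a replica with SQF and record the new outstanding request."""
    chosen = pick_sqf(replicas)
    chosen.queue_length += 1
    return chosen


def complete(replica):
    """Record that a replica has answered one request."""
    replica.queue_length -= 1


replicas = [Replica("a", 2), Replica("b", 0), Replica("c", 1)]
first = dispatch(replicas)   # "b" has the shortest queue
second = dispatch(replicas)  # "b" and "c" tie at 1; list order picks "b"
third = dispatch(replicas)   # "c" is now the shortest
```

SQF uses only queue lengths as its signal, which is why the abstract treats it as the reference point for the non-adaptive case.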

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/4698578

author: Klein, Cristian; Papadopoulos, Alessandro Vittorio (LU); Dellkrantz, Manfred (LU); Dürango, Jonas (LU); Maggio, Martina (LU); Årzén, Karl-Erik (LU); Hernàndez-Rodriguez, Francisco and Elmroth, Erik
publishing date: 2014
type: Chapter in Book/Report/Conference proceeding
publication status: published
subject: Control Engineering
host publication: [Host publication title missing]
pages: 10 pages
publisher: IEEE - Institute of Electrical and Electronics Engineers Inc.
conference name: 33rd IEEE International Symposium on Reliable Distributed Systems
conference location: Nara, Japan
conference dates: 2014-10-07
external identifiers: scopus:84938932324; wos:000380439400004
DOI: 10.1109/SRDS.2014.14
project: EIT_VR CLOUD Cloud Control; LCCC
language: English
LU publication?: yes
id: 315a6d32-02c1-4a3c-aa62-ebae1a7c2921 (old id 4698578)
date added to LUP: 2016-04-04 10:26:16
date last changed: 2026-04-10 02:18:31

@inproceedings{315a6d32-02c1-4a3c-aa62-ebae1a7c2921,
  abstract     = {{We focus on improving resilience of cloud services (e.g., e-commerce website), when correlated or cascading failures lead to computing capacity shortage. We study how to extend the classical cloud service architecture composed of a load-balancer and replicas with a recently proposed self-adaptive paradigm called brownout. Such services are able to reduce their capacity requirements by degrading user experience (e.g., disabling recommendations).
Combining resilience with the brownout paradigm is to date an open practical problem. The issue is to ensure that replica self-adaptivity would not confuse the load-balancing algorithm, overloading replicas that are already struggling with capacity shortage. For example, load-balancing strategies based on response times are not able to decide which replicas should be selected, since the response times are already controlled by the brownout paradigm.
In this paper we propose two novel brownout-aware load-balancing algorithms. To test their practical applicability, we extended the popular lighttpd web server and load-balancer, thus obtaining a production-ready implementation. Experimental evaluation shows that the approach enables cloud services to remain responsive despite cascading failures. Moreover, when compared to Shortest Queue First (SQF), believed to be near-optimal in the non-adaptive case, our algorithms improve user experience by 5%, with high statistical significance, while preserving response time predictability.}},
  author       = {{Klein, Cristian and Papadopoulos, Alessandro Vittorio and Dellkrantz, Manfred and Dürango, Jonas and Maggio, Martina and Årzén, Karl-Erik and Hernàndez-Rodriguez, Francisco and Elmroth, Erik}},
  booktitle    = {{[Host publication title missing]}},
  language     = {{eng}},
  pages        = {{31--40}},
  publisher    = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  title        = {{Improving Cloud Service Resilience using Brownout-Aware Load-Balancing}},
  url          = {{https://lup.lub.lu.se/search/files/5538820/4698579.pdf}},
  doi          = {{10.1109/SRDS.2014.14}},
  year         = {{2014}},
}
