Lund University Publications

Adaptive Routing with Guaranteed Delay Bounds using Safe Reinforcement Learning

Nayak Seetanadi, Gautham; Maggio, Martina and Årzén, Karl-Erik (2020) 28th International Conference on Real-Time Networks and Systems, p. 149-160
Abstract
Time-critical networks require strict delay bounds on the transmission time of packets from source to destination. Routes for transmissions are usually determined statically, using knowledge about worst-case transmission times between nodes. This is generally a conservative method that guarantees transmission times but does not optimize for the typical case. In real networks, the typical delays differ from those considered during static route planning. The challenge in such a scenario is to minimize the total delay from a source to a destination node while adhering to the timing constraints. For known typical and worst-case delays, a previously presented algorithm (statically) determines the policy to be followed during packet transmission in terms of edge choices.

In this paper we relax the assumption of knowing the typical delay and assume that only worst-case bounds are available. We present a reinforcement learning solution that obtains optimal routing paths from a source to a destination when the typical transmission time is stochastic and unknown. Our reinforcement learning policy is based on observing the state space during each packet transmission and on adapting the routing of future packets to congestion and unpredictable circumstances in the network. We ensure that the policy only makes safe routing decisions, thus never violating the pre-determined timing constraints. We conduct experiments to evaluate the routing in a congested network and in a network where the typical delays have a large variance. Finally, we analyze the application of the algorithm to large, randomly generated networks.
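
The second paragraph describes the core mechanism: the learner may only take routing decisions that are safe with respect to the worst-case bounds, while it learns the typical delays online. A minimal sketch of that idea in Python follows; the toy graph, the names worst_case_to_dest, safe_actions and route_packet, and the epsilon-greedy Q-learning update are illustrative assumptions made here for clarity, not the algorithm from the paper.

import random

# Toy topology: wc[(u, v)] is the guaranteed worst-case delay of edge (u, v).
# Graph, names and the Q-learning update below are illustrative assumptions.
edges = {'s': ['a', 'b'], 'a': ['t'], 'b': ['t'], 't': []}
wc = {('s', 'a'): 4, ('s', 'b'): 2, ('a', 't'): 3, ('b', 't'): 6}

def worst_case_to_dest(node, dest):
    # Worst-case delay of the best path from node to dest (the toy graph is
    # acyclic, so plain recursion suffices).
    if node == dest:
        return 0
    return min(wc[(node, n)] + worst_case_to_dest(n, dest) for n in edges[node])

def safe_actions(node, dest, budget):
    # Safety filter: an edge may be taken only if its worst-case delay plus the
    # worst-case delay of the best remaining path still fits in the budget, so
    # the end-to-end deadline holds no matter what the learner explores.
    return [n for n in edges[node]
            if wc[(node, n)] + worst_case_to_dest(n, dest) <= budget]

Q = {e: 0.0 for e in wc}      # learned estimates of typical cost-to-go per edge
ALPHA, EPSILON = 0.3, 0.1     # learning rate and exploration probability

def route_packet(src, dest, deadline, observe_delay):
    # Route one packet hop by hop, learning from the observed stochastic delays.
    node, budget, total = src, deadline, 0.0
    while node != dest:
        candidates = safe_actions(node, dest, budget)  # non-empty if deadline >= worst_case_to_dest(src, dest)
        if random.random() < EPSILON:
            nxt = random.choice(candidates)            # explore, but only among safe edges
        else:
            nxt = min(candidates, key=lambda n: Q[(node, n)])
        d = observe_delay(node, nxt)                   # actual delay, assumed <= wc[(node, nxt)]
        future = min((Q[(nxt, m)] for m in edges[nxt]), default=0.0)
        Q[(node, nxt)] += ALPHA * (d + future - Q[(node, nxt)])
        node, budget, total = nxt, budget - d, total + d
    return total

# Example: typical delays drawn uniformly between half and the full worst case.
random.seed(0)
for _ in range(100):
    route_packet('s', 't', deadline=8.0,
                 observe_delay=lambda u, v: random.uniform(0.5, 1.0) * wc[(u, v)])

Because every hop passes the worst-case check before it is taken, the remaining budget after each hop is still at least the worst-case delay of the best remaining path, so exploration can never cause a deadline violation.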
author
Nayak Seetanadi, Gautham; Maggio, Martina and Årzén, Karl-Erik
organization
publishing date
2020
type
Chapter in Book/Report/Conference proceeding
publication status
published
subject
host publication
RTNS 2020: Proceedings of the 28th International Conference on Real-Time Networks and Systems
pages
149 - 160
conference name
28th International Conference on Real-Time Networks and Systems
conference location
Paris, France
conference dates
2020-06-09 - 2020-06-11
external identifiers
  • scopus:85091040991
ISBN
978-1-4503-7593-1
DOI
10.1145/3394810.3394815
project
ELLIIT LU P02: Co-Design of Robust and Secure Networked Embedded Control Systems
language
English
LU publication?
yes
id
06458b3e-ab7a-456a-ae53-8810cbd58bf4
date added to LUP
2020-03-11 14:53:48
date last changed
2023-04-10 09:41:48
@inproceedings{06458b3e-ab7a-456a-ae53-8810cbd58bf4,
  abstract     = {{Time-critical networks require strict delay bounds on the transmission time of packets from source to destination. Routes for transmissions are usually statically determined, using knowledge about worst-case transmission times between nodes. This is generally a conservative method, that guarantees transmission times but does not provide any optimization for the typical case. In real networks, the typical delays vary from those considered during static route planning. The challenge in such a scenario is to minimize the total delay from a source to a destination node, while adhering to the timing constraints. For known typical and worst-case delays, an algorithm was presented to (statically) determine the policy to be followed during the packet transmission in terms of edge choices. In this paper we relax the assumption of knowing the typical delay, and we assume only worst-case bounds are available. We present a reinforcement learning solution to obtain optimal routing paths from a source to a destination when the typical transmission time is stochastic and unknown. Our reinforcement learning policy is based on the observation of the state-space during each packet transmission and on adaptation for future packets to congestion and unpredictable circumstances in the network. We ensure that our policy only makes safe routing decisions, thus never violating pre-determined timing constraints. We conduct experiments to evaluate the routing in a congested network and in a network where the typical delays have a large variance. Finally, we analyze the application of the algorithm to large randomly generated networks.}},
  author       = {{Nayak Seetanadi, Gautham and Maggio, Martina and Årzén, Karl-Erik}},
  booktitle    = {{RTNS 2020: Proceedings of the 28th International Conference on Real-Time Networks and Systems}},
  isbn         = {{978-1-4503-7593-1}},
  language     = {{eng}},
  pages        = {{149--160}},
  title        = {{Adaptive Routing with Guaranteed Delay Bounds using Safe Reinforcement Learning}},
  url          = {{https://lup.lub.lu.se/search/files/77106880/main.pdf}},
  doi          = {{10.1145/3394810.3394815}},
  year         = {{2020}},
}