
Adaptive Routing with Guaranteed Delay Bounds using Safe Reinforcement Learning

Nayak Seetanadi, Gautham; Maggio, Martina and Årzén, Karl-Erik (2020). 28th International Conference on Real-Time Networks and Systems, pp. 149-160
Abstract
Time-critical networks require strict delay bounds on the transmission time of packets from source to destination. Routes for transmissions are usually statically determined, using knowledge about worst-case transmission times between nodes. This is generally a conservative method that guarantees transmission times but does not provide any optimization for the typical case. In real networks, the typical delays vary from those considered during static route planning. The challenge in such a scenario is to minimize the total delay from a source to a destination node, while adhering to the timing constraints. For known typical and worst-case delays, an algorithm was presented to (statically) determine the policy to be followed during the packet transmission in terms of edge choices.

In this paper we relax the assumption of knowing the typical delay, and we assume only worst-case bounds are available. We present a reinforcement learning solution to obtain optimal routing paths from a source to a destination when the typical transmission time is stochastic and unknown. Our reinforcement learning policy is based on the observation of the state-space during each packet transmission and on adaptation for future packets to congestion and unpredictable circumstances in the network. We ensure that our policy only makes safe routing decisions, thus never violating pre-determined timing constraints. We conduct experiments to evaluate the routing in a congested network and in a network where the typical delays have a large variance. Finally, we analyze the application of the algorithm to large randomly generated networks.
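The safety mechanism described in the abstract — exploring routes to learn typical delays while never taking an edge that could, in the worst case, miss the deadline — can be sketched as follows. This is an illustrative sketch only, not the paper's actual algorithm: the graph, the worst-case delay values, and all function names (`worst_case_to_dest`, `safe_neighbors`, `choose`) are assumptions made for the example.

```python
import random

# Hypothetical network: known worst-case edge delays; typical delays
# are stochastic and must be learned online.
WORST_CASE = {
    ("s", "a"): 5, ("s", "b"): 2,
    ("a", "d"): 3, ("b", "d"): 8,
}
EDGES = {"s": ["a", "b"], "a": ["d"], "b": ["d"]}

def worst_case_to_dest(node, dest):
    """Worst-case remaining delay along the cheapest worst-case path."""
    if node == dest:
        return 0
    return min(WORST_CASE[(node, n)] + worst_case_to_dest(n, dest)
               for n in EDGES[node])

def safe_neighbors(node, dest, elapsed, deadline):
    """Edges that can still meet the deadline even in the worst case."""
    return [n for n in EDGES[node]
            if elapsed + WORST_CASE[(node, n)]
               + worst_case_to_dest(n, dest) <= deadline]

Q = {}  # learned estimates of typical edge delay, keyed by (node, neighbor)

def choose(node, dest, elapsed, deadline, eps=0.1):
    """Epsilon-greedy edge choice, restricted to the safe action set."""
    safe = safe_neighbors(node, dest, elapsed, deadline)
    if random.random() < eps:
        return random.choice(safe)  # explore, but only among safe edges
    return min(safe, key=lambda n: Q.get((node, n), 0.0))  # exploit
```

The key point is that exploration is confined to the safe set: the learner may pick a slower-than-expected edge while estimating typical delays, but it can never commit to an edge whose worst-case continuation violates the deadline.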
author
Nayak Seetanadi, Gautham; Maggio, Martina and Årzén, Karl-Erik
organization
publishing date
2020
type
Chapter in Book/Report/Conference proceeding
publication status
published
subject
host publication
RTNS 2020: Proceedings of the 28th International Conference on Real-Time Networks and Systems
pages
149 - 160
conference name
28th International Conference on Real-Time Networks and Systems
conference location
Paris, France
conference dates
2020-06-09 - 2020-06-11
external identifiers
  • scopus:85091040991
ISBN
978-1-4503-7593-1
DOI
10.1145/3394810.3394815
project
ELLIIT LU P02: Co-Design of Robust and Secure Networked Embedded Control Systems
language
English
LU publication?
yes
id
06458b3e-ab7a-456a-ae53-8810cbd58bf4
date added to LUP
2020-03-11 14:53:48
date last changed
2021-01-06 07:06:57
@inproceedings{06458b3e-ab7a-456a-ae53-8810cbd58bf4,
  abstract     = {Time-critical networks require strict delay bounds on the transmission time of packets from source to destination. Routes for transmissions are usually statically determined, using knowledge about worst-case transmission times between nodes. This is generally a conservative method that guarantees transmission times but does not provide any optimization for the typical case. In real networks, the typical delays vary from those considered during static route planning. The challenge in such a scenario is to minimize the total delay from a source to a destination node, while adhering to the timing constraints. For known typical and worst-case delays, an algorithm was presented to (statically) determine the policy to be followed during the packet transmission in terms of edge choices.<br/><br/>In this paper we relax the assumption of knowing the typical delay, and we assume only worst-case bounds are available. We present a reinforcement learning solution to obtain optimal routing paths from a source to a destination when the typical transmission time is stochastic and unknown. Our reinforcement learning policy is based on the observation of the state-space during each packet transmission and on adaptation for future packets to congestion and unpredictable circumstances in the network. We ensure that our policy only makes safe routing decisions, thus never violating pre-determined timing constraints. We conduct experiments to evaluate the routing in a congested network and in a network where the typical delays have a large variance. Finally, we analyze the application of the algorithm to large randomly generated networks.},
  author       = {Nayak Seetanadi, Gautham and Maggio, Martina and Årzén, Karl-Erik},
  booktitle    = {RTNS 2020: Proceedings of the 28th International Conference on Real-Time Networks and Systems},
  isbn         = {978-1-4503-7593-1},
  language     = {eng},
  pages        = {149--160},
  title        = {Adaptive Routing with Guaranteed Delay Bounds using Safe Reinforcement Learning},
  url          = {https://lup.lub.lu.se/search/ws/files/77106880/main.pdf},
  doi          = {10.1145/3394810.3394815},
  year         = {2020},
}