Advanced

Investigating Different Register Allocation Techniques for a GPU Compiler

Andersson, Max LU (2016) In LU-CS-EX 2016-23 EDA920 20161
Department of Computer Science
Abstract
Register allocation is one of the most critical parts of an optimizing com-
piler. Although a great effort has been put into researching how to allocate
registers, not much of it has been focused on vector registers. This report
seeks to find out what fundamentally new problems arise when allocating vec-
tor registers rather than scalar registers, how the previously known problems
change in vector registers and what methods can be used to tackle these issues.
Furthermore, an attempt to use a combined scheme of register allocation and
instruction scheduling is made, to see how it performs with vector registers.
This thesis presents the results of an investigation of how two commonly used
register allocation techniques, Linear scan... (More)
Register allocation is one of the most critical parts of an optimizing com-
piler. Although a great effort has been put into researching how to allocate
registers, not much of it has been focused on vector registers. This report
seeks to find out what fundamentally new problems arise when allocating vec-
tor registers rather than scalar registers, how the previously known problems
change in vector registers and what methods can be used to tackle these issues.
Furthermore, an attempt to use a combined scheme of register allocation and
instruction scheduling is made, to see how it performs with vector registers.
This thesis presents the results of an investigation of how two commonly used
register allocation techniques, Linear scan and Graph coloring, perform rela-
tively. Furthermore, it presents generalizations to a commonly used algorithm
used in graph coloring, Chaitin’s algorithm.

Using the internal performance suites of ARM Midgard compiler, our in-
vestigation revealed that linear scan can in fact speed up code generation quite
significantly, up to 12.5% compared to graph coloring. However, this comes
at the cost of reduced code quality. The generalized spill criterion resulted in
an significant reduction in spill code inserted, where 10% less spill code was
inserted using the derived criterion. This however did not equate to 10% re-
duction in execution time, although execution time of the generated code was
overall decreased by 0.5%. The combined scheme reached comparable effi-
ciency compared to graph coloring, however, since it was only used for single
basic block shaders, it is difficult to say how efficient the register allocation
would be for larger shaders. (Less)
Please use this url to cite or link to this publication:
author
Andersson, Max LU
supervisor
organization
course
EDA920 20161
year
type
H3 - Professional qualifications (4 Years - )
subject
keywords
Graph coloring, Register allocation, Vector registers, Compiler optimization
publication/series
LU-CS-EX 2016-23
report number
LU-CS-EX 2016-23
ISSN
1650-2884
language
English
id
8884733
date added to LUP
2016-06-23 10:59:52
date last changed
2016-12-15 14:41:48
@misc{8884733,
  abstract     = {Register allocation is one of the most critical parts of an optimizing com-
piler. Although a great effort has been put into researching how to allocate
registers, not much of it has been focused on vector registers. This report
seeks to find out what fundamentally new problems arise when allocating vec-
tor registers rather than scalar registers, how the previously known problems
change in vector registers and what methods can be used to tackle these issues.
Furthermore, an attempt to use a combined scheme of register allocation and
instruction scheduling is made, to see how it performs with vector registers.
This thesis presents the results of an investigation of how two commonly used
register allocation techniques, Linear scan and Graph coloring, perform rela-
tively. Furthermore, it presents generalizations to a commonly used algorithm
used in graph coloring, Chaitin’s algorithm.

Using the internal performance suites of ARM Midgard compiler, our in-
vestigation revealed that linear scan can in fact speed up code generation quite
significantly, up to 12.5% compared to graph coloring. However, this comes
at the cost of reduced code quality. The generalized spill criterion resulted in
an significant reduction in spill code inserted, where 10% less spill code was
inserted using the derived criterion. This however did not equate to 10% re-
duction in execution time, although execution time of the generated code was
overall decreased by 0.5%. The combined scheme reached comparable effi-
ciency compared to graph coloring, however, since it was only used for single
basic block shaders, it is difficult to say how efficient the register allocation
would be for larger shaders.},
  author       = {Andersson, Max},
  issn         = {1650-2884},
  keyword      = {Graph coloring,Register allocation,Vector registers,Compiler optimization},
  language     = {eng},
  note         = {Student Paper},
  series       = {LU-CS-EX 2016-23},
  title        = {Investigating Different Register Allocation Techniques for a GPU Compiler},
  year         = {2016},
}