Investigating Different Register Allocation Techniques for a GPU Compiler

Andersson, Max

Investigating Different Register Allocation Techniques for a GPU Compiler

Mark

Andersson, Max ^LU (2016) In LU-CS-EX 2016-23 EDA920 20161
Department of Computer Science

Abstract: Register allocation is one of the most critical parts of an optimizing com-
piler. Although a great effort has been put into researching how to allocate
registers, not much of it has been focused on vector registers. This report
seeks to find out what fundamentally new problems arise when allocating vec-
tor registers rather than scalar registers, how the previously known problems
change in vector registers and what methods can be used to tackle these issues.
Furthermore, an attempt to use a combined scheme of register allocation and
instruction scheduling is made, to see how it performs with vector registers.
This thesis presents the results of an investigation of how two commonly used
register allocation techniques, Linear scan... (More); Register allocation is one of the most critical parts of an optimizing com-
piler. Although a great effort has been put into researching how to allocate
registers, not much of it has been focused on vector registers. This report
seeks to find out what fundamentally new problems arise when allocating vec-
tor registers rather than scalar registers, how the previously known problems
change in vector registers and what methods can be used to tackle these issues.
Furthermore, an attempt to use a combined scheme of register allocation and
instruction scheduling is made, to see how it performs with vector registers.
This thesis presents the results of an investigation of how two commonly used
register allocation techniques, Linear scan and Graph coloring, perform rela-
tively. Furthermore, it presents generalizations to a commonly used algorithm
used in graph coloring, Chaitin’s algorithm.

Using the internal performance suites of ARM Midgard compiler, our in-
vestigation revealed that linear scan can in fact speed up code generation quite
significantly, up to 12.5% compared to graph coloring. However, this comes
at the cost of reduced code quality. The generalized spill criterion resulted in
an significant reduction in spill code inserted, where 10% less spill code was
inserted using the derived criterion. This however did not equate to 10% re-
duction in execution time, although execution time of the generated code was
overall decreased by 0.5%. The combined scheme reached comparable effi-
ciency compared to graph coloring, however, since it was only used for single
basic block shaders, it is difficult to say how efficient the register allocation
would be for larger shaders. (Less)

Please use this url to cite or link to this publication: http://lup.lub.lu.se/student-papers/record/8884733

author

Andersson, Max ^LU

supervisor

Jonas Skeppstedt ^LU
Dmitri Ivanov

organization

Department of Computer Science

course

EDA920 20161

year

2016

type

H3 - Professional qualifications (4 Years - )

subject

Technology and Engineering

keywords

Graph coloring, Register allocation, Vector registers, Compiler optimization

publication/series

LU-CS-EX 2016-23

report number

LU-CS-EX 2016-23

ISSN

1650-2884

language

English

id

8884733

date added to LUP

2016-06-23 10:59:52

date last changed

2016-12-15 14:41:48

@misc{8884733,
  abstract     = {{Register allocation is one of the most critical parts of an optimizing com-
piler. Although a great effort has been put into researching how to allocate
registers, not much of it has been focused on vector registers. This report
seeks to find out what fundamentally new problems arise when allocating vec-
tor registers rather than scalar registers, how the previously known problems
change in vector registers and what methods can be used to tackle these issues.
Furthermore, an attempt to use a combined scheme of register allocation and
instruction scheduling is made, to see how it performs with vector registers.
This thesis presents the results of an investigation of how two commonly used
register allocation techniques, Linear scan and Graph coloring, perform rela-
tively. Furthermore, it presents generalizations to a commonly used algorithm
used in graph coloring, Chaitin’s algorithm.

Using the internal performance suites of ARM Midgard compiler, our in-
vestigation revealed that linear scan can in fact speed up code generation quite
significantly, up to 12.5% compared to graph coloring. However, this comes
at the cost of reduced code quality. The generalized spill criterion resulted in
an significant reduction in spill code inserted, where 10% less spill code was
inserted using the derived criterion. This however did not equate to 10% re-
duction in execution time, although execution time of the generated code was
overall decreased by 0.5%. The combined scheme reached comparable effi-
ciency compared to graph coloring, however, since it was only used for single
basic block shaders, it is difficult to say how efficient the register allocation
would be for larger shaders.}},
  author       = {{Andersson, Max}},
  issn         = {{1650-2884}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{LU-CS-EX 2016-23}},
  title        = {{Investigating Different Register Allocation Techniques for a GPU Compiler}},
  year         = {{2016}},
}

LUP Student Papers

LUND UNIVERSITY LIBRARIES

Investigating Different Register Allocation Techniques for a GPU Compiler