Optimising Performance through Unbalanced Decompositions

Jackson, Adrian; Hein, Joachim and Roach, Colin (2015) In IEEE Transactions on Parallel and Distributed Systems 26(10). p.2863-2873
Abstract
When significant communication costs arise in the solution of multidimensional problems on parallel computers, optimal performance cannot always be achieved by perfectly balancing the computational load across cores. Modest sacrifices in the computational load balance may facilitate substantial overall performance improvements by achieving large savings in the costs associated with communications. This general approach is illustrated by application to GS2, an initial value gyrokinetic simulation code developed to study low-frequency turbulence in magnetized plasma. GS2 is parallelised using MPI with the simulation domain decomposed across tasks. The optimal domain decomposition is non-trivial, and is complicated by the fact that several domain decompositions are needed and that these do not all optimise at the chosen task count. Application to GS2, of the novel approach outlined in this paper, has improved performance by up to 17 percent for a representative simulation. Similar strategies may be beneficial in a broader class of problems.
type
Contribution to journal
publication status
published
keywords
Distributed, parallel algorithms, applications, nonlinear programming, linear programming, physics
in
IEEE Transactions on Parallel and Distributed Systems
volume
26
issue
10
pages
2863 - 2873
publisher
IEEE--Institute of Electrical and Electronics Engineers Inc.
external identifiers
  • wos:000362791400019
  • scopus:84961848432
ISSN
1045-9219
DOI
10.1109/TPDS.2014.2351826
language
English
LU publication?
yes
id
10271ce5-0ca8-4111-8303-b2195368c350 (old id 8205973)
date added to LUP
2015-11-26 14:29:53
date last changed
2017-01-01 05:38:43
@article{10271ce5-0ca8-4111-8303-b2195368c350,
  abstract     = {When significant communication costs arise in the solution of multidimensional problems on parallel computers, optimal performance cannot always be achieved by perfectly balancing the computational load across cores. Modest sacrifices in the computational load balance may facilitate substantial overall performance improvements by achieving large savings in the costs associated with communications. This general approach is illustrated by application to GS2, an initial value gyrokinetic simulation code developed to study low-frequency turbulence in magnetized plasma. GS2 is parallelised using MPI with the simulation domain decomposed across tasks. The optimal domain decomposition is non-trivial, and is complicated by the fact that several domain decompositions are needed and that these do not all optimise at the chosen task count. Application to GS2, of the novel approach outlined in this paper, has improved performance by up to 17 percent for a representative simulation. Similar strategies may be beneficial in a broader class of problems.},
  author       = {Jackson, Adrian and Hein, Joachim and Roach, Colin},
  issn         = {1045-9219},
  keywords     = {Distributed,parallel algorithms,applications,nonlinear programming,linear programming,physics},
  language     = {eng},
  number       = {10},
  pages        = {2863--2873},
  publisher    = {IEEE--Institute of Electrical and Electronics Engineers Inc.},
  journal      = {IEEE Transactions on Parallel and Distributed Systems},
  title        = {Optimising Performance through Unbalanced Decompositions},
  url          = {http://dx.doi.org/10.1109/TPDS.2014.2351826},
  volume       = {26},
  year         = {2015},
}