Lund University Publications

Parallelization of a multiconfigurational perturbation theory

Vancoillie, Steven; Delcey, Mickael G.; Lindh, Roland; Vysotskiy, Victor (LU); Malmqvist, Per-Åke (LU) and Veryazov, Valera (LU) (2013). In Journal of Computational Chemistry 34(22), p. 1937-1948
Abstract
In this work, we present a parallel approach to complete and restricted active space second-order perturbation theory (CASPT2/RASPT2). We also assess the performance characteristics of its particular implementation in the Molcas quantum chemistry programming package. Parallel scaling is limited by memory and I/O bandwidth instead of available cores. Significant time savings for calculations on large and complex systems can be achieved by increasing the number of processes on a single machine, as long as memory bandwidth allows, or by using multiple nodes with a fast, low-latency interconnect. We found that parallel efficiency drops below 50% when using 8-16 cores on the shared-memory architecture, or 16-32 nodes on the distributed-memory architecture, depending on the calculation. This limits the scalability of the implementation to a moderate number of processes. Nonetheless, calculations that took more than 3 days on a serial machine could be performed in less than 5 h on an InfiniBand cluster, where the individual nodes were not even capable of running the calculation because of memory and I/O requirements. This ensures the continuing study of larger molecular systems by means of CASPT2/RASPT2 through the use of the aggregated computational resources offered by distributed computing systems. (c) 2013 Wiley Periodicals, Inc.
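author
Vancoillie, Steven; Delcey, Mickael G.; Lindh, Roland; Vysotskiy, Victor; Malmqvist, Per-Åke and Veryazov, Valera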
publishing date
2013
type
Contribution to journal
publication status
published
keywords
parallelization, CASPT2, multiconfigurational perturbation theory, high performance computing
in
Journal of Computational Chemistry
volume
34
issue
22
pages
1937 - 1948
publisher
John Wiley & Sons Inc.
external identifiers
  • wos:000321437900009
  • scopus:84880132401
  • pmid:23749386
ISSN
1096-987X
DOI
10.1002/jcc.23342
language
English
LU publication?
yes
additional info
The information about affiliations in this record was updated in December 2015. The record was previously connected to the following departments: Theoretical Chemistry (S) (011001039)
id
86e83b5e-1402-4d95-a6fb-bf45cd627622 (old id 3975459)
date added to LUP
2016-04-01 13:54:04
date last changed
2023-04-06 14:22:04
@article{86e83b5e-1402-4d95-a6fb-bf45cd627622,
  abstract     = {{In this work, we present a parallel approach to complete and restricted active space second-order perturbation theory, (CASPT2/RASPT2). We also make an assessment of the performance characteristics of its particular implementation in the Molcas quantum chemistry programming package. Parallel scaling is limited by memory and I/O bandwidth instead of available cores. Significant time savings for calculations on large and complex systems can be achieved by increasing the number of processes on a single machine, as long as memory bandwidth allows, or by using multiple nodes with a fast, low-latency interconnect. We found that parallel efficiency drops below 50% when using 8-16 cores on the shared-memory architecture, or 16-32 nodes on the distributed-memory architecture, depending on the calculation. This limits the scalability of the implementation to a moderate amount of processes. Nonetheless, calculations that took more than 3 days on a serial machine could be performed in less than 5 h on an InfiniBand cluster, where the individual nodes were not even capable of running the calculation because of memory and I/O requirements. This ensures the continuing study of larger molecular systems by means of CASPT2/RASPT2 through the use of the aggregated computational resources offered by distributed computing systems. (c) 2013 Wiley Periodicals, Inc.}},
  author       = {{Vancoillie, Steven and Delcey, Mickael G. and Lindh, Roland and Vysotskiy, Victor and Malmqvist, Per-Åke and Veryazov, Valera}},
  issn         = {{1096-987X}},
  keywords     = {{parallelization; CASPT2; multiconfigurational perturbation theory; high performance computing}},
  language     = {{eng}},
  number       = {{22}},
  pages        = {{1937--1948}},
  publisher    = {{John Wiley \& Sons Inc.}},
  journal      = {{Journal of Computational Chemistry}},
  title        = {{Parallelization of a multiconfigurational perturbation theory}},
  url          = {{http://dx.doi.org/10.1002/jcc.23342}},
  doi          = {{10.1002/jcc.23342}},
  volume       = {{34}},
  year         = {{2013}},
}