Parallelization of a multiconfigurational perturbation theory
(2013) In Journal of Computational Chemistry 34(22). p.1937-1948
- Abstract
- In this work, we present a parallel approach to complete and restricted active space second-order perturbation theory, (CASPT2/RASPT2). We also make an assessment of the performance characteristics of its particular implementation in the Molcas quantum chemistry programming package. Parallel scaling is limited by memory and I/O bandwidth instead of available cores. Significant time savings for calculations on large and complex systems can be achieved by increasing the number of processes on a single machine, as long as memory bandwidth allows, or by using multiple nodes with a fast, low-latency interconnect. We found that parallel efficiency drops below 50% when using 8-16 cores on the shared-memory architecture, or 16-32 nodes on the distributed-memory architecture, depending on the calculation. This limits the scalability of the implementation to a moderate amount of processes. Nonetheless, calculations that took more than 3 days on a serial machine could be performed in less than 5 h on an InfiniBand cluster, where the individual nodes were not even capable of running the calculation because of memory and I/O requirements. This ensures the continuing study of larger molecular systems by means of CASPT2/RASPT2 through the use of the aggregated computational resources offered by distributed computing systems. (c) 2013 Wiley Periodicals, Inc.
Please use this URL to cite or link to this publication:
https://lup.lub.lu.se/record/3975459
- author
- Vancoillie, Steven; Delcey, Mickael G.; Lindh, Roland; Vysotskiy, Victor (LU); Malmqvist, Per-Åke (LU); Veryazov, Valera (LU)
- publishing date
- 2013
- type
- Contribution to journal
- publication status
- published
- subject
- keywords
- parallelization, CASPT2, multiconfigurational perturbation theory, high-performance computing
- in
- Journal of Computational Chemistry
- volume
- 34
- issue
- 22
- pages
- 1937 - 1948
- publisher
- John Wiley & Sons Inc.
- external identifiers
- wos:000321437900009
- scopus:84880132401
- pmid:23749386
- ISSN
- 1096-987X
- DOI
- 10.1002/jcc.23342
- language
- English
- LU publication?
- yes
- additional info
- The information about affiliations in this record was updated in December 2015. The record was previously connected to the following departments: Theoretical Chemistry (S) (011001039)
- id
- 86e83b5e-1402-4d95-a6fb-bf45cd627622 (old id 3975459)
- date added to LUP
- 2016-04-01 13:54:04
- date last changed
- 2023-04-06 14:22:04
@article{86e83b5e-1402-4d95-a6fb-bf45cd627622,
  abstract  = {{In this work, we present a parallel approach to complete and restricted active space second-order perturbation theory, (CASPT2/RASPT2). We also make an assessment of the performance characteristics of its particular implementation in the Molcas quantum chemistry programming package. Parallel scaling is limited by memory and I/O bandwidth instead of available cores. Significant time savings for calculations on large and complex systems can be achieved by increasing the number of processes on a single machine, as long as memory bandwidth allows, or by using multiple nodes with a fast, low-latency interconnect. We found that parallel efficiency drops below 50\% when using 8-16 cores on the shared-memory architecture, or 16-32 nodes on the distributed-memory architecture, depending on the calculation. This limits the scalability of the implementation to a moderate amount of processes. Nonetheless, calculations that took more than 3 days on a serial machine could be performed in less than 5 h on an InfiniBand cluster, where the individual nodes were not even capable of running the calculation because of memory and I/O requirements. This ensures the continuing study of larger molecular systems by means of CASPT2/RASPT2 through the use of the aggregated computational resources offered by distributed computing systems. (c) 2013 Wiley Periodicals, Inc.}},
  author    = {Vancoillie, Steven and Delcey, Mickael G. and Lindh, Roland and Vysotskiy, Victor and Malmqvist, Per-Åke and Veryazov, Valera},
  issn      = {{1096-987X}},
  keywords  = {{parallelization; CASPT2; multiconfigurational perturbation theory; high-performance computing}},
  language  = {{eng}},
  number    = {{22}},
  pages     = {{1937--1948}},
  publisher = {{John Wiley \& Sons Inc.}},
  journal   = {{Journal of Computational Chemistry}},
  title     = {{Parallelization of a multiconfigurational perturbation theory}},
  url       = {{http://dx.doi.org/10.1002/jcc.23342}},
  doi       = {{10.1002/jcc.23342}},
  volume    = {{34}},
  year      = {{2013}},
}