Lund University Publications

Integration of Nordugrid ARC with Galaxy and EGI IM

Pedersen, Maiken; Konya, Balazs (LU); Luna-Valero, Sebastian and Grüning, Björn (2025). 6th Nordic e-Infrastructure Collaboration Conference, NeIC 2024. In Communications in Computer and Information Science 2398 CCIS, p. 61-78
Abstract

In high energy physics (HEP), the Worldwide LHC Computing Grid (WLCG) was created to handle the huge compute and storage needs of the experiments at the Large Hadron Collider (LHC). Today WLCG combines about 1.4 million computer cores and 1.5 exabytes of storage from over 170 sites in 42 countries. What ties the sites together is the middleware installed at each site, one of these being the Nordugrid ARC (Advanced Resource Connector). ARC has been a great success and has served, and continues to serve, the HEP community very well. Until now, though, we have had limited success in sharing our technology with other communities, even though many face the challenge that ARC solves: managing computation and storage across different infrastructure providers. With a network of ARC-enabled compute sites, a user can submit a job from “anywhere” and have it automatically routed to the best site according to various matchmaking rules. One of the key strengths of ARC is its built-in data handling. ARC seamlessly downloads any remote input data to the computing site and makes sure all data is in place before the job is passed to the site’s local batch system. Once the job is done, ARC can upload the data to a remote storage site, or the data can be retrieved manually. In this paper we describe how we have integrated ARC with the Galaxy Project portal in the context of the EuroScienceGateway project. The Galaxy portal is a user-friendly job-submission and workflow platform that lets a user easily define and submit jobs to an underlying computing cluster; it supports reproducibility and facilitates sharing of jobs and workflows. The Galaxy project has a large user base in the bioinformatics communities, in addition to users from the climate, astrophysics and material science communities, to mention a few. With ARC integrated in Galaxy, these new communities will be able to enjoy the benefits of ARC seamlessly by using Galaxy to submit jobs to their remote HPC system, instead of having to log into the HPC system manually and interact with the local batch system via scripting. We also present the ongoing work to make ARC available via the European Grid Infrastructure (EGI) Infrastructure Manager.

type
Chapter in Book/Report/Conference proceeding
publication status
published
keywords
Cloud, Distributed computing, EGI, EuroScienceGateway Project, Galaxy Project, Grid, HPC, Middleware, Nordugrid ARC, Storage and compute
host publication
Nordic e-Infrastructure Tomorrow - 6th Nordic e-Infrastructure Collaboration Conference, NeIC 2024, Proceedings
series title
Communications in Computer and Information Science
editor
Azab, Abdulrahman and Malkiewicz, Tomasz
volume
2398 CCIS
pages
18 pages
publisher
Springer Science and Business Media B.V.
conference name
6th Nordic e-Infrastructure Collaboration Conference, NeIC 2024
conference location
Tallinn, Estonia
conference dates
2024-05-27 - 2024-05-29
external identifiers
  • scopus:105002585811
ISSN
1865-0929
1865-0937
ISBN
9783031862397
DOI
10.1007/978-3-031-86240-3_5
language
English
LU publication?
yes
id
753af4cf-5f09-4a58-a294-364c6d6a7c45
date added to LUP
2025-08-29 12:25:55
date last changed
2025-08-29 12:50:53
@inproceedings{753af4cf-5f09-4a58-a294-364c6d6a7c45,
  abstract     = {{In the scientific domain of high energy physics (HEP), the Worldwide LHC Computing Grid (WLCG) was created in order to handle the huge compute and storage needs of the experiments at Large Hadron Collider (LHC). Today WLCG combines about 1.4 million computer cores and 1.5 exabytes of storage from over 170 sites in 42 countries. What ties the sites together is the middleware installed at each site, one of these being the Nordugrid ARC (Advanced Resource Connector). ARC has been a great success and has served, and continues to serve the HEP community very well. Up until now though, we have had limited success in sharing our technology with other communities, despite the fact that many are faced with challenges that ARC solves: managing computation and storage across different infrastructure providers. With a network of ARC enabled compute sites - a user can submit a job from “anywhere” and automatically be routed to the best site depending on various matchmaking rules. One of the key strengths of ARC is its inbuilt data handling capabilities. ARC seamlessly downloads any remote input data to the computing site and makes sure all data is in place before the job is passed to the site’s local batch system. Once the job is done ARC can upload the data to a remote storage site, or it can be manually retrieved. In this paper we describe how we have integrated ARC with the Galaxy Project portal in the context of the EuroScienceGateway project. The Galaxy portal is a user-friendly job-submission and workflow platform that lets a user easily define and submit jobs to an underlying computing cluster, it allows reproducibility in addition to facilitating sharing of jobs and workflows. The Galaxy project has a large user-base from the bioinformatics communities, in addition to users from the climate, astrophysics and material science communities, to mention a few.
With ARC integration in Galaxy, these new communities will seamlessly be able to enjoy the benefits of ARC by using Galaxy to submit jobs to their remote HPC system, instead of having to manually log into the HPC system and interact with the local batch system via scripting. We also present the ongoing work to make ARC available via the European Grid Infrastructure (EGI) Infrastructure Manager.}},
  author       = {{Pedersen, Maiken and Konya, Balazs and Luna-Valero, Sebastian and Grüning, Björn}},
  booktitle    = {{Nordic e-Infrastructure Tomorrow - 6th Nordic e-Infrastructure Collaboration Conference, NeIC 2024, Proceedings}},
  editor       = {{Azab, Abdulrahman and Malkiewicz, Tomasz}},
  isbn         = {{9783031862397}},
  issn         = {{1865-0929}},
  keywords     = {{Cloud; Distributed computing; EGI; EuroScienceGateway Project; Galaxy Project; Grid; HPC; Middleware; Nordugrid ARC; Storage and compute}},
  language     = {{eng}},
  pages        = {{61--78}},
  publisher    = {{Springer Science and Business Media B.V.}},
  series       = {{Communications in Computer and Information Science}},
  title        = {{Integration of Nordugrid ARC with Galaxy and EGI IM}},
  url          = {{http://dx.doi.org/10.1007/978-3-031-86240-3_5}},
  doi          = {{10.1007/978-3-031-86240-3_5}},
  volume       = {{2398 CCIS}},
  year         = {{2025}},
}