Integration of Nordugrid ARC with Galaxy and EGI IM
(2025) 6th Nordic e-Infrastructure Collaboration Conference, NeIC 2024 In Communications in Computer and Information Science 2398 CCIS. p.61-78
- Abstract
In the scientific domain of high energy physics (HEP), the Worldwide LHC Computing Grid (WLCG) was created to handle the huge compute and storage needs of the experiments at the Large Hadron Collider (LHC). Today WLCG combines about 1.4 million computer cores and 1.5 exabytes of storage from over 170 sites in 42 countries. What ties the sites together is the middleware installed at each site, one of these being the Nordugrid ARC (Advanced Resource Connector). ARC has been a great success and has served, and continues to serve, the HEP community very well. Until now, though, we have had limited success in sharing our technology with other communities, despite the fact that many face challenges that ARC solves: managing computation and storage across different infrastructure providers. With a network of ARC-enabled compute sites, a user can submit a job from “anywhere” and have it automatically routed to the best site according to various matchmaking rules. One of the key strengths of ARC is its built-in data handling capabilities. ARC seamlessly downloads any remote input data to the computing site and makes sure all data is in place before the job is passed to the site’s local batch system. Once the job is done, ARC can upload the data to a remote storage site, or it can be retrieved manually. In this paper we describe how we have integrated ARC with the Galaxy Project portal in the context of the EuroScienceGateway project. The Galaxy portal is a user-friendly job-submission and workflow platform that lets a user easily define and submit jobs to an underlying computing cluster; it supports reproducibility and facilitates sharing of jobs and workflows. The Galaxy project has a large user base in the bioinformatics communities, in addition to users from the climate, astrophysics and materials science communities, to mention a few.
With ARC integration in Galaxy, these new communities will be able to seamlessly enjoy the benefits of ARC by using Galaxy to submit jobs to their remote HPC system, instead of having to manually log in to the HPC system and interact with the local batch system via scripting. We also present the ongoing work to make ARC available via the European Grid Infrastructure (EGI) Infrastructure Manager.
- author
- Pedersen, Maiken; Konya, Balazs (LU); Luna-Valero, Sebastian and Grüning, Björn
- organization
- publishing date
- 2025
- type
- Chapter in Book/Report/Conference proceeding
- publication status
- published
- subject
- keywords
- Cloud, Distributed computing, EGI, EuroScienceGateway Project, Galaxy Project, Grid, HPC, Middleware, Nordugrid ARC, Storage and compute
- host publication
- Nordic e-Infrastructure Tomorrow - 6th Nordic e-Infrastructure Collaboration Conference, NeIC 2024, Proceedings
- series title
- Communications in Computer and Information Science
- editor
- Azab, Abdulrahman and Malkiewicz, Tomasz
- volume
- 2398 CCIS
- pages
- 18 pages
- publisher
- Springer Science and Business Media B.V.
- conference name
- 6th Nordic e-Infrastructure Collaboration Conference, NeIC 2024
- conference location
- Tallinn, Estonia
- conference dates
- 2024-05-27 - 2024-05-29
- external identifiers
- scopus:105002585811
- ISSN
- 1865-0929
- 1865-0937
- ISBN
- 9783031862397
- DOI
- 10.1007/978-3-031-86240-3_5
- language
- English
- LU publication?
- yes
- id
- 753af4cf-5f09-4a58-a294-364c6d6a7c45
- date added to LUP
- 2025-08-29 12:25:55
- date last changed
- 2025-08-29 12:50:53
@inproceedings{753af4cf-5f09-4a58-a294-364c6d6a7c45,
  abstract     = {{In the scientific domain of high energy physics (HEP), the Worldwide LHC Computing Grid (WLCG) was created to handle the huge compute and storage needs of the experiments at the Large Hadron Collider (LHC). Today WLCG combines about 1.4 million computer cores and 1.5 exabytes of storage from over 170 sites in 42 countries. What ties the sites together is the middleware installed at each site, one of these being the Nordugrid ARC (Advanced Resource Connector). ARC has been a great success and has served, and continues to serve, the HEP community very well. Until now, though, we have had limited success in sharing our technology with other communities, despite the fact that many face challenges that ARC solves: managing computation and storage across different infrastructure providers. With a network of ARC-enabled compute sites, a user can submit a job from “anywhere” and have it automatically routed to the best site according to various matchmaking rules. One of the key strengths of ARC is its built-in data handling capabilities. ARC seamlessly downloads any remote input data to the computing site and makes sure all data is in place before the job is passed to the site’s local batch system. Once the job is done, ARC can upload the data to a remote storage site, or it can be retrieved manually. In this paper we describe how we have integrated ARC with the Galaxy Project portal in the context of the EuroScienceGateway project. The Galaxy portal is a user-friendly job-submission and workflow platform that lets a user easily define and submit jobs to an underlying computing cluster; it supports reproducibility and facilitates sharing of jobs and workflows. The Galaxy project has a large user base in the bioinformatics communities, in addition to users from the climate, astrophysics and materials science communities, to mention a few. With ARC integration in Galaxy, these new communities will be able to seamlessly enjoy the benefits of ARC by using Galaxy to submit jobs to their remote HPC system, instead of having to manually log in to the HPC system and interact with the local batch system via scripting. We also present the ongoing work to make ARC available via the European Grid Infrastructure (EGI) Infrastructure Manager.}},
  author       = {{Pedersen, Maiken and Konya, Balazs and Luna-Valero, Sebastian and Grüning, Björn}},
  booktitle    = {{Nordic e-Infrastructure Tomorrow - 6th Nordic e-Infrastructure Collaboration Conference, NeIC 2024, Proceedings}},
  editor       = {{Azab, Abdulrahman and Malkiewicz, Tomasz}},
  isbn         = {{9783031862397}},
  issn         = {{1865-0929}},
  keywords     = {{Cloud; Distributed computing; EGI; EuroScienceGateway Project; Galaxy Project; Grid; HPC; Middleware; Nordugrid ARC; Storage and compute}},
  language     = {{eng}},
  pages        = {{61--78}},
  publisher    = {{Springer Science and Business Media B.V.}},
  series       = {{Communications in Computer and Information Science}},
  title        = {{Integration of Nordugrid ARC with Galaxy and EGI IM}},
  url          = {{http://dx.doi.org/10.1007/978-3-031-86240-3_5}},
  doi          = {{10.1007/978-3-031-86240-3_5}},
  volume       = {{2398 CCIS}},
  year         = {{2025}},
}