ASaP: Automatic Software Prefetching for Sparse Tensor Computations in MLIR
(2025) In Proceedings of the 2025 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC 2025 Workshops, pp. 1017-1027
- Abstract
Sparse tensor computations suffer from irregular memory access patterns that degrade cache performance. While software prefetching can mitigate this, existing compiler approaches lack the semantic insight needed for effective optimization. We present ASaP, an automatic software prefetching framework integrated within MLIR’s sparse tensor dialect. By leveraging semantic information (tensor formats and loop structure) available during sparsification, ASaP determines accurate buffer bounds and injects prefetches in both innermost and outer loops, achieving broader coverage than prior work. Evaluated on SuiteSparse matrices, ASaP demonstrates significant performance gains for unstructured matrices. For SpMV with innermost-loop prefetching, ASaP achieves 1.38× speedup over Ainsworth & Jones. For SpMM with outer-loop prefetching, ASaP achieves 1.28× speedup while Ainsworth & Jones fails to generate prefetches. Our experiments reveal that disabling inaccurate hardware prefetchers frees critical resources for software prefetching, suggesting future architectures should expose prefetcher control as an optimization interface.
- author
- Sotiropoulos, Konstantinos
- Skeppstedt, Jonas (LU)
- Stenström, Per (LU)
- organization
- publishing date
- 2025-11-15
- type
- Chapter in Book/Report/Conference proceeding
- publication status
- published
- subject
- keywords
- software prefetching, sparse data structures, sparse tensors
- host publication
- Proceedings of 2025 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC 2025 Workshops
- series title
- Proceedings of 2025 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC 2025 Workshops
- pages
- 11 pages
- publisher
- Association for Computing Machinery (ACM)
- conference name
- 2025 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC 2025 Workshops
- conference location
- St. Louis, United States
- conference dates
- 2025-11-16 - 2025-11-21
- external identifiers
- scopus:105023391502
- ISBN
- 9798400718717
- DOI
- 10.1145/3731599.3767477
- language
- English
- LU publication?
- yes
- additional info
- Publisher Copyright: © 2025 Copyright held by the owner/author(s).
- id
- 03e045f9-da7e-4f34-b15a-446db55bbd5d
- date added to LUP
- 2026-01-22 10:37:13
- date last changed
- 2026-01-22 10:37:30
@inproceedings{03e045f9-da7e-4f34-b15a-446db55bbd5d,
abstract = {{<p>Sparse tensor computations suffer from irregular memory access patterns that degrade cache performance. While software prefetching can mitigate this, existing compiler approaches lack the semantic insight needed for effective optimization. We present ASaP, an automatic software prefetching framework integrated within MLIR’s sparse tensor dialect. By leveraging semantic information (tensor formats and loop structure) available during sparsification, ASaP determines accurate buffer bounds and injects prefetches in both innermost and outer loops, achieving broader coverage than prior work. Evaluated on SuiteSparse matrices, ASaP demonstrates significant performance gains for unstructured matrices. For SpMV with innermost-loop prefetching, ASaP achieves 1.38× speedup over Ainsworth & Jones. For SpMM with outer-loop prefetching, ASaP achieves 1.28× speedup while Ainsworth & Jones fails to generate prefetches. Our experiments reveal that disabling inaccurate hardware prefetchers frees critical resources for software prefetching, suggesting future architectures should expose prefetcher control as an optimization interface.</p>}},
author = {{Sotiropoulos, Konstantinos and Skeppstedt, Jonas and Stenström, Per}},
booktitle = {{Proceedings of 2025 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC 2025 Workshops}},
isbn = {{9798400718717}},
keywords = {{software prefetching; sparse data structures; sparse tensors}},
language = {{eng}},
month = {{11}},
pages = {{1017--1027}},
publisher = {{Association for Computing Machinery (ACM)}},
series = {{Proceedings of 2025 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC 2025 Workshops}},
title = {{ASaP: Automatic Software Prefetching for Sparse Tensor Computations in MLIR}},
url = {{http://dx.doi.org/10.1145/3731599.3767477}},
doi = {{10.1145/3731599.3767477}},
year = {{2025}},
}