Lund University Publications


ASaP: Automatic Software Prefetching for Sparse Tensor Computations in MLIR

Sotiropoulos, Konstantinos; Skeppstedt, Jonas and Stenström, Per (2025) 2025 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC 2025 Workshops. In Proceedings of 2025 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC 2025 Workshops, p. 1017-1027
Abstract

Sparse tensor computations suffer from irregular memory access patterns that degrade cache performance. While software prefetching can mitigate this, existing compiler approaches lack the semantic insight needed for effective optimization. We present ASaP, an automatic software prefetching framework integrated within MLIR's sparse tensor dialect. By leveraging semantic information (tensor formats and loop structure) available during sparsification, ASaP determines accurate buffer bounds and injects prefetches in both innermost and outer loops, achieving broader coverage than prior work. Evaluated on SuiteSparse matrices, ASaP demonstrates significant performance gains for unstructured matrices. For SpMV with innermost-loop prefetching, ASaP achieves a 1.38× speedup over Ainsworth & Jones. For SpMM with outer-loop prefetching, ASaP achieves a 1.28× speedup, while Ainsworth & Jones fails to generate prefetches. Our experiments reveal that disabling inaccurate hardware prefetchers frees critical resources for software prefetching, suggesting future architectures should expose prefetcher control as an optimization interface.
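To make the technique in the abstract concrete: the sketch below shows hand-written innermost-loop software prefetching for CSR SpMV in C. It is only an illustration of the general idea the paper targets, not ASaP's generated MLIR; the function name, the prefetch distance `DIST`, and the choice to bound the lookahead within the current row are all assumptions made for this example.

```c
/* Illustrative CSR SpMV with innermost-loop software prefetching.
 * Hypothetical sketch, not ASaP's actual output. */
#define DIST 16  /* hypothetical prefetch distance, in nonzeros */

void spmv_csr_prefetch(int n, const int *rowptr, const int *col,
                       const double *val, const double *x, double *y) {
    for (int i = 0; i < n; i++) {
        double sum = 0.0;
        int end = rowptr[i + 1];
        for (int j = rowptr[i]; j < end; j++) {
            /* Prefetch the x[] element that the iteration DIST steps
             * ahead will gather through the column-index array; the
             * bound check keeps the lookahead inside this row's range
             * (loosely analogous to the "accurate buffer bounds" the
             * abstract mentions). Second arg 0 = read, third arg 1 =
             * low temporal locality. */
            if (j + DIST < end)
                __builtin_prefetch(&x[col[j + DIST]], 0, 1);
            sum += val[j] * x[col[j]];
        }
        y[i] = sum;
    }
}
```

`__builtin_prefetch` is a GCC/Clang builtin that compiles to a non-faulting prefetch instruction where one exists; the profitable value of `DIST` is machine- and matrix-dependent, which is exactly the kind of decision an automatic framework has to make.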
author
Sotiropoulos, Konstantinos; Skeppstedt, Jonas and Stenström, Per
organization
publishing date
2025
type
Chapter in Book/Report/Conference proceeding
publication status
published
subject
keywords
software prefetching, sparse data structures, sparse tensors
host publication
Proceedings of 2025 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC 2025 Workshops
series title
Proceedings of 2025 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC 2025 Workshops
pages
11 pages
publisher
Association for Computing Machinery (ACM)
conference name
2025 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC 2025 Workshops
conference location
St. Louis, United States
conference dates
2025-11-16 - 2025-11-21
external identifiers
  • scopus:105023391502
ISBN
9798400718717
DOI
10.1145/3731599.3767477
language
English
LU publication?
yes
additional info
Publisher Copyright: © 2025 Copyright held by the owner/author(s).
id
03e045f9-da7e-4f34-b15a-446db55bbd5d
date added to LUP
2026-01-22 10:37:13
date last changed
2026-01-22 10:37:30
@inproceedings{03e045f9-da7e-4f34-b15a-446db55bbd5d,
  abstract     = {{Sparse tensor computations suffer from irregular memory access patterns that degrade cache performance. While software prefetching can mitigate this, existing compiler approaches lack the semantic insight needed for effective optimization. We present ASaP, an automatic software prefetching framework integrated within MLIR's sparse tensor dialect. By leveraging semantic information (tensor formats and loop structure) available during sparsification, ASaP determines accurate buffer bounds and injects prefetches in both innermost and outer loops, achieving broader coverage than prior work. Evaluated on SuiteSparse matrices, ASaP demonstrates significant performance gains for unstructured matrices. For SpMV with innermost-loop prefetching, ASaP achieves a 1.38× speedup over Ainsworth \& Jones. For SpMM with outer-loop prefetching, ASaP achieves a 1.28× speedup, while Ainsworth \& Jones fails to generate prefetches. Our experiments reveal that disabling inaccurate hardware prefetchers frees critical resources for software prefetching, suggesting future architectures should expose prefetcher control as an optimization interface.}},
  author       = {{Sotiropoulos, Konstantinos and Skeppstedt, Jonas and Stenström, Per}},
  booktitle    = {{Proceedings of 2025 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC 2025 Workshops}},
  isbn         = {{9798400718717}},
  keywords     = {{software prefetching; sparse data structures; sparse tensors}},
  language     = {{eng}},
  month        = {{11}},
  pages        = {{1017--1027}},
  publisher    = {{Association for Computing Machinery (ACM)}},
  series       = {{Proceedings of 2025 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC 2025 Workshops}},
  title        = {{ASaP: Automatic Software Prefetching for Sparse Tensor Computations in MLIR}},
  url          = {{http://dx.doi.org/10.1145/3731599.3767477}},
  doi          = {{10.1145/3731599.3767477}},
  year         = {{2025}},
}