Zero-Shot Pupil Segmentation with SAM 2 : A Case Study of Over 14 Million Images

Maquiling, Virmarie; Byrne, Sean Anthony; Niehorster, Diederick C.; Carminati, Marco; Kasneci, Enkelejda

Maquiling, Virmarie ; Byrne, Sean Anthony ; Niehorster, Diederick C. ; Carminati, Marco and Kasneci, Enkelejda (2025) In Proceedings of the ACM on Computer Graphics and Interactive Techniques 8(2). p.1-16

Abstract: We explore the transformative potential of SAM 2, a vision foundation model, in advancing gaze estimation. SAM 2 addresses key challenges in gaze estimation by significantly reducing annotation time, simplifying deployment, and enhancing segmentation accuracy. Utilizing its zero-shot capabilities with minimal user input—a single click per video—we tested SAM 2 on over 14 million eye images from a diverse range of datasets, including the EDS challenge datasets and Labelled Pupils in the Wild. This is the first application of SAM 2 to the gaze estimation domain. Remarkably, SAM 2 matches the performance of domain-specific models in pupil segmentation, achieving competitive mIOU scores of up to 93% without fine-tuning. We argue that SAM 2 achieves the sought-after standard of domain generalization, with consistent mIOU scores (89.71%-93.74%) across diverse datasets, from virtual reality to "gaze-in-the-wild" scenarios. We provide our code and segmentation masks for these datasets to promote further research.

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/8ff9fd0f-38de-4a82-bd02-500b02414d4a

author

Maquiling, Virmarie ; Byrne, Sean Anthony ; Niehorster, Diederick C. ; Carminati, Marco and Kasneci, Enkelejda

publishing date

2025-06-01

type

Contribution to journal

publication status

published

in

Proceedings of the ACM on Computer Graphics and Interactive Techniques

volume

8

issue

2

article number

23

pages

1 - 16

publisher

Association for Computing Machinery (ACM)

external identifiers

scopus:105024339940

ISSN

2577-6193

DOI

10.1145/3729409

language

English

LU publication?

yes

id

8ff9fd0f-38de-4a82-bd02-500b02414d4a

date added to LUP

2025-05-31 19:16:35

date last changed

2026-01-26 15:58:46

@article{8ff9fd0f-38de-4a82-bd02-500b02414d4a,
  abstract     = {{We explore the transformative potential of SAM 2, a vision foundation model, in advancing gaze estimation. SAM 2 addresses key challenges in gaze estimation by significantly reducing annotation time, simplifying deployment, and enhancing segmentation accuracy. Utilizing its zero-shot capabilities with minimal user input—a single click per video—we tested SAM 2 on over 14 million eye images from a diverse range of datasets, including the EDS challenge datasets and Labelled Pupils in the Wild. This is the first application of SAM 2 to the gaze estimation domain. Remarkably, SAM 2 matches the performance of domain-specific models in pupil segmentation, achieving competitive mIOU scores of up to 93% without fine-tuning. We argue that SAM 2 achieves the sought-after standard of domain generalization, with consistent mIOU scores (89.71%-93.74%) across diverse datasets, from virtual reality to "gaze-in-the-wild" scenarios. We provide our code and segmentation masks for these datasets to promote further research.}},
  author       = {{Maquiling, Virmarie and Byrne, Sean Anthony and Niehorster, Diederick C. and Carminati, Marco and Kasneci, Enkelejda}},
  issn         = {{2577-6193}},
  language     = {{eng}},
  month        = {{06}},
  number       = {{2}},
  pages        = {{1--16}},
  publisher    = {{Association for Computing Machinery (ACM)}},
  series       = {{Proceedings of the ACM on Computer Graphics and Interactive Techniques}},
  title        = {{Zero-Shot Pupil Segmentation with SAM 2 : A Case Study of Over 14 Million Images}},
  url          = {{http://dx.doi.org/10.1145/3729409}},
  doi          = {{10.1145/3729409}},
  volume       = {{8}},
  year         = {{2025}},
}
