Zero-Shot Pupil Segmentation with SAM 2: A Case Study of Over 14 Million Images
(2025) In Proceedings of the ACM on Computer Graphics and Interactive Techniques 8(2). p.1-16
- Abstract
- We explore the transformative potential of SAM 2, a vision foundation model, in advancing gaze estimation. SAM 2 addresses key challenges in gaze estimation by significantly reducing annotation time, simplifying deployment, and enhancing segmentation accuracy. Utilizing its zero-shot capabilities with minimal user input (a single click per video), we tested SAM 2 on over 14 million eye images from a diverse range of datasets, including the EDS challenge datasets and Labelled Pupils in the Wild. This is the first application of SAM 2 to the gaze estimation domain. Remarkably, SAM 2 matches the performance of domain-specific models in pupil segmentation, achieving competitive mIOU scores of up to 93% without fine-tuning. We argue that SAM 2 achieves the sought-after standard of domain generalization, with consistent mIOU scores (89.71%-93.74%) across diverse datasets, from virtual reality to "gaze-in-the-wild" scenarios. We provide our code and segmentation masks for these datasets to promote further research.
Please use this URL to cite or link to this publication:
https://lup.lub.lu.se/record/8ff9fd0f-38de-4a82-bd02-500b02414d4a
- author
- Maquiling, Virmarie; Byrne, Sean Anthony; Niehorster, Diederick C. (LU); Carminati, Marco; Kasneci, Enkelejda
- publishing date
- 2025-06-01
- type
- Contribution to journal
- publication status
- published
- in
- Proceedings of the ACM on Computer Graphics and Interactive Techniques
- volume
- 8
- issue
- 2
- article number
- 23
- pages
- 1 - 16
- publisher
- Association for Computing Machinery (ACM)
- ISSN
- 2577-6193
- DOI
- 10.1145/3729409
- language
- English
- LU publication?
- yes
- id
- 8ff9fd0f-38de-4a82-bd02-500b02414d4a
- date added to LUP
- 2025-05-31 19:16:35
- date last changed
- 2025-06-13 14:41:14
@article{8ff9fd0f-38de-4a82-bd02-500b02414d4a,
  abstract  = {We explore the transformative potential of SAM 2, a vision foundation model, in advancing gaze estimation. SAM 2 addresses key challenges in gaze estimation by significantly reducing annotation time, simplifying deployment, and enhancing segmentation accuracy. Utilizing its zero-shot capabilities with minimal user input---a single click per video---we tested SAM 2 on over 14 million eye images from a diverse range of datasets, including the EDS challenge datasets and Labelled Pupils in the Wild. This is the first application of SAM 2 to the gaze estimation domain. Remarkably, SAM 2 matches the performance of domain-specific models in pupil segmentation, achieving competitive mIOU scores of up to 93\% without fine-tuning. We argue that SAM 2 achieves the sought-after standard of domain generalization, with consistent mIOU scores (89.71\%-93.74\%) across diverse datasets, from virtual reality to "gaze-in-the-wild" scenarios. We provide our code and segmentation masks for these datasets to promote further research.},
  author    = {Maquiling, Virmarie and Byrne, Sean Anthony and Niehorster, Diederick C. and Carminati, Marco and Kasneci, Enkelejda},
  doi       = {10.1145/3729409},
  issn      = {2577-6193},
  journal   = {Proceedings of the ACM on Computer Graphics and Interactive Techniques},
  language  = {eng},
  month     = {06},
  number    = {2},
  pages     = {1--16},
  publisher = {Association for Computing Machinery (ACM)},
  title     = {{Zero-Shot Pupil Segmentation with SAM 2: A Case Study of Over 14 Million Images}},
  url       = {https://doi.org/10.1145/3729409},
  volume    = {8},
  year      = {2025}
}