Lund University Publications


Zero-Shot Segmentation of Eye Features Using the Segment Anything Model (SAM)

Maquiling, Virmarie; Byrne, Sean Anthony; Niehorster, Diederick C.; Nyström, Marcus and Kasneci, Enkelejda (2024). In Proceedings of the ACM on Computer Graphics and Interactive Techniques 7(2).
Abstract

The advent of foundation models signals a new era in artificial intelligence. The Segment Anything Model (SAM) is the first foundation model for image segmentation. In this study, we evaluate SAM's ability to segment features from eye images recorded in virtual reality setups. The increasing requirement for annotated eye-image datasets presents a significant opportunity for SAM to redefine the landscape of data annotation in gaze estimation. Our investigation centers on SAM's zero-shot learning abilities and the effectiveness of prompts like bounding boxes or point clicks. Our results are consistent with studies in other domains, demonstrating that SAM's segmentation effectiveness can be on-par with specialized models depending on the feature, with prompts improving its performance, evidenced by an IoU of 93.34% for pupil segmentation in one dataset. Foundation models like SAM could revolutionize gaze estimation by enabling quick and easy image segmentation, reducing reliance on specialized models and extensive manual annotation.

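For readers who want to try the prompting workflow the abstract describes, the sketch below shows zero-shot pupil segmentation with Meta's open-source segment-anything package: one point-click prompt, one bounding-box prompt, and an IoU score against a ground-truth mask. This is a minimal illustration, not the authors' code; the checkpoint filename, image size, prompt coordinates, and the iou helper are assumptions.

import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Load a pretrained SAM backbone (ViT-H checkpoint filename is an assumption;
# download it from the segment-anything repository first).
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

# Placeholder for an HxWx3 uint8 RGB eye image (e.g. loaded with OpenCV or PIL).
image = np.zeros((480, 640, 3), dtype=np.uint8)
predictor.set_image(image)

# Point-click prompt: one foreground click near the pupil center (coordinates assumed).
masks_point, scores, _ = predictor.predict(
    point_coords=np.array([[320, 240]]),  # (x, y) in image pixels
    point_labels=np.array([1]),           # 1 = foreground click
    multimask_output=False,
)

# Bounding-box prompt: a box around the pupil (coordinates assumed).
masks_box, _, _ = predictor.predict(
    box=np.array([280, 200, 360, 280]),   # x0, y0, x1, y1
    multimask_output=False,
)

def iou(pred, gt):
    """Intersection over union between two boolean masks (hypothetical helper)."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter) / union if union else 0.0

With a real eye image and annotation in place of the placeholders, the two prompt types can be compared per eye feature by scoring each returned mask with iou(), mirroring the evaluation the abstract reports.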
Please use this URL to cite or link to this publication:
author
Maquiling, Virmarie; Byrne, Sean Anthony; Niehorster, Diederick C.; Nyström, Marcus and Kasneci, Enkelejda
organization
publishing date
2024-05
type
Contribution to journal
publication status
published
subject
keywords
Eye-tracking, Foundational models, Prompt Engineering, Segment Anything Model, Segmentation, Zero-shot learning
in
Proceedings of the ACM on Computer Graphics and Interactive Techniques
volume
7
issue
2
article number
26
publisher
Association for Computing Machinery (ACM)
external identifiers
  • scopus:85193965332
ISSN
2577-6193
DOI
10.1145/3654704
language
English
LU publication?
yes
id
5a73b735-a370-4b49-982b-039a45cff2ca
date added to LUP
2024-05-31 10:47:21
date last changed
2024-06-03 09:20:10
@article{5a73b735-a370-4b49-982b-039a45cff2ca,
  abstract     = {{<p>The advent of foundation models signals a new era in artificial intelligence. The Segment Anything Model (SAM) is the first foundation model for image segmentation. In this study, we evaluate SAM's ability to segment features from eye images recorded in virtual reality setups. The increasing requirement for annotated eye-image datasets presents a significant opportunity for SAM to redefine the landscape of data annotation in gaze estimation. Our investigation centers on SAM's zero-shot learning abilities and the effectiveness of prompts like bounding boxes or point clicks. Our results are consistent with studies in other domains, demonstrating that SAM's segmentation effectiveness can be on-par with specialized models depending on the feature, with prompts improving its performance, evidenced by an IoU of 93.34% for pupil segmentation in one dataset. Foundation models like SAM could revolutionize gaze estimation by enabling quick and easy image segmentation, reducing reliance on specialized models and extensive manual annotation.</p>}},
  author       = {{Maquiling, Virmarie and Byrne, Sean Anthony and Niehorster, Diederick C. and Nyström, Marcus and Kasneci, Enkelejda}},
  issn         = {{2577-6193}},
  keywords     = {{Eye-tracking; Foundational models; Prompt Engineering; Segment Anything Model; Segmentation; Zero-shot learning}},
  language     = {{eng}},
  month        = {{05}},
  number       = {{2}},
  publisher    = {{Association for Computing Machinery (ACM)}},
  series       = {{Proceedings of the ACM on Computer Graphics and Interactive Techniques}},
  title        = {{Zero-Shot Segmentation of Eye Features Using the Segment Anything Model (SAM)}},
  url          = {{http://dx.doi.org/10.1145/3654704}},
  doi          = {{10.1145/3654704}},
  volume       = {{7}},
  year         = {{2024}},
}