Dynamic Category Queries Transformer for Generalized Few-shot Semantic Segmentation
(2025) 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
- Abstract
Few-shot segmentation (FSS) tackles data scarcity using multiple priors, but its simplicity limits its handling of base and novel classes with limited data access. Generalized few-shot semantic segmentation (GFSS) enhances model performance for base classes with abundant data, while novel classes have limited data access, improving generalization with scarce data. Building on the design of query-based segmentation models, which decouple the mask and classification tasks for individual optimization, we here present the Dynamic Category Queries Transformer (DCQ-Former), a novel approach to GFSS. The proposed DCQ-Former first uses category-suggested dynamic queries to perform mask segmentation and category classification tasks on a large amount of base class data. Because the novel classes only have access to a limited amount of training data, their queries are instead dynamically composed from the base classes, which prevents the category-suggested module from providing limited suggestion queries given the representativeness of the few-shot samples. Extensive experiments on the COCO-20i and Pascal-5i datasets show that DCQ-Former achieves higher accuracy and better generalization than current state-of-the-art methods. Our code is available at https://github.com/fallpavilion/DCQ-Former.
- author
- Huang, Kunze; Yang, Jieyuan; Jakobsson, Andreas (LU); Tang, Luyao; Tu, Xiaotong; Ding, Xinghao and Huang, Yue
- publishing date
- 2025
- type
- Contribution to journal
- publication status
- published
- keywords
- few-shot learning, query-based transformer, semantic segmentation
- in
- ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
- conference name
- 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
- conference location
- Hyderabad, India
- conference dates
- 2025-04-06 - 2025-04-11
- external identifiers
- scopus:105009592845
- ISSN
- 1520-6149
- DOI
- 10.1109/ICASSP49660.2025.10890147
- language
- English
- LU publication?
- yes
- additional info
- Publisher Copyright: © 2025 IEEE.
- id
- b37d8558-ecc1-4c5a-bc03-08495ae2f24a
- date added to LUP
- 2026-01-14 16:01:25
- date last changed
- 2026-01-15 12:33:45
@article{b37d8558-ecc1-4c5a-bc03-08495ae2f24a,
abstract = {{Few-shot segmentation (FSS) tackles data scarcity using multiple priors, but its simplicity limits its handling of base and novel classes with limited data access. Generalized few-shot semantic segmentation (GFSS) enhances model performance for base classes with abundant data, while novel classes have limited data access, improving generalization with scarce data. Building on the design of query-based segmentation models, which decouple the mask and classification tasks for individual optimization, we here present the Dynamic Category Queries Transformer (DCQ-Former), a novel approach to GFSS. The proposed DCQ-Former first uses category-suggested dynamic queries to perform mask segmentation and category classification tasks on a large amount of base class data. Because the novel classes only have access to a limited amount of training data, their queries are instead dynamically composed from the base classes, which prevents the category-suggested module from providing limited suggestion queries given the representativeness of the few-shot samples. Extensive experiments on the COCO-20$^i$ and Pascal-5$^i$ datasets show that DCQ-Former achieves higher accuracy and better generalization than current state-of-the-art methods. Our code is available at https://github.com/fallpavilion/DCQ-Former.}},
author = {{Huang, Kunze and Yang, Jieyuan and Jakobsson, Andreas and Tang, Luyao and Tu, Xiaotong and Ding, Xinghao and Huang, Yue}},
issn = {{1520-6149}},
keywords = {{few-shot learning; query-based transformer; semantic segmentation}},
language = {{eng}},
series = {{ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings}},
title = {{Dynamic Category Queries Transformer for Generalized Few-shot Semantic Segmentation}},
url = {{http://dx.doi.org/10.1109/ICASSP49660.2025.10890147}},
doi = {{10.1109/ICASSP49660.2025.10890147}},
year = {{2025}},
}