Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models
We present ODISE: Open-vocabulary DIffusion-based panoptic SEgmentation, which unifies pre-trained text-image diffusion and discriminative models to perform open-vocabulary panoptic segmentation. Text-to-image diffusion models have the remarkable ability
Jiarui Xu +5 more
semanticscholar +1 more source
Side Adapter Network for Open-Vocabulary Semantic Segmentation
This paper presents a new framework for open-vocabulary semantic segmentation with the pre-trained vision-language model, named Side Adapter Network (SAN). Our approach models the semantic segmentation task as a region recognition problem. A side network
Mengde Xu +4 more
Scaling Open-Vocabulary Object Detection
Open-vocabulary object detection has benefited greatly from pretrained vision-language models, but is still limited by the amount of available detection training data.
M. Minderer, A. Gritsenko, N. Houlsby
OpenMask3D: Open-Vocabulary 3D Instance Segmentation
We introduce the task of open-vocabulary 3D instance segmentation. Current approaches for 3D instance segmentation can typically only recognize object categories from a pre-defined closed set of classes that are annotated in the training datasets.
Ayca Takmaz +5 more
Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP
Open-vocabulary segmentation is a challenging task requiring segmenting and recognizing objects from an open set of categories. One way to address this challenge is to leverage multi-modal models, such as CLIP, to provide image and text features in a ...
Qihang Yu +4 more
A Simple Framework for Open-Vocabulary Segmentation and Detection
We present OpenSeeD, a simple Open-vocabulary Segmentation and Detection framework that jointly learns from different segmentation and detection datasets.
Hao Zhang +7 more
Towards Open Vocabulary Learning: A Survey
In the field of visual scene understanding, deep neural networks have made impressive advancements in various core tasks like segmentation, tracking, and detection.
Jianzong Wu +11 more
Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP
Open-vocabulary semantic segmentation aims to segment an image into semantic regions according to text descriptions, which may not have been seen during training. Recent two-stage methods first generate class-agnostic mask proposals and then leverage pre-
Feng Liang +8 more
Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space
Transformer-based language models (LMs) are at the core of modern NLP, but their internal prediction construction process is opaque and largely not understood. In this work, we make a substantial step towards unveiling this underlying prediction process,
Mor Geva +3 more
Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model
Recently, vision-language pre-training shows great potential in open-vocabulary object detection, where detectors trained on base classes are devised for detecting new classes.
Yu Du +5 more

