Probabilistic Feature Matching for Fast Scalable Visual Prompting

Thomas Frick, Cezary Skura, Filip M. Janicki, Roy Assaf, Niccolo Avogaro, Daniel Caraballo, Yagmur G. Cinar, Brown Ebouky, Ioana Giurgiu, Takayuki Katsuki, Piotr Kluska, Cristiano Malossi, Haoxiang Qiu, Tomoya Sakai, Florian Scheidegger, Andrej Simeski, Daniel Yang, Andrea Bartezzaghi, Mattia Rigotti

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Demo Track. Pages 8648-8652. https://doi.org/10.24963/ijcai.2024/1000

In this work, we propose a novel framework for image segmentation guided by visual prompting that builds on vision foundation models. Our approach integrates multiple large-scale pretrained models to address segmentation tasks in which the only supervision is limited, sparsely annotated data that a user provides interactively. The method combines a frozen feature extraction backbone with a scalable and efficient probabilistic feature correspondence (soft matching) procedure, derived from Optimal Transport, that couples pixels between reference and target images. A pretrained segmentation model translates user scribbles into reference masks, and matched target pixels into output target segmentation masks. We call the resulting framework Softmatcher: a versatile, fast, training-free architecture for image segmentation by visual prompting. We demonstrate the efficiency and scalability of Softmatcher for real-time interactive image segmentation by visual prompting and showcase it in diverse visual domains, including technical visual inspection use cases.
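
To make the soft matching idea concrete, the following is a minimal sketch of an entropy-regularized Optimal Transport coupling between reference and target pixel features, computed with standard Sinkhorn iterations. The abstract does not specify the solver, cost, or marginals used in Softmatcher, so everything here is an assumption for illustration: the cosine cost, the uniform marginals, the parameters `epsilon` and `n_iters`, and the function names `sinkhorn_coupling` and `propagate_mask` are all hypothetical.

```python
import numpy as np


def sinkhorn_coupling(ref_feats, tgt_feats, epsilon=0.1, n_iters=100):
    """Soft-match reference pixels to target pixels (illustrative sketch).

    ref_feats: (n_ref, d) features from a frozen backbone (assumed given).
    tgt_feats: (n_tgt, d) features from the same backbone.
    Returns a coupling matrix of shape (n_ref, n_tgt) whose entries sum to 1.
    """
    # L2-normalize so that the dot product is a cosine similarity.
    ref = ref_feats / np.linalg.norm(ref_feats, axis=1, keepdims=True)
    tgt = tgt_feats / np.linalg.norm(tgt_feats, axis=1, keepdims=True)

    # Assumed cost: cosine distance, in [0, 2].
    cost = 1.0 - ref @ tgt.T
    kernel = np.exp(-cost / epsilon)

    # Assumed marginals: every pixel carries the same mass.
    n_ref, n_tgt = cost.shape
    a = np.full(n_ref, 1.0 / n_ref)
    b = np.full(n_tgt, 1.0 / n_tgt)

    # Sinkhorn iterations: alternately rescale rows and columns.
    u = np.ones(n_ref)
    v = np.ones(n_tgt)
    for _ in range(n_iters):
        u = a / (kernel @ v)
        v = b / (kernel.T @ u)

    return u[:, None] * kernel * v[None, :]


def propagate_mask(coupling, ref_mask):
    """Fraction of each target pixel's mass coupled to masked reference pixels."""
    return (coupling.T @ ref_mask) / coupling.sum(axis=0)
```

In a pipeline like the one described, target pixels with a high propagated score could then be passed as prompts to the pretrained segmentation model; that hand-off is not shown here.
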
Keywords:
Computer Vision: CV: Segmentation
Computer Vision: CV: Applications
Computer Vision: CV: Machine learning for vision
Humans and AI: HAI: Human-computer interaction