Some of the following articles may not be open access.

Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis

arXiv.org
Graphical user interface (GUI) grounding, the ability to map natural language instructions to specific actions on graphical user interfaces, remains a critical bottleneck in computer use agent development.
Tianbao Xie   +14 more
semanticscholar   +1 more source

SeqTR: A Simple yet Universal Network for Visual Grounding

European Conference on Computer Vision, 2022
In this paper, we propose a simple yet universal network termed SeqTR for visual grounding tasks, e.g., phrase localization, referring expression comprehension (REC) and segmentation (RES).
Chaoyang Zhu   +9 more
semanticscholar   +1 more source

Grounds and ‘Grounds’

Canadian Journal of Philosophy, 2017
Abstract: In this paper, I offer a new theory of grounding. The theory has it that grounding is a job description that is realized by different properties in different contexts. Those properties play the grounding role contingently, and grounding is the property that plays the grounding role essentially.
openaire   +1 more source

Grounds, grounds, and more grounds

SIMULATION, 1965
This paper discusses the need for proper grounding of electrical equipment from the standpoint of safety and performance. Seventeen diagrams are employed to illustrate the most important points. The discussion follows a fundamental vein in illustrating the proper connections of electrical apparatus to specifically avoid difficulties from common im
openaire   +1 more source

UniVG-R1: Reasoning Guided Universal Visual Grounding with Reinforcement Learning

arXiv.org
Traditional visual grounding methods primarily focus on single-image scenarios with simple textual references. However, extending these methods to real-world scenarios that involve implicit and complex instructions, particularly in conjunction with ...
Sule Bai   +7 more
semanticscholar   +1 more source

Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding

Computer Vision and Pattern Recognition
Visual grounding seeks to localize the image region corresponding to a free-form text description. Recently, the strong multimodal capabilities of Large Vision-Language Models (LVLMs) have driven substantial improvements in visual grounding, though they ...
Seil Kang   +3 more
semanticscholar   +1 more source

ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding

IEEE International Conference on Computer Vision, 2023
Understanding 3D scenes from multi-view inputs has been proven to alleviate the view discrepancy issue in 3D visual grounding. However, existing methods normally neglect the view cues embedded in the text modality and fail to weigh the relative ...
Ziyu Guo   +6 more
semanticscholar   +1 more source

Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection

arXiv.org
This paper introduces Grounding DINO 1.5, a suite of advanced open-set object detection models developed by IDEA Research, which aims to advance the "Edge" of open-set object detection.
Tianhe Ren   +15 more
semanticscholar   +1 more source

Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models

European Conference on Computer Vision
We introduce Groma, a Multimodal Large Language Model (MLLM) with grounded and fine-grained visual perception ability. Beyond holistic image understanding, Groma is adept at region-level tasks such as region captioning and visual grounding.
Chuofan Ma   +4 more
semanticscholar   +1 more source

Ground One

Bulletin of the Menninger Clinic, 2003
The events of September 11th, 2001, have had long-lasting effects on our culture, interpersonal relationships, understanding of evil intent and terrorism, and approach to and treatment of trauma states. This article is a personal account of September 11th by a junior attending psychiatrist as he experienced it from St.
openaire   +2 more sources
