Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection [PDF]
In this paper, we present an open-set object detector, called Grounding DINO, by marrying Transformer-based detector DINO with grounded pre-training, which can detect arbitrary objects with human inputs such as category names or referring expressions ...
Shilong Liu +10 more
semanticscholar +1 more source
Grounding ‘Grounding’ in NLP [PDF]
The NLP community has seen substantial recent interest in grounding to facilitate interaction between language technologies and the world. However, as a community, we use the term broadly to reference any linking of text to data or non-textual modality. In contrast, Cognitive Science more formally defines "grounding" as the process of establishing what
Khyathi Raghavi Chandu +2 more
openaire +2 more sources
Kosmos-2: Grounding Multimodal Large Language Models to the World [PDF]
We introduce Kosmos-2, a Multimodal Large Language Model (MLLM), enabling new capabilities of perceiving object descriptions (e.g., bounding boxes) and grounding text to the visual world. Specifically, we represent refer expressions as links in Markdown,
Zhiliang Peng +6 more
semanticscholar +1 more source
Coincident Objects and The Grounding Problem [PDF]
Pluralists believe in the occurrence of numerically distinct spatiotemporally coincident objects. They argue that there are coincident objects that share all physical and spatiotemporal properties and relations; nevertheless, they differ in terms of ...
Ataollah Hashemi
doaj +1 more source
GLaMM: Pixel Grounding Large Multimodal Model [PDF]
Large Multimodal Models (LMMs) extend Large Language Models to the vision domain. Initial LMMs used holistic images and text prompts to generate ungrounded textual responses.
H. Rasheed +9 more
semanticscholar +1 more source
Grounding Large Language Models in Interactive Environments with Online Reinforcement Learning [PDF]
Recent works successfully leveraged Large Language Models' (LLM) abilities to capture abstract knowledge about the world's physics to solve decision-making problems.
Thomas Carta +5 more
semanticscholar +1 more source
In an offshore wind farm, a high‐voltage switchgear interruption in an offshore substation creates a high‐frequency, high‐amplitude overvoltage that can cause severe electromagnetic interference problems in the intelligent electronic device.
Huaqing Wang +4 more
doaj +1 more source
SayPlan: Grounding Large Language Models using 3D Scene Graphs for Scalable Task Planning [PDF]
Large language models (LLMs) have demonstrated impressive results in developing generalist planning agents for diverse tasks. However, grounding these plans in expansive, multi-floor, and multi-room environments presents a significant challenge for ...
Krishan Rana +5 more
semanticscholar +1 more source
Three types of bidirectional leader development in triggered lightning flashes
Eight cases of bidirectional leader (BL) development in artificially triggered lightning flashes are reported with synchronous high-speed camera images and electric field signals.
Rui Su +5 more
doaj +1 more source
UniVTG: Towards Unified Video-Language Temporal Grounding [PDF]
Video Temporal Grounding (VTG), which aims to ground target clips from videos (such as consecutive intervals or disjoint shots) according to custom language queries (e.g., sentences or words), is key for video browsing on social media.
Kevin Lin +7 more
semanticscholar +1 more source

