Some of the following articles may not be open access.
Aria-UI: Visual Grounding for GUI Instructions
Annual Meeting of the Association for Computational Linguistics
Digital agents for automating tasks across different platforms by directly manipulating the GUIs are increasingly important. For these agents, grounding from language instructions to target elements remains a significant challenge due to reliance on HTML
Yuhao Yang +6 more
semanticscholar +1 more source
GroundingGPT: Language Enhanced Multi-modal Grounding Model
Annual Meeting of the Association for Computational Linguistics
Multi-modal large language models have demonstrated impressive performance across various tasks in different modalities. However, existing multi-modal models primarily emphasize capturing global information within each modality while neglecting the ...
Zhaowei Li +11 more
semanticscholar +1 more source
Science, 2021
Fleets of radar satellites are measuring movements on Earth like never before.
openaire +2 more sources
HawkEye: Training Video-Text LLMs for Grounding Text in Videos
arXiv.org
Video-text Large Language Models (video-text LLMs) have shown remarkable performance in answering questions and holding conversations on simple videos.
Yueqian Wang +5 more
semanticscholar +1 more source
The Journal of the Royal Aeronautical Society, 1954
The problems of noise on the ground coming from aircraft on the ground, while still not simple to resolve, are rather less difficult than those associated with noise coming from aircraft in flight. The source of noise—the aeroplane—is no longer travelling rapidly through the air with three degrees of freedom but is either stationary on the ground or ...
openaire +1 more source
The Journal of Dermatologic Surgery and Oncology, 1988
Abstract. Dispersive electrodes are often neglected or misused in electrosurgery. A dispersive electrode can increase electrosurgical safety and effectiveness. If misused, however, it can become the source of patient injury. The following article summarizes the proper use of dispersive electrodes and differentiates them from actual grounding.
openaire +2 more sources
An Open and Comprehensive Pipeline for Unified Object Grounding and Detection
arXiv.org
Grounding-DINO is a state-of-the-art open-set detection model that tackles multiple vision tasks including Open-Vocabulary Detection (OVD), Phrase Grounding (PG), and Referring Expression Comprehension (REC).
Xiangyu Zhao +6 more
semanticscholar +1 more source
Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
Conference on Empirical Methods in Natural Language Processing
Video Large Language Models (Video-LLMs) have demonstrated remarkable capabilities in coarse-grained video understanding; however, they struggle with fine-grained temporal grounding.
Haibo Wang +8 more
semanticscholar +1 more source
ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities
European Conference on Computer Vision
Although great progress has been made in 3D visual grounding, current models still rely on explicit textual descriptions for grounding and lack the ability to reason human intentions from implicit instructions.
Chenming Zhu +4 more
semanticscholar +1 more source
ChatVTG: Video Temporal Grounding via Chat with Video Dialogue Large Language Models
2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
Video Temporal Grounding (VTG) aims to ground specific segments within an untrimmed video corresponding to the given natural language query. Existing VTG methods largely depend on supervised learning and extensive annotated data, which is labor-intensive
Mengxue Qu +4 more
semanticscholar +1 more source

