Time series classification through visual pattern recognition
In this paper, a new approach to time series classification is proposed. It transforms the scalar time series into a two-dimensional space of amplitude (time series values) and a change of amplitude (increment).
Agnieszka Jastrzebska
doaj +2 more sources
Optimal Transport Aggregation for Visual Place Recognition [PDF]
The task of Visual Place Recognition (VPR) aims to match a query image against references from an extensive database of images from different places, relying solely on visual cues.
Sergio Izquierdo, Javier Civera
semanticscholar +1 more source
Improved Baselines with Visual Instruction Tuning [PDF]
Large multimodal models (LMM) have recently shown encouraging progress with visual instruction tuning. In this paper, we present the first systematic study to investigate the design choices of LMMs in a controlled setting under the LLaVA framework.
Haotian Liu +3 more
semanticscholar +1 more source
Multimodal Prompting with Missing Modalities for Visual Recognition [PDF]
In this paper, we tackle two challenges in multimodal learning for visual recognition: 1) when missing-modality occurs either during training or testing in real-world situations; and 2) when the computation resources are not available to finetune on ...
Yi-Lun Lee +3 more
semanticscholar +1 more source
Bottleneck Transformers for Visual Recognition [PDF]
We present BoTNet, a conceptually simple yet powerful backbone architecture that incorporates self-attention for multiple computer vision tasks including image classification, object detection and instance segmentation.
A. Srinivas +5 more
semanticscholar +1 more source
Balanced Contrastive Learning for Long-Tailed Visual Recognition [PDF]
Real-world data typically follow a long-tailed distribution, where a few majority categories occupy most of the data while most minority categories contain a limited number of samples.
Jianggang Zhu +4 more
semanticscholar +1 more source
VOLO: Vision Outlooker for Visual Recognition [PDF]
Recently, Vision Transformers (ViTs) have been broadly explored in visual recognition. With low efficiency in encoding fine-level features, the performance of ViTs is still inferior to the state-of-the-art CNNs when trained from scratch on a midsize ...
Li Yuan +4 more
semanticscholar +1 more source
An Empirical Study on the Effects of Temporal Trends in Spatial Patterns on Animated Choropleth Maps
Animated cartographic visualization incorporates the concept of geomedia presented in this Special Issue. The presented study aims to examine the effectiveness of spatial pattern and temporal trend recognition on animated choropleth maps. In a controlled
Paweł Cybulski
doaj +1 more source
Involution: Inverting the Inherence of Convolution for Visual Recognition [PDF]
Convolution has been the core ingredient of modern neural networks, triggering the surge of deep learning in vision. In this work, we rethink the inherent principles of standard convolution for vision tasks, specifically spatial-agnostic and channel ...
Duo Li +7 more
semanticscholar +1 more source
Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition [PDF]
Existing deep convolutional neural networks (CNNs) require a fixed-size (e.g., $224 \times 224$) input image. This requirement is “artificial” and may reduce the recognition accuracy for the images or sub-images of an arbitrary size/scale.
Kaiming He +3 more
semanticscholar +1 more source

