
Rethinking the Inception Architecture for Computer Vision [PDF]

open access: yes | Computer Vision and Pattern Recognition, 2015
Convolutional networks are at the core of most state-of-the-art computer vision solutions for a wide variety of tasks. Since 2014, very deep convolutional networks have become mainstream, yielding substantial gains in various benchmarks.
Ioffe, Sergey   +4 more
core   +4 more sources

Computer Vision [PDF]

open access: yes
Abstract: The field of computer vision studies how computers can gain understanding from images and videos, similar to human cognitive abilities. One of the classical challenges is to reconstruct a 3D object from images taken by several unknown cameras.
Md Atiqur Rahman Ahad   +3 more
core   +7 more sources

Computer Vision

open access: yes | International Journal for Research in Applied Science and Engineering Technology, 2021
Computer vision is a field of computer science that trains computers to interpret and understand the visual world. Using digital images from cameras and videos together with deep learning models, machines can accurately identify and classify objects, and then react to what they "see."
Rajesh Singh   +3 more
  +9 more sources

A Comprehensive Review of YOLO Architectures in Computer Vision: From YOLOv1 to YOLOv8 and YOLO-NAS [PDF]

open access: yes | Machine Learning and Knowledge Extraction, 2023
YOLO has become a central real-time object detection system for robotics, driverless cars, and video monitoring applications. We present a comprehensive analysis of YOLO’s evolution, examining the innovations and contributions in each iteration from the ...
Juan R. Terven   +2 more
semanticscholar   +1 more source

Attention mechanisms in computer vision: A survey [PDF]

open access: yes | Computational Visual Media, 2021
Humans can naturally and effectively find salient regions in complex scenes. Motivated by this observation, attention mechanisms were introduced into computer vision with the aim of imitating this aspect of the human visual system.
Meng-Hao Guo   +9 more
semanticscholar   +1 more source

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows [PDF]

open access: yes | IEEE International Conference on Computer Vision, 2021
This paper presents a new vision Transformer, called Swin Transformer, that capably serves as a general-purpose backbone for computer vision. Challenges in adapting Transformer from language to vision arise from differences between the two domains, such ...
Ze Liu   +7 more
semanticscholar   +1 more source

Masked Autoencoders Are Scalable Vision Learners [PDF]

open access: yes | Computer Vision and Pattern Recognition, 2021
This paper shows that masked autoencoders (MAE) are scalable self-supervised learners for computer vision. Our MAE approach is simple: we mask random patches of the input image and reconstruct the missing pixels. It is based on two core designs.
Kaiming He   +5 more
semanticscholar   +1 more source

Hyperbolic Deep Learning in Computer Vision: A Survey [PDF]

open access: yes | International Journal of Computer Vision, 2023
Deep representation learning is a ubiquitous part of modern computer vision. While Euclidean space has been the de facto standard manifold for learning visual representations, hyperbolic space has recently gained rapid traction for learning in computer ...
Pascal Mettes   +4 more
semanticscholar   +1 more source

Context Understanding in Computer Vision: A Survey [PDF]

open access: yes | Computer Vision and Image Understanding, 2023
Contextual information plays an important role in many computer vision tasks, such as object detection, video action detection, and image classification.
Xuan Wang, Zhigang Zhu
semanticscholar   +1 more source

Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions [PDF]

open access: yes | IEEE International Conference on Computer Vision, 2021
Although convolutional neural networks (CNNs) have achieved great success in computer vision, this work investigates a simpler, convolution-free backbone network useful for many dense prediction tasks.
Wenhai Wang   +8 more
semanticscholar   +1 more source
