An Automated Approach for Classification of Action and Dialogue Video with Tagging
Today, data is no longer confined to text; it has expanded to multimedia as well. Compared to audio and images, video data needs particular attention because of the ever-increasing size of videos and their massive storage requirements.
Rida Shifa, Akmal Shahbaz
doaj +5 more sources
Regions of interest selection in the tasks of contactless human pulse measurement by analyzing the RGB video stream [PDF]
This paper is devoted to improving the accuracy of human pulse measurement by RGB video stream analysis. For this purpose, a study was conducted on the influence of the size, location and stability of the region of interest on contactless human pulse ...
Fisunov Alexander V. +2 more
doaj +1 more source
Ordered Pooling of Optical Flow Sequences for Action Recognition
Training of Convolutional Neural Networks (CNNs) on long video sequences is computationally expensive due to the substantial memory requirements and the massive number of parameters that deep architectures demand.
Cherian, Anoop +2 more
core +1 more source
Skeleton Focused Human Activity Recognition in RGB Video
The data-driven approach that learns an optimal representation of vision features like skeleton frames or RGB videos is currently a dominant paradigm for activity recognition. While great improvements have been achieved from existing single modal approaches with increasingly larger datasets, the fusion of various data modalities at the feature level ...
Yu, Bruce X. B. +2 more
openaire +2 more sources
Learning Highly Dynamic Skills Transition for Quadruped Jumping Through Constrained Space
A quadruped robot masters dynamic jumps through constrained spaces with animal‐inspired moves and intelligent vision control. This hierarchical learning approach combines imitation of biological agility with real‐time trajectory planning. Although legged animals are capable of performing explosive motions while traversing confined spaces, replicating ...
Zeren Luo +6 more
wiley +1 more source
Grounding Large Language Models for Robot Task Planning Using Closed‐Loop State Feedback
BrainBody‐Large Language Model (LLM) introduces a hierarchical, feedback‐driven planning framework where two LLMs coordinate high‐level reasoning and low‐level control for robotic tasks. By grounding decisions in real‐time state feedback, it reduces hallucinations and improves task reliability.
Vineet Bhat +4 more
wiley +1 more source
Depth camera based dataset of hand gestures
The dataset contains RGB and depth version video frames of various hand movements captured with the Intel RealSense Depth Camera D435. The camera has two channels for collecting both RGB and depth frames at the same time.
Sindhusha Jeeru +5 more
doaj +1 more source
TacScope: A Miniaturized Vision‐Based Tactile Sensor for Surgical Applications
TacScope is a compact, vision‐based tactile sensor designed for robot‐assisted surgery. By leveraging a curved elastomer surface with pressure‐sensitive particle redistribution, it captures high‐resolution 3D tactile feedback. TacScope enables accurate tumor detection and shape classification beneath soft tissue phantoms, offering a scalable, low‐cost ...
Md Rakibul Islam Prince +3 more
wiley +1 more source
Temporal and Spatial Denoising of Depth Maps
This work presents a procedure for refining depth maps acquired using RGB-D (depth) cameras. With numerous new structured-light RGB-D cameras, acquiring high-resolution depth maps has become easy. However, there are problems such as undesired occlusion,
Bor-Shing Lin +4 more
doaj +1 more source
Two-Stream RNN/CNN for Action Recognition in 3D Videos
The recognition of actions from video sequences has many applications in health monitoring, assisted living, surveillance, and smart homes. Despite advances in sensing, in particular related to 3D video, the methodologies to process the data are still ...
Ali, Haider +2 more
core +1 more source

