Remote photoplethysmography (rPPG) is a promising contactless technology that uses videos of faces to extract health parameters, such as heart rate. Several methods for transforming red, green, and blue (RGB) video signals into rPPG signals have been ...
Fridolin Haugg +2 more
doaj +1 more source
Long-Term Image Boundary Prediction
Boundary estimation in images and videos has been a very active topic of research, and organizing visual information into boundaries and segments is believed to be a cornerstone of visual perception. While prior work has focused on estimating boundaries
Bhattacharyya, Apratim +3 more
core +1 more source
Learning Highly Dynamic Skills Transition for Quadruped Jumping Through Constrained Space
A quadruped robot masters dynamic jumps through constrained spaces with animal‐inspired moves and intelligent vision control. This hierarchical learning approach combines imitation of biological agility with real‐time trajectory planning. Although legged animals are capable of performing explosive motions while traversing confined spaces, replicating ...
Zeren Luo +6 more
wiley +1 more source
Weighting Quantization Matrices for HEVC/H.265-Coded RGB Videos
In the HEVC/H.265 video coding standard, weighting quantization matrices (WQMs) are supported to take advantage of the characteristics of the human visual system (HVS).
Xiwu Shang +5 more
doaj +1 more source
Optical Flow Guided Feature: A Fast and Robust Motion Representation for Video Action Recognition
Motion representation plays a vital role in human action recognition in videos. In this study, we introduce a novel compact motion representation for video action recognition, named Optical Flow guided Feature (OFF), which enables the network to distill ...
Kuang, Zhanghui +4 more
core +1 more source
Human activity recognition in RGB-D videos by dynamic images
Human activity recognition in RGB-D videos has been an active research topic during the last decade. However, no efforts have been found in the literature to recognize human activity in RGB-D videos where several performers act simultaneously.
Snehasis Mukherjee +2 more
openaire +2 more sources
Grounding Large Language Models for Robot Task Planning Using Closed‐Loop State Feedback
BrainBody‐Large Language Model (LLM) introduces a hierarchical, feedback‐driven planning framework where two LLMs coordinate high‐level reasoning and low‐level control for robotic tasks. By grounding decisions in real‐time state feedback, it reduces hallucinations and improves task reliability.
Vineet Bhat +4 more
wiley +1 more source
Skeleton Focused Human Activity Recognition in RGB Video
The data-driven approach that learns an optimal representation of vision features like skeleton frames or RGB videos is currently a dominant paradigm for activity recognition. While great improvements have been achieved from existing single modal approaches with increasingly larger datasets, the fusion of various data modalities at the feature level ...
Yu, Bruce X. B. +2 more
openaire +2 more sources
TacScope: A Miniaturized Vision‐Based Tactile Sensor for Surgical Applications
TacScope is a compact, vision‐based tactile sensor designed for robot‐assisted surgery. By leveraging a curved elastomer surface with pressure‐sensitive particle redistribution, it captures high‐resolution 3D tactile feedback. TacScope enables accurate tumor detection and shape classification beneath soft tissue phantoms, offering a scalable, low‐cost ...
Md Rakibul Islam Prince +3 more
wiley +1 more source
Continual Learning for Multimodal Data Fusion of a Soft Gripper
Models trained on a single data modality often struggle to generalize when exposed to a different modality. This work introduces a continual learning algorithm capable of incrementally learning different data modalities by leveraging both class‐incremental and domain‐incremental learning scenarios in an artificial environment where labeled data is ...
Nilay Kushawaha, Egidio Falotico
wiley +1 more source

