Enhancing Long-Term Action Quality Assessment: A Dual-Modality Dataset and Causal Cross-Modal Framework for Trampoline Gymnastics. [PDF]
Lin F, Huang J, Chen Z, Zhu K, Feng C.
europepmc +1 more source
M3OT: A Multi-Drone Multi-Modality dataset for Multi-Object Tracking. [PDF]
Nie Z +5 more
europepmc +1 more source
STFormer: Spatio‐temporal former for hand–object interaction recognition from egocentric RGB video [PDF]
Jiao Liang +3 more
openalex +1 more source
ADAT: a novel time-series-aware adaptive transformer architecture for sign language translation. [PDF]
Shahin N, Ismail L.
europepmc +1 more source
A Unified GAN-Based Framework for Unsupervised Video Anomaly Detection Using Optical Flow and RGB Cues. [PDF]
Kang SH, Kang HS.
europepmc +1 more source
A deep learning-based method combines manual and non-manual features for sign language recognition. [PDF]
Harrouch H +3 more
europepmc +1 more source
SAT: shift alignment transformer for video denoising without flow estimation. [PDF]
Zhang X, Fan S, Zhang H, Gao Y, Hu Y.
europepmc +1 more source
3D facial performance capture from monocular RGB video.
3D facial performance capture is an essential technique for animation production in feature films, video gaming, human-computer interaction, VR/AR asset creation and digital heritage, all of which have a huge impact on our daily lives. Traditionally, dedicated hardware such as depth sensors, laser scanners and camera arrays has been developed to acquire ...
openaire
Forest Inspection Dataset: A Synthetic UAV Dataset for Semantic Segmentation of Forest Environments. [PDF]
Blaga BC, Nedevschi S.
europepmc +1 more source

