Multimedia cs.mm - Open Access .click

Automatic Edge Error Judgment in Figure Skating Using 3D Pose Estimation from a Monocular Camera and IMUs

2023
Automatic evaluation systems are a fundamental topic in sports technology. In many sports, such as figure skating, automated evaluation methods based on pose estimation have been proposed.
Fujii, Keisuke +3 more
core

Generative AI-enabled Mobile Tactical Multimedia Networks: Distribution, Generation, and Perception

Mobile multimedia networks (MMNs) demonstrate great potential in delivering low-latency and high-quality entertainment and tactical applications, such as short-video sharing, online conferencing, and battlefield surveillance.
Fang, Yuguang +6 more
core

User Digital Twin-Driven Video Streaming for Customized Preferences and Adaptive Transcoding

In the rapidly evolving field of multimedia services, video streaming has become increasingly prevalent, demanding innovative solutions to enhance user experience and system efficiency.
Berhane, Kalkidan, Jimmy, Stephen, Muhammad, Kevin +2 more
core

Deep3DSketch+: Obtaining Customized 3D Model by Single Free-Hand Sketch through Deep Learning

2023
As 3D models become critical in today's manufacturing and product design, conventional 3D modeling approaches based on Computer-Aided Design (CAD) are labor-intensive, time-consuming, and have high demands on the creators.
Chen, Tianrun +5 more
core

Modality-invariant and Specific Prompting for Multimodal Human Perception Understanding

2023
Understanding human perceptions presents a formidable multimodal challenge for computers, encompassing aspects such as sentiment tendencies and sense of humor. While various methods have recently been introduced to extract modality-invariant and specific
Chen, Yen-Wei +5 more
core

AIMDiT: Modality Augmentation and Interaction via Multimodal Dimension Transformation for Emotion Recognition in Conversations

Emotion Recognition in Conversations (ERC) is a popular task in natural language processing, which aims to recognize the emotional state of the speaker in conversations.
Dang, Jianwu +5 more
core

Perceptual-oriented Learned Image Compression with Dynamic Kernel

In this paper, we extend our prior research named DKIC and propose the perceptual-oriented learned image compression method, PO-DKIC. Specifically, DKIC adopts a dynamic kernel-based dynamic residual block group to enhance the transform coding and an ...
Chen, Zhenzhong +3 more
core

Cross-Modality and Within-Modality Regularization for Audio-Visual DeepFake Detection

Audio-visual deepfake detection scrutinizes manipulations in public video using complementary multimodal cues. Current methods, which train on fused multimodal data for multimodal targets, face challenges due to uncertainties and inconsistencies in ...
Chen, Chen +5 more
core

A Subjective Quality Evaluation of 3D Mesh with Dynamic Level of Detail in Virtual Reality

3D meshes are one of the main components of Virtual Reality applications. However, many network and computational resources are required to process 3D meshes in real-time. A potential solution to this challenge is to dynamically adapt the Level of Detail
Hien, Tran Thuy; Huong, Truong Thu; Nguyen, Duc +2 more
core

Tile-Weighted Rate-Distortion Optimized Packet Scheduling for 360° VR Video Streaming

A key challenge of 360° VR video streaming is ensuring high quality with limited network bandwidth. Currently, most studies focus on tile-based adaptive bitrate streaming to reduce bandwidth consumption, where resources in network nodes are not ...
Dong, Haiwei +2 more
core +1 more source