Generative AI-enabled Mobile Tactical Multimedia Networks: Distribution, Generation, and Perception
Mobile multimedia networks (MMNs) demonstrate great potential in delivering low-latency and high-quality entertainment and tactical applications, such as short-video sharing, online conferencing, and battlefield surveillance.
Fang, Yuguang+6 more
core
Deep3DSketch+: Obtaining Customized 3D Model by Single Free-Hand Sketch through Deep Learning
As 3D models become critical in today's manufacturing and product design, conventional 3D modeling approaches based on Computer-Aided Design (CAD) are labor-intensive and time-consuming, and place high demands on creators.
Chen, Tianrun+5 more
core
User Digital Twin-Driven Video Streaming for Customized Preferences and Adaptive Transcoding
In the rapidly evolving field of multimedia services, video streaming has become increasingly prevalent, demanding innovative solutions to enhance user experience and system efficiency.
Berhane, Kalkidan+2 more
core
Anableps: Adapting Bitrate for Real-Time Communication Using VBR-encoded Video
Content providers increasingly replace traditional constant bitrate with variable bitrate (VBR) encoding in real-time video communication systems for better video quality.
Cao, Xun+3 more
core
Perceptual-oriented Learned Image Compression with Dynamic Kernel
In this paper, we extend our prior research named DKIC and propose the perceptual-oriented learned image compression method, PO-DKIC. Specifically, DKIC adopts a dynamic kernel-based dynamic residual block group to enhance the transform coding and an ...
Chen, Zhenzhong+3 more
core
Emotion Recognition in Conversations (ERC) is a popular task in natural language processing, which aims to recognize the emotional state of the speaker in conversations.
Dang, Jianwu+5 more
core
A Subjective Quality Evaluation of 3D Mesh with Dynamic Level of Detail in Virtual Reality
3D meshes are one of the main components of Virtual Reality applications. However, many network and computational resources are required to process 3D meshes in real-time. A potential solution to this challenge is to dynamically adapt the Level of Detail ...
Hien, Tran Thuy+2 more
core
Tile-Weighted Rate-Distortion Optimized Packet Scheduling for 360$^\circ$ VR Video Streaming
A key challenge of 360$^\circ$ VR video streaming is ensuring high quality with limited network bandwidth. Currently, most studies focus on tile-based adaptive bitrate streaming to reduce bandwidth consumption, where resources in network nodes are not ...
Dong, Haiwei+2 more
core +1 more source
Cross-Modality and Within-Modality Regularization for Audio-Visual DeepFake Detection
Audio-visual deepfake detection scrutinizes manipulations in public video using complementary multimodal cues. Current methods, which train on fused multimodal data for multimodal targets, face challenges due to uncertainties and inconsistencies in ...
Chen, Chen+5 more
core
Convex-hull Estimation using XPSNR for Versatile Video Coding
As adaptive streaming becomes crucial for delivering high-quality video content across diverse network conditions, accurate metrics to assess perceptual quality are essential.
Bross, Benjamin+4 more
core