Multimodal text - Open Access .click

Auditory–Tactile Congruence for Synthesis of Adaptive Pain Expressions in RoboPatients

Advanced Robotics Research, EarlyView.
In this work, we explore auditory–tactile congruence for synthesizing adaptive vocal pain expressions in robopatients. Using a robopatient platform that integrates vocal pain sounds with palpation forces, we conducted 7680 trials across 20 participants.
Saitarun Nadipineni +4 more
wiley +1 more source

ChicGrasp: Imitation‐Learning‐Based Customized Dual‐Jaw Gripper Control for Manipulation of Delicate, Irregular Bio‐Products

Advanced Robotics Research, EarlyView.
Automated poultry processing lines still rely on humans to lift slippery, easily bruised carcasses onto a shackle conveyor. Deformability, anatomical variance, and hygiene rules make conventional suction and scripted motions unreliable. We present ChicGrasp, an end‐to‐end hardware‐software co‐designed imitation learning framework, to offer a ...
Amirreza Davar +8 more
wiley +1 more source

Universal Gripper for Industrial Manipulation With Enhanced Rigid Mechanics and Self‐Adaptable Fingers

Advanced Robotics Research, EarlyView.
An enhanced universal gripper combining rigid mechanics with self‐adaptable fingers is presented for industrial automation. The novel six‐bar linkage with integrated compliant pad eliminates mechanical interference while enabling passive shape adaptation.
Muhammad Usman Khalid +7 more
wiley +1 more source

A text classification method based on multimodal fusion enhancement

大数据 (Big Data)
Although multimodal text classification techniques have potential when applied to specific scenarios, there are still some limitations. Existing multimodal fusion models require modal alignment in the input data, resulting in a large amount of incomplete ...
Dezhi LIU, Liu HE, Youfeng LIU, Dechun HAN +3 more
doaj

Multimodal Transformer for Comics Text-Cloze

This work explores a closure task in comics, a medium where visual and textual elements are intricately intertwined. Specifically, Text-cloze refers to the task of selecting the correct text to use in a comic panel, given its neighboring panels. Traditional methods based on recurrent neural networks have struggled with this task due to limited OCR ...
Emanuele Vivoli +3 more
openaire +2 more sources

Origami‐Inspired Structural Design for Aquatic‐Terrestrial Amphibious Robots

Advanced Robotics Research, EarlyView.
This work presents a lightweight amphibious origami robot actuated by a single shape memory alloy wire. A rigid foldable origami structure with displacement amplification enables efficient terrestrial crawling and aquatic swimming. The addition of fan‐shaped units allows controllable turning in both environments.
Weiqi Liu +5 more
wiley +1 more source

FACTMS: Enhancing Multimodal Factual Consistency in Multimodal Summarization

Applied Sciences
Multimodal summarization (MS) generates text summaries from multimedia articles with textual and visual content. Therefore, MS can suffer from the multimodal factual inconsistency problem, where the generated summaries may distort or deviate from both ...
Mai Zhang, Hao Yan, Chaozhuo Li
doaj +1 more source

Multimodal Machine Learning for Automated ICD Coding

2019
This study presents a multimodal machine learning model to predict ICD-10 diagnostic codes. We developed separate machine learning models that can handle data from different modalities, including unstructured text, semi-structured text and structured ...
Band, Charlotte +11 more
core

Robotic Control for Human–Robot Collaborative Assembly Based on Digital Human Model and Reinforcement Learning

Advanced Robotics Research, EarlyView.
This work presents a robotic control method for human–robot collaborative assembly based on a biomechanics‐constrained digital human model. Reinforcement learning is used to generate physiologically plausible human motion trajectories, which are integrated into a virtual environment for robot control learning.
Bitao Yao +4 more
wiley +1 more source

Multimodal Human–Robot Interaction Using Human Pose Estimation and Local Large Language Models

Advanced Robotics Research, EarlyView.
A multimodal human–robot interaction framework integrates human pose estimation (HPE) and a large language model (LLM) for gesture‐ and voice‐based robot control. Speech‐to‐text (STT) enables voice command interpretation, while a safety‐aware arbitration mechanism prioritizes gesture input for rapid intervention.
Nasiru Aboki, Ilche Georgievski, Marco Aiello +2 more
wiley +1 more source
