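This work addresses the low quality and structural errors of images generated from a single text description. It adopts a multi-stage generative adversarial network and proposes interpolating between different text sequences, extracting features from multiple text descriptions to enrich the given description so that the generated images contain more detail. To generate images more relevant to the text, a multi-text deep attentional multimodal similarity model is introduced to obtain attention features, which are combined with the visual features of the previous layer as input to the next layer, improving both the realism of the generated images and their semantic consistency with the text descriptions. To let the model learn to coordinate the details at every position, a self-attention mechanism is introduced so that the generator produces images that better match real scenes. The optimized model on CUB and MS ...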
聂开琴, 倪郑威
doaj +1 more source
Modeling and Predicting Time Series with Non-stationarity and Volatility [PDF]
The difficulty of time series prediction lies in how to handle non-stationarity and volatility. When dealing with non-stationarity, existing deep learning models adopt a method of stabilizing the input sequences before training, which has problems of ...
FENG Qiang, ZHAO Jianguang, YANG Rong, NIU Baoning
core +1 more source
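Wildlife is an important component of ecosystems, and its dynamic monitoring is crucial for maintaining ecological balance, understanding interactions between species, and assessing ecosystem health. Wildlife monitoring relies mainly on UAV-mounted cameras and fixed infrared cameras to capture animals' natural behavior. However, because wildlife behavior is unpredictable, practical tracking often runs into small targets, multi-scale variation, and occlusion of the animal's body. To address these challenges, an animal target tracking method based on an improved Siamese network is proposed, which casts tracking as a similarity learning problem. A multi-head attention mechanism, comprising serial window self-attention and sliding-window self-attention operations, is introduced in the feature extraction stage of the Siamese relation network (SiamRN) ...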
殷子璇 +5 more
doaj
Review of Attention Mechanisms in Image Processing [PDF]
The attention mechanism has become one of the most popular and important techniques in deep learning for image processing, and it is widely used across image processing models because of its plug-and-play convenience.
QI Xuanhao, ZHI Min
core +1 more source
Semantic Segmentation Algorithm Based on Multi-Attention Mechanism and Cross-Feature Fusion [PDF]
Image semantic segmentation is widely used in defect detection, medical diagnosis, and unmanned driving. To address the common problems of existing semantic segmentation models, such as their high training costs, poor target contour segmentation, small ...
Li MIN, Bingjie DONG, Dong AN
core +1 more source
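With the rapid development of the Industrial Internet of Things and smart cities, security monitoring systems must process multimodal data such as text, images, video, and sensor readings. Existing cross-modal attention models face three major challenges: high computational complexity, weak awareness of unknown threats, and a lack of adaptive evolution capability. A self-evolving, efficient attention-based anomaly detection mechanism for multimodal data is proposed. It builds a hierarchical attention architecture, introduces convolution operations to strengthen local modeling, and uses a gating network to dynamically adjust the computation path. In addition, an evolution system based on knowledge distillation is designed so that the model can keep learning from new data. Experimental results show that the mechanism reduces computational complexity from quadratic to linear and improves the detection rate for unknown threats in anomaly detection tasks by about 30 ...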
郝明诗 +3 more
doaj
Multimodal Sentiment Analysis Based on Cross-Modal Semantic Information Enhancement [PDF]
With the development of social networks, people express their emotions through multiple modalities, including text, vision, and speech. In response to the failure of previous multimodal sentiment analysis methods to effectively obtain multimodal ...
LI Mengyun, ZHANG Jing, ZHANG Huanxiang, ZHANG Xiaolin, LIU Luyao
core +1 more source
Speech Enhancement Network Based on Parallel Multi-Attention [PDF]
Regarding the issue of the frequency-domain enhancement of speech affected by interference, a speech enhancement network based on a parallel multi-attention mechanism and an encoding and decoding structure, known as PMAN, is proposed.
ZHANG Chi, WANG Zhong, JIANG Tianhao, XIE Kangmin
core +1 more source
Operation standardization evaluation method based on improved YOLOv8n for ship equipment disassembly and assembly [PDF]
Objectives: The standardization of ship engine room operations is a critical component of ship safety management. Therefore, the practical examination for crew members includes the disassembly and assembly of ship equipment as a key assessment item.
Chao WU +4 more
core +1 more source
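To raise the level of intelligence of converter stations and make full use of the hardware capabilities of inspection robots, this paper proposes a DC bias voiceprint recognition method based on transformer sound information. The method handles transient and steady-state noise without a dedicated denoising algorithm. First, the transformer sound signal is analyzed. Second, to increase the weight of useful sound information, W-50FMCC features are used to represent the sound information in light of the characteristics of transformer sound signals. Third, a non-autoregressive end-to-end bias voiceprint recognition model is designed based on multi-head attention and a residual structure, and a channel compensation algorithm is used to improve feature orthogonality and obtain similarity scores to complete recognition. Finally, the method is verified experimentally. The experiments show that the method can accurately identify the DC bias condition directly from transformer acoustic signals, without a denoising algorithm.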
刘建华 +5 more
doaj

