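News recommendation systems play an important role in the dissemination of news through new media. This paper proposes a recommendation system based on deep reinforcement learning that combines the representational power of neural networks with the policy-selection ability of reinforcement learning to improve news recommendation. A dynamic action mask strengthens the model's judgment of users' short-term interests, an optimized buffer mechanism improves the efficiency with which the experience buffer is used, and a reward design with a region-masking property speeds up model training, thereby improving the system's performance in news recommendation. Experiments show that the proposed model achieves recommendation accuracy on news datasets comparable to mainstream neural-network recommendation methods and outperforms current state-of-the-art recommendation algorithms in ranking performance.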
董相宏, 安俊秀
doaj +1 more source
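The dynamic action mask (动态动作掩码) named in the abstract above is not specified in this snippet. As a rough illustration of the general idea only, the sketch below masks out some candidate items before a greedy choice over scores. The mask rule (hiding recently clicked items), the function names, and the scores are all hypothetical and are not taken from the paper.

```python
import math


def masked_greedy_action(q_values, mask):
    """Return the index of the best-scoring action among those the mask allows."""
    if len(q_values) != len(mask):
        raise ValueError("q_values and mask must have the same length")
    if not any(mask):
        raise ValueError("mask must allow at least one action")
    # Disallowed actions are scored as -inf so they can never win the argmax.
    masked = [q if allowed else -math.inf for q, allowed in zip(q_values, mask)]
    return max(range(len(masked)), key=masked.__getitem__)


def recently_clicked_mask(num_items, recent_clicks):
    """Hypothetical mask rule (an assumption): hide items the user clicked recently."""
    blocked = set(recent_clicks)
    return [i not in blocked for i in range(num_items)]


# Illustrative scores for four candidate news items (made-up numbers).
scores = [0.2, 0.9, 0.5, 0.7]
mask = recently_clicked_mask(len(scores), recent_clicks=[1])
best = masked_greedy_action(scores, mask)
print(best)
```

Because the mask is rebuilt from the user's recent behaviour at every step, the set of selectable actions changes over time, which is one plausible reading of "dynamic" here.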
ABSTRACT This study examines the paradoxical relationship between policy learning and capacity: governments need certain capacities to learn effectively, yet these same capacities often emerge from previous learning experiences. Through a comparative analysis of Hong Kong and Singapore's responses to SARS and COVID‐19, we demonstrate how policy ...
Shubham Sharma, Xun Wu, Gleb Papyshev
wiley +1 more source
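This paper studies the output synchronization problem of heterogeneous multi-agent systems through reinforcement learning. Based on the topology of the multi-agent system, a performance index and a value function that include neighbors' control inputs are defined. To overcome the drawback that existing control methods require a system model, a reinforcement learning algorithm based on system data is proposed, so that the output synchronization controller can also be applied when the model is unknown. In addition, adjusting the weight matrices in the value function can reduce each agent's control cost. Finally, a simulation example verifies the effectiveness of the method and the advantage of the defined value function.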
刘莹莹, 王占山
doaj
Summary Podocarpus pollen morphology is shaped by both phylogenetic history and the environment. We analyzed the relationship between pollen traits quantified using deep learning and environmental factors within a comparative phylogenetic framework.
Marc‐Élie Adaimé +4 more
wiley +1 more source
Advantage estimator based on importance sampling [PDF]
In continuous action tasks, deep reinforcement learning usually uses a Gaussian distribution as the policy function. To address the problem that the Gaussian distribution policy function slows down because actions are clipped, an importance sampling advantage ...
Quan LIU, Yubin JIANG, Zhihui HU
core +1 more source
Maritime mobile edge computing offloading method based on deep reinforcement learning [PDF]
The strong heterogeneity among the network nodes of the maritime information system brings complex, high-dimensional constraints to the optimization of task offloading in maritime mobile edge computing. The complex and diverse maritime applications also ...
Leilei MENG +3 more
core +1 more source
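In semi-supervised network representation learning, node labels provide important guidance for establishing the mapping of a network between different spaces. In many practical tasks, however, the available label information is limited or hard to obtain, so sufficient and effective supervision cannot be provided while learning low-dimensional network representations. To address this problem, a dual-channel semi-supervised network representation learning model is proposed. The model uses an autoencoder as its basic framework and consists of two information-passing channels, one self-supervised and one semi-supervised. The self-supervised signal and the label information guide the establishment of the network representation mapping in their respective channels, while complementing and reinforcing each other. Because the two channels may carry redundant information, a redundancy identification and elimination mechanism is designed from a mutual-information perspective. On this basis ...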
杜航原, 谢富中, 王文剑, 白亮
doaj +1 more source
Plant order‐level Sankey plot illustrating plant use of ethnolinguistic groups for key disease types based on organ systems and use. Each node represents the strength of its interaction or usage. ABSTRACT Many human populations rely on natural remedies for health and healing, with traditional medicinal plants playing a vital role in diverse ...
Krizler C. Tanalgo +15 more
wiley +1 more source
Research on power allocation of integrated VLPC based on deep reinforcement learning [PDF]
A power allocation scheme for an integrated visible light position and communication (VLPC) system based on deep reinforcement learning was proposed to achieve power allocation for integrated communication and positioning. First, the frame structure design of ...
Bing LI +7 more
core +1 more source
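The job shop environment is a highly dynamic, strongly coupled complex system. No scheduling algorithm can be trained once and used for life; computational experiments and progressive learning must be carried out in light of the shop environment, the operation process, and the operation tasks. For the job shop, a parallel shop scheduling model based on parallel systems theory is proposed, which achieves scheduling algorithm optimization and system evolution. For this model, the paper describes multi-agent-based modeling of the artificial shop scheduling system, a computational experiment method for shop scheduling tasks, and a parallel scheduling method for virtual and real systems, achieving closed-loop control and iterative optimization between the artificial system and the actual system. Finally, guided by the parallel shop scheduling model, the technical architecture and basic functions of a parallel shop scheduling system are designed.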
彭绍明, 熊刚, 沈震, 董西松, 曲之平, 付龙, 陶志坤, 韩云君
doaj +1 more source