
GenFedRL: a general federated reinforcement learning framework for deep reinforcement learning agents [PDF]

open access: yes, 2023
To solve the problem that intelligent devices equipped with deep reinforcement learning agents lack effective security data sharing mechanisms in the intelligent Internet of things, a general federated reinforcement learning (GenFedRL) framework was ...
Biao JIN   +4 more
core   +1 more source

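A survey of multi-agent reinforcement learning methods from a group perspective

open access: yes, 智能科学与技术学报, 2023
Multi-agent systems are a frontier research concept in distributed artificial intelligence. Traditional multi-agent reinforcement learning methods focus mainly on the emergence of group behavior, multi-agent cooperation and coordination, inter-agent communication, and opponent modeling and prediction, but they still face challenges such as partially observable environments, non-stationary opponent policies, high-dimensional decision spaces, and hard-to-interpret credit assignment. How to design multi-agent reinforcement learning methods that handle large numbers of agents and adapt to many different application scenarios is a frontier topic in the field. The paper first briefly reviews research progress in multi-agent reinforcement learning; it then surveys multi-type, multi-paradigm multi-agent learning methods from two perspectives, scalability and population adaptation, systematically covering set permutation invariance, attention mechanisms ...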
项凤涛, 罗俊仁, 谷学强, 苏炯铭, 张万鹏
doaj   +1 more source

Large-scale post-disaster user distributed coverage optimization based on multi-agent reinforcement learning [PDF]

open access: yes, 2022
In order to quickly restore emergency communication services for large-scale post-disaster users, a distributed intellicise coverage optimization architecture based on multi-agent reinforcement learning (RL) was proposed, which could address the ...
Fengyu WANG   +5 more
core   +1 more source

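Knowledge-enhanced policy-guided interactive reinforcement recommendation system

open access: yes, 大数据, 2022
Recommender systems are an important means of addressing information overload in social media. To address the inability of traditional recommender systems to optimize long-term user experience, researchers proposed interactive recommender systems and have tried using deep reinforcement learning to optimize the recommendation policy. However, reinforcement recommendation algorithms face problems such as sparse feedback, degraded user experience while learning from scratch, and a large item space. To address these problems, an improved knowledge-enhanced policy-guided interactive reinforcement recommendation model, KGP-DQN, is proposed. The model builds a behavior knowledge graph representation module that combines users' historical behavior with a knowledge graph to address sparse feedback; it also builds a policy initialization module that gives the reinforcement recommender an initial policy based on users' historical behavior, addressing the degraded user experience caused by learning from scratch ...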
张宇奇, 黄晓雯, 桑基韬
doaj   +1 more source

Edge intelligence-assisted routing protocol for Internet of vehicles via reinforcement learning [PDF]

open access: yes, 2023
To achieve a highly reliable and adaptive packet routing protocol in a complex urban Internet of vehicles, an end-edge-cloud edge intelligence architecture was proposed which consisted of an end user layer, an edge collaboration layer, and a cloud ...
Bingyi LIU   +5 more
core   +1 more source

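Short-term load forecasting based on a reinforcement self-organizing map and a radial basis function neural network [PDF]

open access: yes, 全球能源互联网, 2019
Radial basis function (RBF) neural networks are widely used in load forecasting because of their strong generalization ability and fast convergence. However, the traditional approach of training the RBF centers with K-means and self-organizing maps (SOM) has weak global search ability and remains prone to local optima, which severely limits improvements in RBF forecasting accuracy. To address this problem, a short-term load forecasting method using an RBF network improved by reinforcement learning (RL) is proposed. Reinforcement learning continually refines its search policy through feedback from the environment and has outstanding global search ability ...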
黄乾, 马开刚, 韦善阳, 黎静华
doaj   +1 more source

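Measuring collective intelligence aggregation in robot groups based on group entropy

open access: yes, 智能科学与技术学报, 2022
Group behavior can often produce value and complexity far beyond those of individual behavior. To derive group intelligence more effectively from individual intelligence, the level of group intelligence needs to be measured rigorously using group entropy, and group entropy should serve as a guiding objective that drives the enhancement and evolution of group intelligence. To address this important scientific problem, and taking a group of unmanned ground vehicles as the research object, a multi-agent soft Q-learning algorithm based on parameter sharing and group policy entropy is proposed. By sharing the agents' observations and incorporating maximum-entropy reinforcement learning, it achieves continual learning and updating of the group policy in exploration tasks. In addition, by defining group entropy as a measurement tool and characterizing how entropy changes during group learning, it enables quantitative analysis of the collective intelligence aggregation process.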
冯埔   +4 more
doaj   +1 more source

Multi-agent reinforcement learning based dynamic optimization algorithm of CRE offset for heterogeneous networks [PDF]

open access: yes, 2023
To cope with the high throughput demand caused by the proliferation of wireless network users, a multi-agent reinforcement learning based dynamic optimization algorithm of cell range expansion (CRE) offset was proposed for interference scenarios in macro-
Cheng ZHANG   +3 more
core   +1 more source

Reinforcement learning-based detection method for malware behavior in industrial control systems [PDF]

open access: yes, 2020
Due to the popularity of intelligent mobile devices, malware on the Internet has seriously threatened the security of industrial control systems. The increasing number of malware attacks has become a major concern in the information security community ...
Feng XIE   +7 more
core   +1 more source

Security decision method for the edge of multi-layer satellite network based on reinforcement learning [PDF]

open access: yes, 2022
Objectives: The multi-layer satellite network is an important component of space-ground integration technology. The purpose of this paper is to rely on the autonomous decision ability of satellite nodes to give full play to the processing and backhaul tasks ...
Chao GUO   +4 more
core   +1 more source
