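A survey of multi-agent reinforcement learning methods
Author:
Affiliation:

Author biography:

CHEN Renlong (陈人龙), male, born in 1995, Ph.D. candidate. Research interests: multi-agent reinforcement learning and swarm robotics. E-mail: reo@pku.edu.cn

Corresponding author:

CLC number:

TN915

Funding:

National Key Research and Development Program of China (2018AAA0102301); National Natural Science Foundation of China (62250037, 62276008, 62076010)
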
    Abstract:

    Multi-agent reinforcement learning (MARL) has shown strong potential for sequential decision-making problems in real-world settings such as autonomous driving and team-based cooperative games. It nevertheless faces several challenges: the curse of dimensionality, instability, multi-objectivity, and partial observability. This survey reviews the concepts and methods of MARL and summarizes the main trends and research directions in current work. The research trends are the centralized training with decentralized execution (CTDE) paradigm, agents equipped with recurrent neural units, and training techniques. The main research directions are hybrid learning methods, cooperative and competitive learning, communication and knowledge sharing, adaptability and robustness, hierarchical and modular learning, game-theoretic approaches, and interpretability. Future research directions include addressing the curse of dimensionality, solving large-scale combinatorial optimization problems, and analyzing the global convergence of MARL algorithms. Progress in these directions is expected to drive further breakthroughs in practical applications of MARL.

History
  • Received: 2023-02-22
  • Last revised: 2023-05-04
  • Accepted:
  • Published online: 2024-01-31
  • Publication date: