AIRS in the AIR | Multi-Agent Reinforcement Learning

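In October, AIRS in the AIR invites leading scholars from China and abroad to give talks on machine learning and optimization methods and their applications. The third session of the series is themed "Multi-Agent Reinforcement Learning".
The first speaker, Chongjie Zhang, is an assistant professor at the Institute for Interdisciplinary Information Sciences, Tsinghua University. He has published many papers at top artificial intelligence and machine learning conferences such as NeurIPS, IJCAI, and AAAI.
The second speaker, Zongqing Lu, is an assistant professor in the School of Computer Science and a researcher at the Institute for Artificial Intelligence, Peking University. He serves on the technical program committees (TPC) of conferences such as NeurIPS, ICLR, CoRL, IJCAI, and AAMAS, and as a reviewer for journals such as Nature Machine Intelligence.
Register via this link: http://hdxu.cn/kcdMJ, or join through Zoom (https://us02web.zoom.us/meeting/register/tZIoceiuqzgjHdLy-QixX_KJbVxI3sKbuKK-) or Bilibili (http://live.bilibili.com/22587709).
Breathe some fresh air and learn about frontier technology! AIRS presents the event series AIRS in the AIR. Join us online every Tuesday to explore cutting-edge technologies, industry applications, and development trends in artificial intelligence and robotics.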
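- Executive Chair: Hongyuan Zha, X.Q. Deng Presidential Chair Professor at The Chinese University of Hong Kong, Shenzhen; Executive Dean of the School of Data Science; Director of the AIRS Machine Learning and Applications Center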
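- Host: Baoxiang Wang, Assistant Professor at the School of Data Science, The Chinese University of Hong Kong, Shenzhen; Associate Researcher at the AIRS Machine Learning and Applications Center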
- Speaker: Chongjie Zhang, Assistant Professor at the Institute for Interdisciplinary Information Sciences, Tsinghua University. Talk title: Cooperative Multi-Agent Reinforcement Learning with Factored Value Functions
Chongjie Zhang is an assistant professor and doctoral advisor at the Institute for Interdisciplinary Information Sciences, Tsinghua University. He received his Ph.D. in computer science from the University of Massachusetts Amherst in 2011 and then conducted postdoctoral research at the Massachusetts Institute of Technology. His current research interests lie mainly in artificial intelligence, reinforcement learning, and multi-agent systems.
Collaboration is indispensable for solving complex tasks. Learning to collaborate effectively is one of the key problems in artificial intelligence. Cooperative multi-agent reinforcement learning (MARL) potentially provides a promising solution, but faces two fundamental challenges: scalability and credit assignment. In this talk, I will discuss a MARL paradigm with factored value functions to address these challenges. I will first present a formal analysis of factored value learning, revealing its implicit credit assignment mechanism and its convergence and optimality properties. Inspired by these theoretical insights, I will then introduce two novel MARL methods, with linear and non-linear value factorization respectively, which achieve state-of-the-art performance. Building on factored MARL, I will also briefly discuss approaches for addressing other challenges of cooperative MARL, such as learning efficiency, partial observability, and exploration.
- Speaker: Zongqing Lu, Assistant Professor at the School of Computer Science, Peking University. Talk title: Fully Decentralized Multi-Agent Reinforcement Learning
Zongqing Lu is currently a BOYA assistant professor in the School of Computer Science at Peking University. He is also affiliated with the Institute of AI at Peking University and the Beijing Academy of Artificial Intelligence. His current research focuses on reinforcement learning and AI systems.
Most research on multi-agent reinforcement learning (MARL) focuses on the paradigm of centralized training with decentralized execution (CTDE). Another potential paradigm is fully decentralized learning, which is less investigated but better for robustness, scalability, and generalizability. However, current fully decentralized learning methods lack even a convergence guarantee. In this talk, I will present our recent studies on fully decentralized learning algorithms with convergence guarantees, including value-based, actor-critic, and model-based methods.
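Time | Session | Speaker and Title |
---|---|---|
9:00-10:00 | Keynote talk | Chongjie Zhang, Tsinghua University: Cooperative Multi-Agent Reinforcement Learning with Factored Value Functions |
10:00-11:00 | Keynote talk | Zongqing Lu, Peking University: Fully Decentralized Multi-Agent Reinforcement Learning |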