AIRS in the AIR | Convergence Analysis of the Random Reshuffling (RR) Algorithm

Recently, two papers from the AIRS Machine Learning and Applications Center were accepted by IEEE Transactions on Signal Processing, a leading international journal in signal processing, and SIAM Journal on Optimization, a leading international journal in mathematical optimization.
In this episode of AIRS in the AIR, we invite the two authors to give a comprehensive walkthrough of their papers and discuss the convergence analysis of the Random Reshuffling (RR) algorithm.
The first speaker, Kun Huang, is a PhD student at the School of Data Science, The Chinese University of Hong Kong, Shenzhen, and previously served as a research assistant at the AIRS Machine Learning and Applications Center. He received his B.S. in Mathematics and Applied Mathematics from Tongji University in 2018 and his M.S. in Statistics from the University of Connecticut in 2020. His research interests include distributed optimization.
The second speaker, Junwen Qiu, is a PhD student at the School of Data Science, The Chinese University of Hong Kong, Shenzhen, and also previously served as a research assistant at the AIRS Machine Learning and Applications Center. He received his B.S. in Information Management and Information Systems from Jinan University in 2019. His research interests include nonconvex stochastic optimization.
Join the live stream on Bilibili (http://live.bilibili.com/22587709).
Breathe in fresh air and keep up with cutting-edge technology! AIRS in the AIR is a flagship event series from AIRS that explores frontier technologies, industrial applications, and development trends in artificial intelligence and robotics.
-
Executive Chair: Shi Pu, Assistant Professor at the School of Data Science, The Chinese University of Hong Kong, Shenzhen, and Associate Researcher at AIRS
-
Speaker: Kun Huang, PhD student, School of Data Science, The Chinese University of Hong Kong, Shenzhen
Talk: Distributed Random Reshuffling over Networks
Kun Huang received his B.S. in Mathematics and Applied Mathematics from the School of Mathematical Sciences, Tongji University, in 2018 and his M.S. in Statistics from the University of Connecticut in 2020. He is currently pursuing a PhD in Data Science at the School of Data Science, The Chinese University of Hong Kong, Shenzhen, and previously served as a research assistant at the AIRS Machine Learning and Applications Center. His research interests include distributed optimization. His work has been published in IEEE Transactions on Automatic Control.
In this paper, we consider distributed optimization problems where n agents, each possessing a local cost function, collaboratively minimize the average of the local cost functions over a connected network. To solve the problem, we propose a distributed random reshuffling (D-RR) algorithm that invokes the random reshuffling (RR) update in each agent. We show that D-RR inherits favorable characteristics of RR for both smooth strongly convex and smooth nonconvex objective functions. In particular, for smooth strongly convex objective functions, D-RR achieves an $\mathcal{O}(1/T^2)$ rate of convergence (where $T$ counts the epoch number) in terms of the squared distance between the iterate and the global minimizer. When the objective function is assumed to be smooth nonconvex, we show that D-RR drives the squared norm of the gradient to 0 at a rate of $\mathcal{O}(1/T^{2/3})$. These convergence results match those of centralized RR (up to constant factors) and outperform the distributed stochastic gradient descent (DSGD) algorithm if we run a relatively large number of epochs. Finally, we conduct a set of numerical experiments to illustrate the efficiency of the proposed D-RR method on both strongly convex and nonconvex distributed optimization problems.
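To make the scheme concrete, here is a minimal NumPy sketch of the D-RR update pattern on synthetic least-squares costs. The problem data, the ring topology, the mixing matrix `W`, and the step-size schedule are illustrative assumptions for this sketch, not the paper's exact setup: each agent reshuffles its own samples at the start of every epoch, takes a gradient step on one local component, and then averages with its neighbors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: n agents, m samples per agent, dimension d.
n, m, d = 8, 20, 5

# Synthetic strongly convex local components:
# f_{i,j}(x) = 0.5 * ||A[i, j] @ x - b[i, j]||^2.
A = rng.standard_normal((n, m, d, d))
b = rng.standard_normal((n, m, d))

# Mixing matrix for a ring network; symmetric with rows summing
# to one, hence doubly stochastic.
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] = 0.25
    W[i, (i + 1) % n] = 0.25

x = np.zeros((n, d))  # one local iterate per agent
T = 200               # number of epochs

for t in range(T):
    alpha = 1.0 / (t + 10)  # diminishing step size (illustrative schedule)
    # Each agent independently reshuffles its own sample indices.
    perms = [rng.permutation(m) for _ in range(n)]
    for j in range(m):
        # Component gradient of agent i's j-th sample in this epoch's order.
        g = np.stack([
            A[i, perms[i][j]].T @ (A[i, perms[i][j]] @ x[i] - b[i, perms[i][j]])
            for i in range(n)
        ])
        # Local gradient step followed by one round of neighbor averaging.
        x = W @ (x - alpha * g)

print("consensus error:", np.linalg.norm(x - x.mean(axis=0)))
```

The only difference from DSGD in this sketch is the sampling: within an epoch every local sample is visited exactly once, in a freshly permuted order, rather than being drawn independently with replacement.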
-
Speaker: Junwen Qiu, PhD student, School of Data Science, The Chinese University of Hong Kong, Shenzhen
Talk: Convergence of Random Reshuffling under the KL Inequality
Junwen Qiu received his B.S. in Information Management and Information Systems from Jinan University in 2019. He is currently pursuing a PhD in Data Science at the School of Data Science, The Chinese University of Hong Kong, Shenzhen, and previously served as a research assistant at the AIRS Machine Learning and Applications Center. His research interests include nonconvex stochastic optimization.
In this talk, we present novel convergence properties of the random reshuffling (RR) method for smooth nonconvex optimization problems with a finite-sum structure. Though this method is widely utilized in practice, e.g., in the training of neural networks, its convergence behavior is only understood in several limiting cases. Based on the Kurdyka-Łojasiewicz (KL) inequality, we verify that the whole sequence of iterates generated by RR is convergent and converges to a single stationary point of the problem in an almost sure sense. Corresponding rates of convergence are derived which depend on the KL exponent and suitably selected diminishing step sizes. When the KL exponent lies in $[0,\frac12]$, the convergence is at a rate of $\mathcal{O}(t^{-1})$ with $t$ counting the number of iterations. Since the standard KL inequality-based convergence framework applies only to algorithms with a certain descent property, we conduct a novel analysis for the non-descent RR method with diminishing step sizes, generalizing the standard KL framework and techniques.
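For readers unfamiliar with the KL property, the following LaTeX snippet spells out the inequality in its standard exponent form and restates the rate quoted above. The constant $C$, the exponent $\theta$, and the choice to express the rate via the iterate gap are our illustrative reading of the abstract, not the paper's exact statements.

```latex
% KL inequality with exponent theta at a stationary point x^*:
% near x^*, the function-value gap is controlled by the gradient norm.
\[
  |f(x) - f(x^\ast)|^{\theta} \;\le\; C\,\|\nabla f(x)\|
  \qquad \text{for all } x \text{ near } x^\ast,\quad
  \theta \in [0,1),\ C > 0.
\]
% A smaller theta means a sharper landscape around x^* and a faster rate.
% As quoted above, for theta in [0, 1/2] and suitably diminishing step
% sizes, the RR iterates x_t settle at the rate
\[
  \|x_t - x^\ast\| \;=\; \mathcal{O}(t^{-1}),
  \qquad t = \text{iteration count}.
\]
```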
| Time | Session | Speaker & Topic |
|---|---|---|
| 10:00-10:40 | Keynote Talk | Kun Huang, The Chinese University of Hong Kong, Shenzhen: Distributed Random Reshuffling over Networks |
| 10:40-11:30 | Keynote Talk | Junwen Qiu, The Chinese University of Hong Kong, Shenzhen: Convergence of Random Reshuffling under the KL Inequality |
Video Replay