Dec 04-06, 2020
- CUHK-Shenzhen, ZOOM Cloud Meeting
The 2020 Workshop on Optimization, Probability, and Simulation (WOPS20) is a single-stream hybrid event, held in parallel on-site and virtually, taking place on the campus of The Chinese University of Hong Kong, Shenzhen (CUHK-SZ) from December 4th to December 6th, 2020.
WOPS20 is an international workshop that aims to provide a platform for young and leading researchers from China and abroad to exchange ideas, explore opportunities for collaboration, and discuss recent research results.
The workshop is organized by the Shenzhen Institute of Artificial Intelligence and Robotics for Society (AIRS).
AIRS Principal Investigator, Assistant Professor in School of Data Science of The Chinese University of Hong Kong, Shenzhen
Welcome and Opening Speech
Professor Andre Milzarek is a principal investigator at AIRS and an assistant professor at the School of Data Science, CUHK-SZ. Before that, he was a postdoctoral researcher at the Beijing International Center for Mathematical Research at Peking University. He received his master's degree with honours and his doctoral degree in Mathematics from the Technical University of Munich, Germany, under the supervision of Professor Michael Ulbrich in 2013 and 2016, respectively.
His main research interests cover nonsmooth optimization, large-scale and stochastic optimization, and second-order methods and theory. He was invited to deliver talks at international conferences such as the SIAM Conference on Applied Linear Algebra and the International Symposium on Mathematical Programming, and has published in peer-reviewed journals. From 2010 to 2012 he was selected for the Max Weber Program of the State of Bavaria, and in 2017 he received the Boya Postdoctoral Fellowship at Peking University.
Prof. Andre Milzarek will give the welcome and opening speech of WOPS20.
Junior Professor (W1) for Mathematical Statistics, Georg-August-Universität Göttingen
Spectral Gap of Slice Sampling
Advanced academic qualifications
2011 Dr. rer. nat. in Mathematics, Friedrich-Schiller-Universität Jena, Advisor: E. Novak
Postgraduate professional career
since 2016 Junior Professor (W1) for Mathematical Statistics, Georg-August-Universität Göttingen
2016 Research Associate, Friedrich-Schiller-Universität Jena
2015–2016 Substitute Professor for Mathematical Economics, Technische Universität Chemnitz
2013–2015 Research Associate, Friedrich-Schiller-Universität Jena
2012–2013 Postdoctoral Research Fellow, The University of New South Wales, Sydney
2011–2012 Postdoctoral Researcher, Friedrich-Schiller-Universität Jena
We provide results on Wasserstein contraction of simple slice sampling for approximate sampling with respect to distributions with log-concave and rotationally invariant Lebesgue densities. This yields, in particular, an explicit quantitative lower bound on the spectral gap of simple slice sampling. Moreover, this lower bound carries over to more general target distributions, depending only on the volume of the (super-)level sets of their unnormalized density. This allows us to deduce convergence results for hybrid slice sampling approaches.
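As a toy illustration of the sampler under study (not the analysis of the talk), simple slice sampling for a one-dimensional log-concave target, here an unnormalized standard Gaussian chosen for the example, can be sketched as follows; the stepping-out/shrinkage bracketing is a standard practical stand-in for exact uniform sampling on the super-level set:

```python
import numpy as np

def slice_sampler(log_density, x0, n_samples, rng, w=1.0):
    """1-D slice sampler: draw a level uniformly under the density at the
    current point, then sample (approximately) uniformly from the
    super-level set via stepping-out and shrinkage."""
    x = x0
    samples = np.empty(n_samples)
    for i in range(n_samples):
        # Auxiliary level: log u with u ~ Uniform(0, density(x)).
        log_t = log_density(x) + np.log(rng.uniform())
        # Stepping-out: grow an interval until it brackets the slice.
        lo = x - w * rng.uniform()
        hi = lo + w
        while log_density(lo) > log_t:
            lo -= w
        while log_density(hi) > log_t:
            hi += w
        # Shrinkage: propose uniformly on [lo, hi], shrink on rejection.
        while True:
            x_new = rng.uniform(lo, hi)
            if log_density(x_new) > log_t:
                x = x_new
                break
            if x_new < x:
                lo = x_new
            else:
                hi = x_new
        samples[i] = x
    return samples

rng = np.random.default_rng(0)
draws = slice_sampler(lambda x: -0.5 * x ** 2, 0.0, 5000, rng)
```

For a log-concave target the slice is an interval, so the bracketing procedure above is exact up to the shrinkage approximation.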
Elliptical Slice Sampler and its Convergence
Assistant Professor in the Department of Mathematical Sciences at Seoul National University
Minimax Optimization in Machine Learning
Ernest Ryu is an assistant professor in the Department of Mathematical Sciences at Seoul National University. He is an affiliated faculty of the Interdisciplinary Program of Computational Science and Technology and the Graduate School of Data Science. His current research focus is on convex optimization and machine learning.
Professor Ryu received a BS degree in Physics and Electrical Engineering with honors from the California Institute of Technology in 2010, and an MS degree in Statistics and a PhD in Computational and Mathematical Engineering from Stanford University in 2016. In 2016, he joined the Department of Mathematics at the University of California, Los Angeles as an Assistant Adjunct Professor. In 2020 he joined the Department of Mathematical Sciences at Seoul National University as a tenure-track faculty member.
In this talk, we will discuss generative adversarial networks and convergence guarantees in stochastic "minimax" optimization under the assumption of convex-concavity. Specifically, we establish almost sure convergence and L2 convergence of the last iterates, and present the optimal accelerated complexity of minimizing gradients in the noiseless setting.
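As a toy illustration of last-iterate convergence in convex-concave minimax problems, consider the extragradient method (chosen here for the example; the talk covers a broader setting) on the bilinear objective f(x, y) = xy, where plain simultaneous gradient descent-ascent fails to converge:

```python
import numpy as np

def extragradient(grad_x, grad_y, x, y, lr=0.1, n_steps=2000):
    """Extragradient for min_x max_y f(x, y): take a gradient 'lookahead'
    step, then update using the gradients evaluated at the lookahead
    point. On bilinear problems this restores last-iterate convergence,
    which plain simultaneous descent-ascent lacks."""
    for _ in range(n_steps):
        x_h = x - lr * grad_x(x, y)   # lookahead descent step in x
        y_h = y + lr * grad_y(x, y)   # lookahead ascent step in y
        x = x - lr * grad_x(x_h, y_h)
        y = y + lr * grad_y(x_h, y_h)
    return x, y

# f(x, y) = x * y has its unique saddle point at (0, 0).
x_T, y_T = extragradient(lambda x, y: y, lambda x, y: x, 1.0, 1.0)
```

On this problem the naive descent-ascent iterates spiral outward, while the extragradient last iterate contracts toward the saddle point at every step.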
Technische Universität Bergakademie Freiberg
Noise-level Robust Monte Carlo Methods for Bayesian Inference with Informative Data
The Bayesian approach to inverse problems provides a rigorous framework for the incorporation and quantification of uncertainties in measurements, parameters, and models. However, sampling from or integrating w.r.t. the resulting posterior measure can become computationally challenging. In recent years, a lot of effort has been spent on deriving dimension-independent methods and on combining efficient sampling strategies with multilevel or surrogate methods in order to reduce the computational burden of Bayesian inverse problems.
In this talk, we are interested in designing numerical methods which are robust w.r.t. the size of the observational noise, i.e., methods which behave well in the case of concentrated posterior measures. Concentration of the posterior is a highly desirable situation in practice, since it corresponds to informative or large data. However, it can also pose a significant computational challenge for numerical methods based on the prior or a reference measure. We propose to employ the Laplace approximation of the posterior as the base measure for numerical integration in this context. The Laplace approximation is a Gaussian measure centered at the maximum a-posteriori estimate (MAPE), with covariance matrix depending on the Hessian of the log posterior density at the MAPE. We discuss convergence results for the Laplace approximation in terms of the Hellinger distance and analyze the efficiency of Monte Carlo methods based on it. In particular, we show that Laplace-based Metropolis-Hastings algorithms as well as Laplace-based importance sampling and quasi-Monte Carlo methods are robust w.r.t. the concentration of the posterior for large classes of posterior distributions and integrands, whereas prior-based Monte Carlo sampling methods are not.
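A minimal one-dimensional sketch of the Laplace-based importance sampling idea (the example posterior is deliberately Gaussian so the result is easy to check; the talk's setting is far more general):

```python
import numpy as np

def laplace_importance_sampling(neg_log_post, grad, hess, theta0, f, n_draws, rng):
    """Self-normalized importance sampling with the Laplace proposal:
    a Gaussian centered at the MAP estimate with variance 1/Hessian,
    so the proposal concentrates together with the posterior."""
    theta = theta0
    for _ in range(50):                     # Newton iteration for the MAP
        theta -= grad(theta) / hess(theta)
    sigma = 1.0 / np.sqrt(hess(theta))
    xs = theta + sigma * rng.standard_normal(n_draws)
    # log weight = log posterior - log proposal density (additive
    # constants cancel under self-normalization).
    log_w = -neg_log_post(xs) + 0.5 * ((xs - theta) / sigma) ** 2
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    return np.sum(w * f(xs))

# Toy "informative data" posterior: flat prior, n i.i.d. N(theta, 1)
# observations with sample mean ybar, so the posterior is N(ybar, 1/n)
# and concentrates as n grows.
n, ybar = 200, 0.7
post_mean = laplace_importance_sampling(
    lambda t: 0.5 * n * (t - ybar) ** 2,    # negative log posterior
    lambda t: n * (t - ybar),               # its gradient
    lambda t: float(n),                     # its Hessian
    0.0, lambda t: t, 4000, np.random.default_rng(1))
```

Because the proposal variance shrinks with 1/n, the effective sample size stays high no matter how concentrated the posterior becomes, which a fixed prior-based proposal cannot achieve.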
Yingjie (Tom) Fei
Postdoctoral Researcher at Northwestern University
Exponential Rates in Community Detection and Reinforcement Learning
Tom is a postdoctoral researcher at Northwestern University hosted by Jorge Nocedal. Previously, he received his PhD from Cornell University. His research interests lie at the intersection of machine learning, optimization, and statistics. His work was awarded 2nd place overall (and 1st place in the US) in the 2018 INFORMS George Nicholson Student Paper Competition.
In the first part of the talk, we will introduce the community detection problem under the stochastic block model. After discussing several popular algorithms, we will present an algorithm based on semidefinite programming (SDP). We show that despite being a relaxation, this algorithm achieves the optimal Bayes error rate in terms of distance to the target solution, which decays exponentially in the signal-to-noise ratio. Moreover, this algorithm is robust under the so-called semirandom model that frustrates many existing algorithms. Our proof involves a novel primal-dual analysis of SDP, which reveals that SDP tightly approximates the optimal (yet unimplementable) joint majority voting procedure.
In the second part of the talk, we will discuss the problem of risk-sensitive reinforcement learning under the episodic Markov decision process, in which an agent attempts to learn the optimal policy with respect to the entropic risk measure. We show that this problem admits a regret lower bound that scales exponentially in the risk sensitivity of the agent and the horizon length. We then propose an algorithm based on Q-learning and provide a regret upper bound for the algorithm that nearly matches the lower bound. Furthermore, the exponential regret is shown to represent a fundamental tradeoff between the risk-sensitivity of the agent and sample complexity of the algorithm.
Alibaba DAMO Academy, Seattle
Learning Prediction Intervals for Regression: Generalization and Calibration
Huajie Qian recently obtained his Ph.D. in Operations Research from Columbia University, advised by Henry Lam. His research borrows tools from statistics and machine learning to develop data-driven methodologies for stochastic simulation and optimization that can deal with uncertainties from data in an efficient and principled way. He is currently working at Alibaba DAMO Academy.
Accurate uncertainty and error quantification is crucial for building reliable machine learning systems and risk-based decision-making. Conventional methods, however, can suffer from over-conservativeness, inaccurate coverages and scalability issues. To overcome these, we study a recent framework to learn prediction intervals based on empirical constrained optimization that minimizes the average interval width while maintaining a correct coverage level. We establish statistical guarantees of this approach in relation to the complexity of our interval model. We also design a provably accurate validation procedure based on high-dimensional central limit theorems. We empirically demonstrate the strengths of our interval generation and calibration algorithms in terms of testing performances compared to existing benchmarks.
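As a minimal, self-contained instance of the "minimize width subject to a coverage constraint" idea (not the interval model or validation procedure of the talk), one can compute the narrowest empirical interval covering a requested fraction of calibration residuals:

```python
import numpy as np

def narrowest_interval(residuals, coverage=0.9):
    """Shortest empirical interval containing at least `coverage` of the
    calibration residuals: minimize interval width subject to a coverage
    constraint, solved exactly by scanning sorted order statistics."""
    r = np.sort(residuals)
    k = int(np.ceil(coverage * len(r)))     # points the interval must hold
    widths = r[k - 1:] - r[:len(r) - k + 1]
    i = int(np.argmin(widths))
    return r[i], r[i + k - 1]

rng = np.random.default_rng(4)
res = rng.standard_normal(2000)             # stand-in calibration residuals
lo, hi = narrowest_interval(res, coverage=0.9)
empirical_coverage = ((res >= lo) & (res <= hi)).mean()
```

The empirical coverage is at least 90% by construction; the statistical question studied in the talk is how such in-sample guarantees generalize to unseen data.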
Assistant Professor in the Faculty of Business and Economics, University of Hong Kong
Sample and Computationally Efficient Simulation Metamodeling in High Dimensions
Xiaowei Zhang is an assistant professor in the Faculty of Business and Economics at the University of Hong Kong. He received his PhD in Management Science and Engineering from Stanford University and his BS in Mathematics from Nankai University. His recent research interests include high-dimensional simulation and optimization.
Stochastic kriging has been widely employed for simulation metamodeling to predict the response surface of a complex simulation model. However, its use is limited to cases where the design space is low-dimensional, because the number of design points required for stochastic kriging to produce accurate prediction, in general, grows exponentially in the dimension of the design space. The large sample size results in both a prohibitive sample cost for running the simulation model and a severe computational challenge due to the need to invert large covariance matrices. Based on tensor Markov kernels and sparse grid experimental designs, we develop a novel methodology that dramatically alleviates the curse of dimensionality. We show that the sample complexity of the proposed methodology grows very mildly in the dimension, even under model misspecification. We also develop fast algorithms that compute stochastic kriging in its exact form without any approximation schemes. We demonstrate via extensive numerical experiments that our methodology can handle problems with a design space of hundreds of dimensions, improving both prediction accuracy and computational efficiency by a significant margin relative to typical alternative methods in practice.
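For context, a basic stochastic kriging predictor with a Gaussian kernel (a simple illustrative choice, not the tensor Markov kernels of the talk) can be sketched as follows; the n-by-n linear solve is exactly the bottleneck that limits standard stochastic kriging in high dimensions:

```python
import numpy as np

def stochastic_kriging_predict(X, y, noise_var, X_new, length=0.5):
    """Basic stochastic kriging / noisy GP prediction with a Gaussian
    kernel: k_*^T (K + diag(noise_var))^{-1} y. The n x n covariance
    solve is the computational bottleneck as n grows."""
    def kern(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * length ** 2))
    K = kern(X, X) + np.diag(noise_var)     # noisy covariance matrix
    return kern(X_new, X) @ np.linalg.solve(K, y)

rng = np.random.default_rng(5)
X = rng.uniform(-2, 2, size=(80, 1))                      # design points
y = np.sin(2 * X[:, 0]) + 0.1 * rng.standard_normal(80)   # noisy sim outputs
pred = stochastic_kriging_predict(X, y, 0.01 * np.ones(80), np.array([[0.5]]))
```

In one dimension 80 design points suffice; the point of the talk is that a naive extension of this predictor would need exponentially many points as the dimension grows.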
PhD Candidate in the School of Mathematics and Physics at the University of Queensland
Stability Analysis of Newton-MR under Hessian Perturbations
Yang Liu is a PhD candidate in the School of Mathematics and Physics at the University of Queensland (UQ), supervised by Fred Roosta. He is a member of the ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS). He holds a UQ Research Training Scholarship and received the 2020 UQ Tutor Award and the 2020 UQ Student Publication Award.
Recently, the stability of Newton-CG under Hessian perturbations, i.e., inexact curvature information, has been extensively studied. Such stability analysis has subsequently been leveraged in designing variants of Newton-CG in which, to reduce the computational costs involving the Hessian matrix, the curvature is suitably approximated. Here, we do the same for Newton-MR, which extends Newton-CG in the same manner that MINRES extends CG, but can be applied beyond the traditional convex settings, to invex problems. Unlike the stability analysis of Newton-CG, which relies on spectrum-preserving perturbations in the sense of the Löwner partial order, our work draws on matrix perturbation theory to estimate the distance between the underlying exact and perturbed subspaces. Numerical experiments demonstrate a great degree of stability for Newton-MR, amounting to a highly efficient algorithm for large-scale problems.
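A rough sketch of the flavor of Newton-MR (my simplification, using a dense pseudoinverse in place of MINRES and a simple backtracking scheme): each step takes the minimum-residual solution of the Newton system, which remains well defined even for singular or indefinite Hessians:

```python
import numpy as np

def newton_mr(grad, hess, x, n_iters=50, tol=1e-8):
    """Newton-MR flavor: use the minimum-residual (pseudoinverse)
    solution of the Newton system, well defined even when the Hessian
    is singular or indefinite, with backtracking on the gradient norm."""
    for _ in range(n_iters):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        p = -np.linalg.pinv(hess(x)) @ g      # min-residual Newton direction
        t, m0 = 1.0, np.linalg.norm(g) ** 2
        while np.linalg.norm(grad(x + t * p)) ** 2 > m0 and t > 1e-8:
            t *= 0.5                           # backtrack on ||grad||^2
        x = x + t * p
    return x

# Toy invex-style problem with a singular Hessian: f(x) = (x0 + x1 - 1)^2
# is minimized on the entire line x0 + x1 = 1.
x_star = newton_mr(lambda x: 2 * (x[0] + x[1] - 1) * np.ones(2),
                   lambda x: 2 * np.ones((2, 2)),
                   np.array([3.0, -1.0]))
```

Newton-CG would stall on this rank-deficient system, whereas the minimum-residual direction moves straight to the solution set.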
Anthony Man-Cho So
Professor in the Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong
Non-Convex Orthogonal Group Synchronization via the Generalized Power Method
Anthony Man-Cho So received his BSE degree in Computer Science from Princeton University with minors in Applied and Computational Mathematics, Engineering and Management Systems, and German Language and Culture. He then received his MSc degree in Computer Science and his PhD degree in Computer Science with a PhD minor in Mathematics from Stanford University. Dr. So joined The Chinese University of Hong Kong (CUHK) in 2007. He is now Professor in the Department of Systems Engineering and Engineering Management. His research focuses on optimization theory and its applications in various areas of science and engineering, including computational geometry, machine learning, signal processing, and statistics.
Dr. So was appointed an Outstanding Fellow of the Faculty of Engineering at CUHK in 2019. He has received a number of research and teaching awards, including the 2018 IEEE Signal Processing Society Best Paper Award, the 2015 IEEE Signal Processing Society Signal Processing Magazine Best Paper Award, the 2014 IEEE Communications Society Asia-Pacific Outstanding Paper Award, the 2010 INFORMS Optimization Society Optimization Prize for Young Researchers, and the 2013 CUHK Vice-Chancellor's Exemplary Teaching Award.
The group synchronization problem calls for the estimation of a ground-truth vector from the noisy relative transforms of its elements, where the elements come from a group and the relative transforms are computed using the binary operation of the group. Such a problem provides an abstraction of a wide range of inverse problems that arise in practice. However, in many instances, one needs to tackle a non-convex optimization formulation. It turns out that for synchronization problems over certain subgroups of the orthogonal group, a simple projected gradient-type algorithm, often referred to as the generalized power method (GPM), is quite effective in finding the ground-truth when applied to their non-convex formulations. In this talk, we survey the related recent results in the literature and focus in particular on the techniques for analyzing the statistical and optimization performance of the GPM.
This talk covers joint works with Huikang Liu, Peng Wang, Man-Chung Yue, and Zirui Zhou.
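As a concrete toy instance (Z2 synchronization, a special case of orthogonal group synchronization), the GPM alternates a power step with projection onto the group; the spectral initialization below is one common choice made for this example, not necessarily that of the surveyed works:

```python
import numpy as np

def generalized_power_method(Y, x0, n_iters=50):
    """GPM sketch for Z2 synchronization: each iteration is a power step
    (multiply by the data matrix Y) followed by projection back onto
    the group, here the sign vectors {-1, +1}^n."""
    x = x0
    for _ in range(n_iters):
        x = np.sign(Y @ x)
        x[x == 0] = 1.0        # break ties deterministically
    return x

rng = np.random.default_rng(2)
n, sigma = 100, 1.0
z = rng.choice([-1.0, 1.0], size=n)                   # ground-truth signs
W = rng.standard_normal((n, n))
Y = np.outer(z, z) + sigma * (W + W.T) / np.sqrt(2)   # noisy pairwise data
x0 = np.sign(np.linalg.eigh(Y)[1][:, -1])             # spectral initialization
x_hat = generalized_power_method(Y, x0)
corr = abs(x_hat @ z) / n          # agreement up to a global sign flip
```

Despite the non-convex feasible set, each iteration is a cheap matrix-vector product plus a projection, which is what makes the GPM attractive at scale.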
University of Bristol
Quasi-Stationary Monte Carlo Methods via Stochastic Approximation
A new class of Monte Carlo algorithms, quasi-stationary Monte Carlo (QSMC), designed for exact Bayesian inference on large datasets, was recently presented in [Pollock et al. (2020), JRSSB]. In my talk I will describe this class of methods and the associated probabilistic concept of quasi-stationarity, which concerns the distribution of an extant stochastic population. I will present an alternative approach to QSMC based on stochastic approximation, rather than interacting particle systems, which is conceptually simpler and can be significantly easier to implement, while retaining the desirable properties of exactness and scalability.
Joint work with Murray Pollock, Gareth Roberts, David Steinsaltz.
Assistant Professor in the IEOR Department, Columbia University
Functional Inequalities of Infinite Swapping Algorithm: Theory and Application
Wenpin Tang is currently an assistant professor in the IEOR Department at Columbia University. He works at the intersection of probability, statistics, machine learning, and financial technology.
Sampling Gibbs measures at low temperature is a very important task but computationally very challenging. Numerical evidence suggests that the infinite-swapping algorithm (ISA) is a promising method. The ISA can be seen as an improvement over the widely used replica-exchange methods. We rigorously analyze the ergodic properties of the ISA in the low-temperature regime, deducing Eyring-Kramers formulas for the spectral gap (or Poincaré constant) and the log-Sobolev constant. Our main result shows that, compared to the classical over-damped Langevin dynamics, the effective energy barrier can be reduced drastically using the ISA. As a corollary, we derive a deviation inequality showing that sampling is also improved by an exponential factor. Furthermore, we analyze simulated annealing for the ISA and show that the ISA is again superior to the over-damped Langevin dynamics. This is joint work with Georg Menz, André Schlichting, and Tianqi Wu.
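For reference, the over-damped Langevin baseline against which the ISA is compared can be sketched on a double-well potential, where the energy barrier at the origin is precisely what slows the dynamics down (a toy illustration, not the ISA itself):

```python
import numpy as np

def ula(grad_U, x0, step, n_steps, rng):
    """Unadjusted over-damped Langevin dynamics targeting exp(-U):
    x_{k+1} = x_k - step * grad U(x_k) + sqrt(2 * step) * xi_k."""
    x, traj = x0, np.empty(n_steps)
    for k in range(n_steps):
        x = x - step * grad_U(x) + np.sqrt(2 * step) * rng.standard_normal()
        traj[k] = x
    return traj

# Double-well U(x) = (x^2 - 1)^2: to equilibrate, the chain must cross
# the barrier at x = 0, which is the slow event at low temperature.
traj = ula(lambda x: 4 * x * (x ** 2 - 1), 1.0, 1e-2, 100000,
           np.random.default_rng(3))
visited_both_wells = bool((traj > 0.5).any() and (traj < -0.5).any())
```

At this moderate temperature both wells are visited quickly; the Eyring-Kramers formulas quantify how the crossing time blows up exponentially as the temperature decreases, which is the regime the ISA improves.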
Faculty Member in the School of Mathematics and Physics at the University of Queensland
Reproducing Stein Kernel Approach for Post-Hoc Correction of Approximate Sampling Algorithms
Fred Roosta is a data science faculty member in the School of Mathematics and Physics at the University of Queensland (UQ). He is also a research scientist at the International Computer Science Institute (ICSI) in Berkeley, California. In addition, he holds an associate investigator position with the ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS). In 2018, he was awarded the Discovery Early Career Researcher Award (DECRA) by the Australian Research Council for his research on second-order optimization for machine learning. Prior to joining UQ, he was a post-doctoral fellow in the Department of Statistics and Department of Computer Science at the University of California, Berkeley, where he was supervised by Professor Michael Mahoney. He obtained his PhD from the University of British Columbia in 2015 under the supervision of Professor Uri Ascher.
We discuss a post-hoc rejection-free method of obtaining a consistent estimator for quantities related to a target distribution of interest by using samples obtained from an ergodic Markov chain with an arbitrary stationary distribution. The approach involves Stein importance sampling, based on minimization of the kernelized Stein discrepancy, and is shown to be valid under certain conditions on the mixing of the chain. To demonstrate the practical implications of the method, we show these conditions are satisfied for a large number of unadjusted samplers. We also conduct a numerical study showing that the method is super-efficient.
|2020.12.04 17:00 – 17:10||Welcome and Opening Speech||Andre Milzarek||CUHK-Shenzhen, ZOOM|
|2020.12.04 17:10 – 17:55||Spectral Gap of Slice Sampling||Daniel Rudolf||CUHK-Shenzhen, ZOOM|
|2020.12.04 17:55 – 18:25||Elliptical Slice Sampler and its Convergence||Viacheslav Natarovskii||CUHK-Shenzhen, ZOOM|
|2020.12.04 18:25 – 19:00||Snack Break||—||CUHK-Shenzhen, ZOOM|
|2020.12.04 19:00 – 19:45||Minimax Optimization in Machine Learning||Ernest Ryu||CUHK-Shenzhen, ZOOM|
|2020.12.04 19:45 – 20:30||Noise-level Robust Monte Carlo Methods for Bayesian Inference with Informative Data||Björn Sprungk||CUHK-Shenzhen, ZOOM|
|2020.12.05 09:15 – 10:00||Exponential Rates in Community Detection and Reinforcement Learning||Yingjie (Tom) Fei||CUHK-Shenzhen, ZOOM|
|2020.12.05 10:00 – 10:45||Learning Prediction Intervals for Regression: Generalization and Calibration||Huajie Qian||CUHK-Shenzhen, ZOOM|
|2020.12.05 10:45 – 11:15||Coffee Break||—||CUHK-Shenzhen, ZOOM|
|2020.12.05 11:15 – 12:00||Sample and Computationally Efficient Simulation Metamodeling in High Dimensions||Xiaowei Zhang||CUHK-Shenzhen, ZOOM|
|2020.12.05 12:00 – 14:00||Lunch||—||Pandora, Student Center, CUHK-Shenzhen|
|2020.12.05 14:00 – 14:45||Stability Analysis of Newton-MR under Hessian Perturbations||Yang Liu||CUHK-Shenzhen, ZOOM|
|2020.12.05 14:45 – 15:30||Non-Convex Orthogonal Group Synchronization via the Generalized Power Method||Anthony Man-Cho So||CUHK-Shenzhen, ZOOM|
|2020.12.05 15:30 – 16:00||Coffee Break and Group Photo||—||CUHK-Shenzhen, ZOOM|
|2020.12.05 16:00 – 16:45||Quasi-Stationary Monte Carlo Methods via Stochastic Approximation||Andi Wang||CUHK-Shenzhen, ZOOM|
|2020.12.06 09:15 – 10:00||Functional Inequalities of Infinite Swapping Algorithm: Theory and Application||Wenpin Tang||CUHK-Shenzhen, ZOOM|
|2020.12.06 10:00 – 10:45||Reproducing Stein Kernel Approach for Post-Hoc Correction of Approximate Sampling Algorithms||Fred Roosta||CUHK-Shenzhen, ZOOM|
|2020.12.06 10:45 – 18:00||Individual Meetings||CUHK-Shenzhen, ZOOM|