Accelerating Convergence of Large-scale Optimization Algorithms

This is a doctoral thesis from Stockholm: KTH Royal Institute of Technology

Abstract: Several recent engineering applications in multi-agent systems, communication networks, and machine learning deal with decision problems that can be formulated as optimization problems. For many of these problems, new constraints limit the usefulness of traditional optimization algorithms. In some cases, the problem size is much larger than what can be conveniently handled by standard solvers. In other cases, the problems have to be solved in a distributed manner by several decision-makers with limited computational and communication resources. By exploiting problem structure, however, it is possible to design computationally efficient algorithms that satisfy the implementation requirements of these emerging applications. In this thesis, we study a variety of techniques for improving the convergence times of optimization algorithms for large-scale systems.

In the first part of the thesis, we focus on multi-step first-order methods. These methods add memory to the classical gradient method and account for past iterates when computing the next one. The result is a computationally lightweight acceleration technique that can yield significant improvements over gradient descent. In particular, we focus on the Heavy-ball method introduced by Polyak. Previous studies have quantified its performance improvements over the gradient method through a local convergence analysis of twice continuously differentiable objective functions. However, the convergence properties of the method on more general convex cost functions have not been known. The first contribution of this thesis is a global convergence analysis of the Heavy-ball method for a variety of convex problems whose objective functions are strongly convex and have Lipschitz-continuous gradients. The second contribution is to tailor the Heavy-ball method to network optimization problems. In such problems, a collection of decision-makers collaborate to find the decision vector that minimizes the total system cost. We derive the optimal step-sizes for the Heavy-ball method in this scenario, and show how the optimal convergence times depend on the individual cost functions and the structure of the underlying interaction graph. We present three engineering applications where our algorithm significantly outperforms tailor-made state-of-the-art algorithms.
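To make the iteration concrete, the following is a minimal Python sketch of the Heavy-ball update, which augments the gradient step with a momentum term beta * (x_k - x_{k-1}). It is not code from the thesis: the step sizes use Polyak's classical tuning for strongly convex quadratics with curvature bounds mu and L, rather than the global or network-tailored parameters derived in the thesis, and the quadratic test problem is purely illustrative.

import numpy as np

def heavy_ball(grad_f, x0, L, mu, iters=500):
    # Polyak's Heavy-ball iteration:
    #   x_{k+1} = x_k - alpha * grad_f(x_k) + beta * (x_k - x_{k-1})
    # alpha and beta below are the classical choices for strongly convex
    # quadratics with curvature bounds mu and L (an assumption here).
    alpha = 4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2
    beta = ((np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))) ** 2
    x_prev = x = x0
    for _ in range(iters):
        x, x_prev = x - alpha * grad_f(x) + beta * (x - x_prev), x
    return x

# Usage: minimize the strongly convex quadratic 0.5*x'Ax - b'x.
rng = np.random.default_rng(0)
G = rng.standard_normal((20, 20))
A = G.T @ G + np.eye(20)          # positive definite Hessian
b = rng.standard_normal(20)
eigs = np.linalg.eigvalsh(A)
x = heavy_ball(lambda v: A @ v - b, np.zeros(20), L=eigs[-1], mu=eigs[0])
print(np.linalg.norm(A @ x - b))  # gradient norm, near zero at the optimum

Compared with plain gradient descent, the only extra state is the previous iterate, which is what makes the acceleration computationally lightweight.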
In the second part of the thesis, we consider the Alternating Direction Method of Multipliers (ADMM), another powerful method for solving structured optimization problems. The method has recently attracted significant interest from several engineering communities. Despite its popularity, its optimal parameters have been unknown. The third contribution of this thesis is to derive optimal parameters for the ADMM algorithm when applied to quadratic programming problems. Our derivations quantify how the Hessian of the cost function and the constraint matrices affect the convergence times. By exploiting this information, we develop a preconditioning technique that allows us to accelerate performance even further. Numerical studies of model-predictive control problems illustrate significant performance benefits of a well-tuned ADMM algorithm. The fourth and final contribution of the thesis is to extend our results on optimal scaling and parameter tuning of the ADMM method to a distributed setting. We derive optimal algorithm parameters and suggest heuristic methods that can be executed by individual agents using local information. The resulting algorithm is applied to the distributed averaging problem and shown to yield substantial performance improvements over state-of-the-art algorithms.
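For orientation, here is a minimal sketch of scaled-form ADMM applied to a box-constrained quadratic program, following the standard splitting from the ADMM literature. The box constraint, the fixed penalty parameter rho, and all variable names are illustrative assumptions; the thesis concerns how to choose such parameters optimally and how to precondition the problem, which this sketch does not attempt.

import numpy as np

def admm_box_qp(Q, q, lo, hi, rho=1.0, iters=300):
    # ADMM for  min 0.5*x'Qx + q'x  subject to  lo <= x <= hi,
    # via the splitting f(x) + g(z) with consensus constraint x = z,
    # where g is the indicator of the box. rho is a fixed penalty
    # parameter; choosing it well is what governs the convergence rate.
    n = q.size
    M = np.linalg.cholesky(Q + rho * np.eye(n))   # factor once, reuse
    x = z = u = np.zeros(n)
    for _ in range(iters):
        rhs = rho * (z - u) - q
        x = np.linalg.solve(M.T, np.linalg.solve(M, rhs))  # x-update
        z = np.clip(x + u, lo, hi)                         # z-update (projection)
        u = u + x - z                                      # scaled dual update
    return z

# Usage: a small box-constrained QP.
rng = np.random.default_rng(1)
P = rng.standard_normal((5, 5))
Q = P.T @ P + np.eye(5)
q = rng.standard_normal(5)
print(admm_box_qp(Q, q, lo=-1.0, hi=1.0))

Note that the x-update reduces to one linear solve with the fixed matrix Q + rho*I, so the penalty parameter enters every iteration; this is why its value, and any rescaling of Q and the constraints, has such a direct effect on convergence times.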
