Asynchronous Parallel Algorithms for Big-Data Nonconvex Optimization

10.25394/PGS.8846183.v1 Loris Cannelli Asynchronous Parallel Algorithms for Big-Data Nonconvex Optimization Purdue University Graduate School 2019 Asynchronous Algorithms nonconvex optimization problems Machine Learning Linear Rate Automation and Control Engineering 2019-08-13 15:47:55 Thesis https://hammer.purdue.edu/articles/thesis/Asynchronous_Parallel_Algorithms_for_Big-Data_Nonconvex_Optimization/8846183
<div>The focus of this dissertation is to provide a unified and efficient solution method for an important class of nonconvex, nonsmooth, constrained optimization problems. Specifically, we are interested in problems where the objective function can be written as the sum of a smooth, nonconvex term and a convex, but possibly nonsmooth, regularizer. We also consider the presence of nonconvex constraints. This kind of structure arises in many large-scale applications, as diverse as information processing, genomics, machine learning, and image reconstruction.</div><div> </div><div> We design the first parallel, asynchronous algorithmic framework with convergence guarantees to stationary points of the class of problems under consideration. The method we propose is based on Successive Convex Approximation techniques; it can be implemented with both fixed and diminishing stepsizes; and it enjoys a sublinear convergence rate in the general nonconvex case and a linear convergence rate under strong convexity or under less stringent standard error bound conditions. The algorithmic framework we propose is abstract and general, and it can be applied to different computing architectures (e.g., message-passing systems, clusters of computers, shared-memory environments), always converging under the same set of assumptions. </div><div> </div><div> In the last chapter we consider the case of distributed multi-agent systems. Indeed, in many practical applications the objective function has a favorable separable structure. 
In this case, we generalize our framework to account for the presence of different agents, each of which knows only a portion of the overall function, which the agents want to minimize cooperatively. The result is the first fully decentralized asynchronous method for the setting described above. The proposed method achieves a sublinear convergence rate in the general case and a linear convergence rate under standard error bound conditions.</div><div> </div><div>Extensive simulation results on problems of practical interest (MRI reconstruction, LASSO, matrix completion) show that the proposed methods compare favorably to state-of-the-art schemes.</div>
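
To make the problem structure concrete, the following is a minimal sketch of one simple instance of the ideas named in the abstract, applied to LASSO (smooth term 0.5*||Ax - b||^2 plus the convex, nonsmooth regularizer lam*||x||_1). It is not the algorithm of the thesis. It assumes a linearized convex surrogate for each coordinate, a fixed stepsize `gamma`, and a serial emulation of asynchrony in which each update reads an iterate that may be a few steps out of date. All function names, parameters, and default values are hypothetical and chosen for illustration only.

```python
import random


def soft_threshold(v, t):
    """Proximal operator of t*|.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def lasso_objective(A, b, x, lam):
    """0.5*||Ax - b||^2 + lam*||x||_1, with A given as a list of rows."""
    residual = [sum(a * xj for a, xj in zip(row, x)) - bk
                for row, bk in zip(A, b)]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(v) for v in x)


def sca_block_lasso(A, b, lam, num_iters=2000, gamma=0.5, max_delay=2, seed=0):
    """Serial emulation of delayed coordinate updates (illustrative only).

    Assumes every column of A is nonzero, so each curvature constant is positive.
    """
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    # Curvature of the smooth term along coordinate i: ||A[:, i]||^2.
    curvature = [sum(A[k][i] ** 2 for k in range(m)) for i in range(n)]
    x = [0.0] * n
    history = [list(x)]  # recent iterates, newest last
    for _ in range(num_iters):
        i = rng.randrange(n)  # coordinate (block) to update
        # Emulate asynchrony: read an iterate that may be outdated.
        delay = rng.randint(0, min(max_delay, len(history) - 1))
        x_stale = history[-1 - delay]
        residual = [sum(A[k][j] * x_stale[j] for j in range(n)) - b[k]
                    for k in range(m)]
        grad_i = sum(A[k][i] * residual[k] for k in range(m))
        # Minimizer of the convex surrogate built at the stale iterate:
        # linearized smooth term + quadratic proximal term + lam*|x_i|.
        x_hat = soft_threshold(x_stale[i] - grad_i / curvature[i],
                               lam / curvature[i])
        # Move the current coordinate a fraction gamma toward x_hat.
        x[i] += gamma * (x_hat - x[i])
        history.append(list(x))
        if len(history) > max_delay + 1:
            history.pop(0)
    return x
```

With `max_delay=0` the sketch reduces to randomized proximal coordinate descent with a relaxation factor `gamma`; how large the delay and the stepsize may be while retaining convergence is exactly the kind of question the convergence analysis in the thesis addresses, and nothing in this sketch should be read as reproducing those conditions.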