Séminaire de Probabilités et Statistique
jeudi 01 décembre 2022 à 13:00 - Visio
Camille Castera
A second-order inertial algorithm for non-smooth non-convex large-scale optimization
Non-convex non-smooth optimization has attracted a lot of interest due to the efficiency of neural networks in many practical applications and the need to "train" them. Training amounts to solving very large-scale optimization problems. In this context, standard algorithms rely almost exclusively on inexact (sub-)gradients obtained through automatic differentiation and mini-batch sub-sampling. As a result, first-order methods (SGD, ADAM, etc.) remain the most widely used for training neural networks. Driven by a dynamical-system approach, we build INNA, an inertial Newtonian algorithm that exploits second-order information on the function using only first-order automatic differentiation and mini-batch sub-sampling. By analyzing the dynamical system and INNA together, we prove the almost-sure convergence of the algorithm to the critical points of the objective function. We also show that, despite its second-order nature, INNA is likely to avoid strict saddle points (formally, the limit is a local minimum with overwhelming probability). Practical considerations will be discussed, and some deep-learning experiments will be presented. This is joint work with Jérôme Bolte, Cédric Févotte and Édouard Pauwels.

Zoom link: https://umontpellier-fr.zoom.us/j/94087408185
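The abstract describes INNA only in words. As an illustration, here is a minimal Python sketch of a two-sequence inertial update of the kind described, which captures second-order behaviour while using only (possibly mini-batch) first-order gradients. The exact recursion, the hyperparameter names (alpha, beta, step size gamma), and the initialization of the auxiliary sequence are assumptions based on a first-order reformulation of an inertial Newton-type dynamical system, not taken from the talk itself.

```python
import numpy as np

def inna_step(theta, psi, grad, alpha, beta, gamma):
    # Assumed two-sequence update: `psi` is an auxiliary variable that
    # encodes inertia. Note that only a (mini-batch) gradient `grad` is
    # needed -- no Hessian -- which is how Newton-like behaviour is
    # obtained at first-order cost.
    common = (alpha - 1.0 / beta) * theta + psi / beta
    theta_new = theta - gamma * (common + beta * grad)
    psi_new = psi - gamma * common
    return theta_new, psi_new

def minimize(grad_fn, theta0, alpha=0.5, beta=0.1, gamma=0.05, iters=2000):
    theta = np.asarray(theta0, dtype=float)
    # Assumed convenient initialization: it makes the inertial term
    # vanish at the first iteration, so step 0 is a plain gradient step.
    psi = (1.0 - alpha * beta) * theta
    for _ in range(iters):
        theta, psi = inna_step(theta, psi, grad_fn(theta), alpha, beta, gamma)
    return theta
```

On a toy quadratic f(theta) = ||theta||^2 / 2 (gradient: theta itself), the iterates spiral into the unique minimizer at the origin. In the deep-learning setting of the abstract, `grad_fn` would instead return a mini-batch subgradient computed by automatic differentiation.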