Mathematics Colloquium
Tuesday, 14 February 2023, 14:00 - Room 36.08 - Building 36
Edouard Pauwels (Institut de Recherche en Informatique de Toulouse, Université Paul Sabatier)
Non-smooth algorithmic differentiation
Algorithmic Differentiation (AD) makes it possible to differentiate numerical programs efficiently, using automated implementations of the rules of differential calculus. Interest in AD has been renewed in the context of gradient-based optimization for machine learning applications, resulting in widely used numerical libraries (e.g., TensorFlow, PyTorch). I will first present an overview of the combination of AD with numerical optimization in the context of neural network training. Interestingly, AD is widely used in conjunction with non-smooth (i.e., non-differentiable) elementary functions, such as variants of the absolute value. Notions of generalized derivatives from non-smooth analysis do not comply with differential calculus, and the combination of AD with non-smooth objects produces artifacts without any variational interpretation. In other words, there is no mathematical model for central quantities used in non-smooth neural network training. Motivated by this question, we introduced a notion of conservativity for set-valued fields. It accounts for the spurious behavior of AD while maintaining non-trivial asymptotic optimization guarantees. These results rely on a differential inclusion approach to the asymptotic analysis of stochastic gradient algorithms and on a surprising connection between AD and a stratification property, which holds for the vast majority of machine learning models. Joint work with Jérôme Bolte.
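
The following minimal sketch (not part of the talk; it assumes PyTorch's usual convention that the backward pass of relu returns 0 at the origin) illustrates the kind of artifact referred to above: a program that computes the identity function, yet whose AD output at 0 is not a derivative, subgradient, or Clarke generalized derivative of the identity.

    import torch

    # f(x) = relu(x) - relu(-x) equals x for every real x,
    # so the true derivative is 1 everywhere, including at x = 0.
    x = torch.tensor(0.0, requires_grad=True)
    f = torch.relu(x) - torch.relu(-x)
    f.backward()
    print(x.grad)  # tensor(0.): AD reports 0 at x = 0, a value with no
                   # variational interpretation for the function x -> x.

Conservative set-valued fields, as discussed in the abstract, are meant to provide a calculus that accounts for such outputs while still supporting asymptotic guarantees for stochastic gradient methods.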