10e Journée Statistique et Informatique pour la Science des Données à Paris-Saclay

Collection 10e Journée Statistique et Informatique pour la Science des Données à Paris-Saclay

Organizer(s) Evgenii Chzhen, Erwan Le Pennec
Date(s) 01/04/2025 - 01/04/2025
Linked URL https://indico.math.cnrs.fr/event/14016/

Training Overparametrized Neural Networks: Early Alignment Phenomenon and Simplicity Bias

By Etienne Boursier

Despite compelling empirical evidence, the training of neural networks with first-order methods remains poorly understood in theory. Not only are neural networks believed to converge towards global minimisers, but the implicit bias of optimisation algorithms also makes them converge towards specific minimisers with good generalisation properties. This talk focuses on the early alignment phase that appears in the training dynamics of two-layer networks with small initialisations. During this phase, the many neurons align towards a small number of key directions, which induces sparsity in the number of neurons effectively represented. While this alignment phenomenon can cause convergence towards spurious local minima of the network parameters, such local minima can actually have good properties and yield much lower excess risk than any global minimiser of the training loss. In other words, early alignment can lead to a simplicity bias that helps minimise the test loss.

Information about the video

  • Date of recording 01/04/2025
  • Date of publication 10/04/2025
  • Institution IHES
  • Language English
  • Audience Researchers
  • Format MP4
