
Gradient descent for wide two-layer neural networks

By Francis Bach

Appears in collection: Optimization for Machine Learning

Neural networks trained to minimize the logistic (a.k.a. cross-entropy) loss with gradient-based methods are observed to perform well in many supervised classification tasks. Towards understanding this phenomenon, we analyze the training and generalization behavior of infinitely wide two-layer neural networks with homogeneous activations. We show that the limits of the gradient flow on exponentially tailed losses can be fully characterized as a max-margin classifier in a certain non-Hilbertian space of functions.
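The abstract describes the setting only in words. Below is a minimal, self-contained sketch of that setting, not code from the talk: a wide two-layer ReLU network (a positively homogeneous activation) trained on the logistic loss for binary classification with plain full-batch gradient descent. The data, the width m, the step-size scaling, and the iteration count are all illustrative assumptions.

    # Illustrative sketch (not from the talk): wide two-layer ReLU network,
    # logistic loss, full-batch gradient descent. All constants are assumptions.
    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic binary classification data with labels in {-1, +1}.
    n, d = 200, 2
    X = rng.standard_normal((n, d))
    y = np.sign(X[:, 0] * X[:, 1])        # a simple nonlinear labelling rule
    y[y == 0] = 1.0

    # Two-layer network: f(x) = (1/m) * sum_j b_j * relu(a_j . x).
    m = 1000                               # large width, standing in for the infinite-width limit
    A = rng.standard_normal((m, d))        # hidden-layer weights a_j
    b = rng.standard_normal(m)             # output-layer weights b_j

    # Step size scaled with the width so the 1/m output scaling does not
    # freeze the dynamics (an illustrative choice).
    lr = 0.5 * m

    for step in range(2001):
        pre = X @ A.T                      # (n, m) pre-activations
        hidden = np.maximum(pre, 0.0)      # ReLU, positively homogeneous
        f = hidden @ b / m                 # network outputs
        margins = y * f
        # Logistic (cross-entropy) loss: mean of log(1 + exp(-y * f(x))).
        loss = np.mean(np.logaddexp(0.0, -margins))
        # dloss/df = -y * sigmoid(-y * f), averaged over samples;
        # sigmoid(-z) = 0.5 * (1 - tanh(z / 2)) is used for numerical stability.
        g = -y * 0.5 * (1.0 - np.tanh(margins / 2.0)) / n
        # Gradients for both layers.
        grad_b = hidden.T @ g / m
        grad_A = ((g[:, None] * (pre > 0.0)) * b[None, :]).T @ X / m
        b -= lr * grad_b
        A -= lr * grad_A
        if step % 500 == 0:
            acc = np.mean(np.sign(f) == y)
            print(f"step {step:4d}  loss {loss:.4f}  train accuracy {acc:.3f}")

This sketch only illustrates the training dynamics. The result stated in the abstract concerns the limit of the corresponding gradient flow: as training drives the exponentially tailed loss toward zero, the predictor is characterized as a max-margin classifier in a certain non-Hilbertian space of functions.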

Information about the video

Citation data

  • DOI 10.24350/CIRM.V.19622703
  • Cite this video BACH, Francis (12/03/2020). Gradient descent for wide two-layer neural networks. CIRM. Audiovisual resource. DOI: 10.24350/CIRM.V.19622703
  • URL https://dx.doi.org/10.24350/CIRM.V.19622703
