Imaging and machine learning


Date(s) 03/05/2024

Towards demystifying over-parameterization in deep learning

By Mahdi Soltanolkotabi

Many modern learning models, including deep neural networks, are trained in an over-parameterized regime in which the number of parameters exceeds the size of the training dataset. Training these models involves highly non-convex loss landscapes, and it is not clear how methods such as (stochastic) gradient descent provably find globally optimal models. Furthermore, due to their over-parameterized nature, these networks in principle have the capacity to (over)fit any set of labels, including pure noise. Despite this high fitting capacity, somewhat paradoxically, neural network models trained via first-order methods continue to predict well on as-yet-unseen test data. In this talk I will discuss some results aimed at demystifying such phenomena in deep learning and in other domains, such as matrix factorization, by demonstrating that gradient methods enjoy a few intriguing properties: (1) when initialized at random, the iterates converge at a geometric rate to a global optimum; (2) among all global optima of the loss, the iterates converge to one with nearly minimal distance to the initial estimate, and do so by taking a nearly direct route; (3) the iterates are provably robust to noise/corruption/shuffling of a fraction of the labels, with these algorithms fitting only the correct labels and ignoring the corrupted ones. (This talk is based on joint work with Samet Oymak.)
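A minimal NumPy sketch (my illustration, not code from the talk) of property (2) in the simplest over-parameterized setting: linear regression with more parameters than samples. Here the loss has infinitely many global optima (interpolating solutions), and plain gradient descent initialized at zero converges to the one closest to the initialization, i.e. the minimum-norm interpolant. All names and hyperparameters below are illustrative choices.

```python
import numpy as np

# Over-parameterized linear regression: n samples, d parameters with d > n,
# so infinitely many weight vectors interpolate the data exactly.
rng = np.random.default_rng(0)
n, d = 20, 100
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Plain gradient descent on 0.5 * ||Xw - y||^2, initialized at the origin.
# The iterates stay in the row space of X, so GD converges (at a geometric
# rate) to the interpolating solution of minimum Euclidean norm -- the
# global optimum nearest the initialization.
w = np.zeros(d)
lr = 2e-3  # step size below 2 / sigma_max(X)^2 for this problem size
for _ in range(5000):
    w -= lr * X.T @ (X @ w - y)

# The minimum-norm interpolant, computed directly via the pseudoinverse.
w_min_norm = np.linalg.pinv(X) @ y

print(np.allclose(w, w_min_norm, atol=1e-5))   # GD found the nearest optimum
print(np.allclose(X @ w, y, atol=1e-5))        # and it interpolates the data
```

The talk's results extend this intuition beyond the linear case to non-convex models such as neural networks and matrix factorization, where the same "nearly direct route to a nearby global optimum" behavior is established.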

Information about the video

  • Date of recording: 04/04/2019
  • Date of publication: 10/05/2019
  • Institution: IHP
  • Language: English
  • Format: MP4
  • Venue: Institut Henri Poincaré





