Imaging and machine learning

Collection Imaging and machine learning

Organizer(s)

Date(s) 14/05/2024

Designing multimodal deep architectures for Visual Question Answering

By Matthieu Cord

Multimodal representation learning for text and images has been studied extensively in recent years. One of the most popular tasks in this field today is Visual Question Answering (VQA). I will introduce this complex multimodal task, which aims at answering a question about an image. Solving it requires deep network models for both the visual and the textual input, and the high-level interactions between these two modalities have to be carefully designed into the model in order to provide the right answer. This projection from the unimodal spaces to a multimodal one is supposed to extract and model the relevant correlations between the two spaces. In addition, the model must be able to understand the full scene, focus its attention on the relevant visual regions, and discard information that is irrelevant to the question.
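
The pipeline described in the abstract (unimodal features projected into a shared multimodal space, then attention over visual regions) can be illustrated with a minimal NumPy sketch. This is not the architecture presented in the talk: the element-wise product fusion, the random weights, the dimensions, and all names (`fuse`, `vqa_forward`, `params`) are assumptions chosen only for illustration.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse(q, v, W_q, W_v):
    # Project each modality into a shared space and combine them
    # with an element-wise product (one simple fusion choice, assumed here).
    return np.tanh(W_q @ q) * np.tanh(W_v @ v)

def vqa_forward(q, regions, params):
    # 1. Score every visual region against the question.
    scores = np.array([
        params["w_att"] @ fuse(q, r, params["W_q"], params["W_v"])
        for r in regions
    ])
    # 2. Attention weights: a distribution over regions.
    alpha = softmax(scores)
    # 3. Attended visual vector: weighted sum of region features.
    attended = alpha @ regions
    # 4. Fuse again and classify over a fixed answer vocabulary.
    joint = fuse(q, attended, params["W_q"], params["W_v"])
    probs = softmax(params["W_out"] @ joint)
    return alpha, probs

# Hypothetical sizes: question dim, region dim, shared dim, regions, answers.
d_q, d_v, d, n_regions, n_answers = 8, 16, 12, 5, 10
rng = np.random.default_rng(0)
params = {
    "W_q": rng.normal(size=(d, d_q)),
    "W_v": rng.normal(size=(d, d_v)),
    "w_att": rng.normal(size=d),
    "W_out": rng.normal(size=(n_answers, d)),
}
q = rng.normal(size=d_q)                     # stand-in for a question embedding
regions = rng.normal(size=(n_regions, d_v))  # stand-in for region features

alpha, probs = vqa_forward(q, regions, params)
print("attention over regions:", np.round(alpha, 3))
print("predicted answer index:", int(probs.argmax()))
```

In a real system the question embedding would come from a text encoder and the region features from a visual network, and all weights would be learned; the sketch only shows how the attention weights and the fused representation relate to each other.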

Video information

Recording date 04/04/2019
Publication date 10/05/2019
Institute IHP
Language English
Format MP4
Location Institut Henri Poincaré

Field(s)

Computer science

All videos in the collection

41:31

published on 7 May 2019

Structured prediction via implicit embeddings

By Alessandro Rudi

55:19

published on 7 May 2019

A Kernel Perspective for Regularizing Deep Neural Networks

By Julien Mairal

45:58

published on 7 May 2019

Random Matrix Advances in Machine Learning

By Romain Couillet

51:54

published on 7 May 2019

Optimization meets machine learning for neuroimaging

By Alexandre Gramfort

45:48

published on 7 May 2019

Iterative regularization via dual diagonal descent

By Silvia Villa

38:49

published on 7 May 2019

Scalable hyperparameter transfer learning

By Valerio Perrone

45:58

published on 7 May 2019

Using structure to select features in high dimension

By Chloé-Agathe Azencott

44:30

published on 7 May 2019

Predicting aesthetic appreciation of images

By Naila Murray

45:57

published on 7 May 2019

Learning Representations for Information Obfuscation and Inference

By Guillermo Sapiro

48:01

published on 7 May 2019

An SDCA-powered inexact dual augmented Lagrangian method for fast CRF learning

By Guillaume Obozinski

30:02

published on 7 May 2019

Revisiting non-linear PCA with progressively grown autoencoders

By José Lezama

51:22

published on 9 May 2019

Combinatorial Solutions to Elastic Shape Matching

By Daniel Cremers

47:33

published on 7 May 2019

On the several ways to regularize optimal transport

By Marco Cuturi

42:20

published on 9 May 2019

Rank optimality for the Burer-Monteiro factorization

By Irène Waldspurger

50:06

published on 9 May 2019

Bayesian inversion for tomography through machine learning

By Ozan Öktem

48:30

published on 9 May 2019

Understanding geometric attributes with autoencoders

By Alasdair Newson

50:00

published on 9 May 2019

Statistical inference in high-dimension and application to medical imaging

By Bertrand Thirion

46:59

published on 9 May 2019

Deep Inversion, Autoencoders for Learned Regularization of Inverse Problems

By Christoph Brune

44:42

published on 9 May 2019

Optimal machine learning with stochastic projections and regularization

By Lorenzo Rosasco

51:43

published on 10 May 2019

Roto-Translation Covariant Convolutional Networks for Medical Image Analysis

By Remco Duits

52:34

published on 10 May 2019

Unsupervised domain adaptation with application to urban scene analysis

By Patrick Pérez

50:11

published on 10 May 2019

Designing multimodal deep architectures for Visual Question Answering

By Matthieu Cord

43:47

published on 10 May 2019

Towards demystifying over-parameterization in deep learning

By Mahdi Soltanolkotabi

41:40

published on 10 May 2019

Nonnegative matrix factorisation with the beta-divergence for robust hyperspectral unmixing

By Cédric Févotte

55:15

published on 10 May 2019

Autoencoder Image Generation with Multiscale Sparse Deconvolutions

By Stéphane Mallat

48:01

published on 10 May 2019

Learning from permutations

By Jean-Philippe Vert

43:31

published on 10 May 2019

Learned image reconstruction for high-resolution tomographic imaging

By Marta Betcke

36:16

published on 10 May 2019

Contextual Bandit: from Theory to Applications

By Claire Vernade

01:00:52

published on 10 May 2019

On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport

By Francis Bach

01:12:01

published on 9 May 2019

L’intelligence Artificielle est-elle Logique ou Géométrique ?

By Stéphane Mallat

Copyright Carmin.tv 2024