This talk addresses the problem of understanding the visual content of images and videos using weak forms of supervision, such as the fact that multiple images contain instances of the same objects, or the textual information available in television or film scripts. I will discuss several instances of this problem, including image cosegmentation, the joint localization and identification of movie characters and their actions, and the assignment of action labels to video frames using temporal ordering constraints. I will present the underlying discriminative clustering model, appropriate relaxations of the combinatorial optimization problems associated with learning its parameters, and efficient algorithms for solving the corresponding convex optimization problems. I will also present experimental results on standard image benchmarks and feature-length films. I will conclude with a brief discussion of our recent work on fully unsupervised object discovery in photographs and videos.

Information about the video

  • Date of recording: 24/04/2017
  • Date of publication: 27/04/2017
  • Institution: IHES
  • Format: MP4
