- Date: Dec 8th, 2014
- Venue: The same as the main conference
- Registration fee: Included in the main conference registration
Lecturers: Yaser Sheikh and Tomas Simon (CMU)
The geometric relationship between 2D image projections and underlying 3D scenes has been studied for several centuries in the fields of optics, photogrammetry, and, most recently, computer vision. This entire body of work rests on the assumption that the scene is stationary between the capture of images. Yet most imagery generated today comes from monocular cameras viewing dynamic scenes (such as social interactions). The fundamental constraint of 3D reconstruction (3D triangulation of 2D correspondences) and the theoretical edifice built upon it do not apply to such imagery, where both cameras and scenes are dynamic.
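To make the triangulation constraint concrete, here is a minimal sketch of standard linear (DLT) triangulation with NumPy. It is a textbook illustration rather than material from the tutorial itself: the two camera matrices, the 3D point, and the helper names `triangulate_dlt` and `project` are all assumptions chosen for the example. The second call shows what goes wrong when the point moves between the two captures.

```python
import numpy as np

def triangulate_dlt(projections, points_2d):
    """Linear (DLT) triangulation of one 3D point from two or more views.

    Assumes the point is static: every view observes the same 3D location.
    """
    rows = []
    for P, (u, v) in zip(projections, points_2d):
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

def project(P, X):
    """Project a 3D point with a 3x4 camera matrix and dehomogenize."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Hypothetical cameras: identity intrinsics, second camera centred at (1, 0, 0).
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X_true = np.array([0.5, 0.2, 4.0])

# Static scene: both views see the same point, so triangulation recovers it.
X_static = triangulate_dlt([P1, P2], [project(P1, X_true), project(P2, X_true)])

# Dynamic scene: the point moves before the second image is captured.
X_moved = X_true + np.array([0.3, 0.0, 0.0])
X_dynamic = triangulate_dlt([P1, P2], [project(P1, X_true), project(P2, X_moved)])

print("static estimate: ", X_static)
print("dynamic estimate:", X_dynamic)
```

In the static case the estimate matches the true point. In the dynamic case the two rays still intersect, but at a location that is neither the original nor the moved position, which is why additional motion models are needed.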
This tutorial will focus on the development of the computational foundation of the multi-view geometry of moving cameras and the definition of multilinear operators for dynamic scenes that relate image measurements across moving cameras. The tutorial will also survey the literature on dynamic 3D reconstruction and discuss important open questions for future research.
This problem differs qualitatively from the classical structure-from-motion problem in that it is fundamentally ill-posed: a particular point at a particular time instant is often seen in only one view. As a result, the two major thrusts of this line of research are: (1) the design of statistical dynamical models to condition the estimation, and (2) the characterization of the degenerate cases that abound in these models.
- Introduction: Multiple view geometry and dynamic scenes
- Trajectory and spatiotemporal models
- Optimization and future directions
Lecturer: Diego Thomas (NII)
The generation of fine 3D models from RGB-D (color plus depth) measurements is of great interest to the computer vision community. Although the 3D reconstruction pipeline has been widely studied over the past decades, a new era began with the advent of low-cost consumer depth cameras (called RGB-D cameras), such as the Microsoft Kinect and the Asus Xtion Pro, which capture RGB-D images at video rate. Making 3D measurements available to the public has brought its own revolution to the scientific community, with many projects and applications now using RGB-D cameras.
In this tutorial, we will give an overview of existing methods for 3D reconstruction with a single RGB-D camera, covering several 3D representations: point-based representations (surfels), implicit volumetric representations (TSDF, the truncated signed distance function), patch-based representations, and parametric representations. These scene representations are powerful tools for building virtual representations of the real world in real time from RGB-D cameras. They make it possible to reconstruct not only small-scale static scenes but also large-scale scenes and dynamic scenes. We will also discuss current trends in depth sensing and future challenges for 3D scene reconstruction.
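As a rough illustration of the implicit volumetric (TSDF) representation, the sketch below fuses several noisy depth readings along a single camera ray using a weighted running average, then recovers the surface at the zero crossing. It is a simplified 1D example under assumed values (voxel spacing, truncation distance, depth readings), and `integrate_tsdf` and `zero_crossing` are hypothetical helper names, not the API of any particular system covered in the tutorial.

```python
import numpy as np

def integrate_tsdf(tsdf, weight, voxel_depths, measured_depth, trunc):
    """Fuse one depth measurement into a 1D TSDF sampled along a camera ray."""
    sdf = measured_depth - voxel_depths      # positive in front of the surface
    valid = sdf >= -trunc                    # ignore voxels far behind the surface
    d = np.clip(sdf / trunc, -1.0, 1.0)      # truncate and normalize to [-1, 1]
    new_weight = weight + valid              # each valid observation adds weight 1
    fused = np.where(valid,
                     (tsdf * weight + d) / np.maximum(new_weight, 1),
                     tsdf)
    return fused, new_weight

def zero_crossing(tsdf, weight, voxel_depths):
    """Return the first observed positive-to-negative crossing, or None."""
    for i in range(len(tsdf) - 1):
        observed = weight[i] > 0 and weight[i + 1] > 0
        if observed and tsdf[i] > 0 >= tsdf[i + 1]:
            # Linear interpolation between the two neighbouring voxels.
            t = tsdf[i] / (tsdf[i] - tsdf[i + 1])
            return voxel_depths[i] + t * (voxel_depths[i + 1] - voxel_depths[i])
    return None

# Assumed setup: voxel centres every 1 cm between 0.5 m and 1.5 m along the ray.
voxel_depths = np.linspace(0.5, 1.5, 101)
tsdf = np.ones_like(voxel_depths)
weight = np.zeros_like(voxel_depths)

# Hypothetical noisy depth readings of a surface near 1 m.
for z in [0.98, 1.01, 1.02, 0.99]:
    tsdf, weight = integrate_tsdf(tsdf, weight, voxel_depths, z, trunc=0.05)

surface = zero_crossing(tsdf, weight, voxel_depths)
print("estimated surface depth:", surface)
```

Averaging the truncated distances rather than the raw depths is what lets a volumetric method denoise the surface incrementally, one frame at a time, without storing past measurements.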
- Putting everything together