Using Category-Level Templates in Shape-from-Template
Type: Engineering school or master's degree
An active area of research in computer vision and Augmented Reality (AR) is interpreting the 3D world from camera images or videos. One of the current open challenges is how to do this when the scene is dynamic and non-rigid. There are two main approaches to this problem, both of which are being pioneered at our lab in ALCoV. The first assumes that we have a 3D model of the objects in the scene; the goal is then to determine their non-rigid shapes using cues from the images. This is called Shape-from-Template in the research literature (see our recent publications at http://flexable.u-clermont1.fr). The second approach solves the problem using only cues from the images, such as motion (this is known as non-rigid structure-from-motion). It is considerably harder and remains an open problem.
In this project the goal is to develop a Shape-from-Template method that uses a generic template taken from a library of Computer Assisted Design (CAD) models. For example, imagine that we have a generic 3D template of an animal such as a cat, and our goal is to obtain the shape of a specific cat from a YouTube video. The task involves adapting the CAD model to the shape of the cat using information from the video such as pixels, features and contours. There has already been some work on solving this type of problem with a generic template. An excellent paper (with accompanying videos) is “What shape are dolphins: Building 3D morphable models from 2D images”: http://www.cantab.net/users/tom.cashman. Although this work is impressive, it has some limitations. The goal of this project is to build on top of this work, using the provided code as a base platform, and to overcome one or more of its limitations. There are three limitations that can be tackled.
1) The method uses what is known as a linear shape basis model. This only works when the object changes shape in simple ways (such as the dolphin’s tail moving up and down), and does not work well for other objects such as animals with articulated limbs. You will use a more expressive model that can handle more complex deformations, such as the movement of an animal’s limbs. The model you will use is known as the ‘as-rigid-as-possible’ model (see http://www.igl.ethz.ch/projects/ARAP/arap_web.pdf for an example).
2) The method only uses the silhouette contour of the object. However, there is also texture information, from which it is possible to track features on the object’s surface and combine these with silhouette contours to obtain better shape estimates.
3) The method uses contours that are extracted manually from the images. A better approach is to eliminate this manual stage and have the model fit directly to strong edges detected in the images. We will show you how to do this robustly, which will make the process considerably more automatic.
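To give a rough idea of what the ‘as-rigid-as-possible’ model in limitation 1 measures, the sketch below computes an ARAP-style energy for a deformed mesh: each vertex neighbourhood is allowed its own best-fitting rotation, and only the remaining non-rigid residual is penalised. This is an illustration only, not the code base used in the project. The function names, the uniform edge weights (ARAP formulations commonly use cotangent weights instead) and the toy tetrahedron are all assumptions made for the example.

```python
import numpy as np

def best_rotation(rest_edges, deformed_edges):
    """Rotation R minimising sum_j ||d_j - R r_j||^2 over paired edge rows (SVD fit)."""
    S = rest_edges.T @ deformed_edges            # 3x3 covariance, sum_j r_j d_j^T
    U, _, Vt = np.linalg.svd(S)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    return Vt.T @ D @ U.T

def arap_energy(rest, deformed, neighbors):
    """ARAP-style energy with uniform edge weights (an assumption of this sketch).

    rest, deformed: (N, 3) arrays of vertex positions.
    neighbors: list where neighbors[i] holds the vertex indices adjacent to i.
    """
    energy = 0.0
    for i, nbrs in enumerate(neighbors):
        r = rest[nbrs] - rest[i]                 # edges around vertex i, rest pose
        d = deformed[nbrs] - deformed[i]         # the same edges, deformed pose
        R = best_rotation(r, d)
        energy += np.sum((d - r @ R.T) ** 2)     # residual after removing rotation
    return energy

# Toy mesh (hypothetical): a tetrahedron in which every vertex neighbours the others.
rest = np.array([[0.0, 0.0, 0.0],
                 [1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0],
                 [0.0, 0.0, 1.0]])
neighbors = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]

# A rigid motion (rotation about z plus a translation) should cost almost nothing.
c, s = np.cos(0.7), np.sin(0.7)
Rz = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
rigid = rest @ Rz.T + np.array([0.5, -1.0, 2.0])
rigid_energy = arap_energy(rest, rigid, neighbors)

# Stretching along x is not rigid, so it should be penalised.
stretched = rest * np.array([2.0, 1.0, 1.0])
stretch_energy = arap_energy(rest, stretched, neighbors)
```

Because rotations are estimated per vertex, a limb can rotate relative to the body at low cost, which is the kind of articulated motion that a linear shape basis struggles to represent.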
This project is an excellent opportunity to work on an open and interesting research problem in computer vision, and to do so in a world-leading group. Because you will work on top of an existing code base, you will be able to start making progress very early on, and a publication is a likely outcome of the internship.