This month Serge Belongie is visiting from UCSD to present "Visual Recognition with Humans in the Loop". Serge is in NYC because he's teaching Modern Analytics (CS 5785) at the Cornell NYC Tech (http://now.cornell.edu/nyctech/) campus in the spring. Here are the abstract and Serge's bio.
We present an interactive, hybrid human-computer method for object classification. The method applies to classes of problems that are difficult for most people, but are recognizable by people with the appropriate expertise (e.g., animal species or airplane model recognition). The classification method can be seen as a visual version of the 20 questions game, where questions based on simple visual attributes are posed interactively. The goal is to identify the true class while minimizing the number of questions asked, using the visual content of the image. Incorporating user input drives up recognition accuracy to levels that are good enough for practical applications; at the same time, computer vision reduces the amount of human interaction required. The resulting hybrid system is able to handle difficult, large multi-class problems with tightly related categories. We introduce a general framework for incorporating almost any off-the-shelf multi-class object recognition algorithm into the visual 20 questions game, and provide methodologies to account for imperfect user responses and unreliable computer vision algorithms. We evaluate the accuracy and computational properties of different computer vision algorithms and the effects of noisy user responses on a dataset of 200 bird species and on the Animals With Attributes dataset. Our results demonstrate the effectiveness and practicality of the hybrid human-computer classification paradigm.
Fig. 1: Illustration of the Visual 20 Questions process, in which automated algorithms determine category probabilities for the input image, after which the system automatically selects questions for the user based on a maximum expected information gain criterion.
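To make the question-selection step in the caption concrete, here is a minimal sketch of choosing a question by maximum expected information gain. It is an illustration under stated assumptions, not the authors' implementation: it assumes yes/no questions only, and the class names, the vision-derived prior, the per-class "yes" probabilities, and all function names (`entropy`, `posterior`, `expected_information_gain`, `select_question`) are made up for the example. Answer probabilities strictly between 0 and 1 stand in for imperfect user responses.

```python
import math

def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def posterior(prior, yes_prob, answer):
    """Bayes update of the class distribution after a yes/no answer.

    yes_prob[c] is the (hypothetical) probability that a user answers
    "yes" when the true class is c; values away from 0 and 1 model
    noisy answers.
    """
    likelihood = [y if answer else 1.0 - y for y in yes_prob]
    unnorm = [p * l for p, l in zip(prior, likelihood)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def expected_information_gain(prior, yes_prob):
    """Expected reduction in class entropy from asking one question."""
    p_yes = sum(p * y for p, y in zip(prior, yes_prob))
    gain = entropy(prior)
    for answer, p_answer in ((True, p_yes), (False, 1.0 - p_yes)):
        if p_answer > 0:
            gain -= p_answer * entropy(posterior(prior, yes_prob, answer))
    return gain

def select_question(prior, questions, asked=()):
    """Pick the not-yet-asked question with the largest expected gain."""
    candidates = [q for q in questions if q not in asked]
    return max(candidates,
               key=lambda q: expected_information_gain(prior, questions[q]))

# Illustrative inputs only (invented numbers, not from the talk).
classes = ["cardinal", "blue jay", "goldfinch"]
prior = [0.5, 0.3, 0.2]  # stand-in for computer-vision class probabilities
questions = {
    "mostly red?":  [0.90, 0.05, 0.05],
    "has a crest?": [0.90, 0.90, 0.10],
    "has wings?":   [0.95, 0.95, 0.95],  # same for every class: uninformative
}

best = select_question(prior, questions)
updated = posterior(prior, questions[best], answer=True)
print(best, dict(zip(classes, updated)))
```

Repeating the select-ask-update loop, with `updated` as the next prior and the asked question added to `asked`, gives the 20 questions style interaction: the vision-based prior means fewer questions are needed, and the user's answers sharpen the distribution where vision alone is unsure.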
Related Papers (http://vision.ucsd.edu/project/visipedia)
Serge Belongie was born in Sacramento, California. He received the B.S. degree (with honor) in Electrical Engineering from the California Institute of Technology in 1995 and the M.S. and Ph.D. degrees in Electrical Engineering and Computer Sciences (EECS) from U.C. Berkeley in 1997 and 2000, respectively. While at Berkeley, his research was supported by a National Science Foundation Graduate Research Fellowship. He is also a co-founder of Digital Persona, Inc., and the principal architect of the Digital Persona fingerprint recognition algorithm. He is currently a Professor in the Computer Science and Engineering Department at U.C. San Diego. His research interests include computer vision and pattern recognition. He is a recipient of the NSF CAREER Award and the Alfred P. Sloan Research Fellowship. In 2004, MIT Technology Review named him to its list of the 100 top young technology innovators in the world (TR100).