Building Image-Classification Models that are Accurate and Efficient


Details
So happy to have Laurens van der Maaten from Facebook AI Research give the next talk. Laurens is doing amazing work, so come listen!
Abstract: Convolutional networks constitute the core of state-of-the-art approaches to a range of problems in computer vision. Typical networks comprise tens or even hundreds of layers of convolutions with learned filters, which require substantial computational and memory resources. In this talk, I will present two new convolutional-network architectures that are substantially more efficient than state-of-the-art residual networks (ResNets), whilst maintaining high predictive accuracy. The first architecture, called DenseNet, connects each layer to every other layer in a feed-forward fashion: each layer takes the feature maps of all preceding layers as input, and its own feature maps are used as inputs to all subsequent layers. Surprisingly, this substantially reduces the number of parameters the network needs to perform well, and makes learning easier. The second architecture, multi-scale DenseNets (MSDNets), extends DenseNets to maintain multi-scale image representations, which allows multiple classifiers to be trained at intermediate layers of the network. This allows us to train a single MSDNet that, at prediction time, dynamically decides the size of the network: for "easy" images, only a small part of the network is evaluated, whilst for "difficult" images, the full, high-quality network is evaluated. MSDNets achieve state-of-the-art performance on image-classification benchmarks at much lower computational cost.
This talk presents joint work with Gao Huang, Danlu Chen, and Kilian Weinberger of Cornell University.
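The dense connectivity pattern described in the abstract can be illustrated with a toy numpy sketch (not the actual DenseNet implementation, which uses learned 3x3 convolutions, batch normalization, and transition layers): each layer receives the channel-wise concatenation of all preceding feature maps and contributes a fixed number of new channels (the "growth rate"). The random 1x1 filter bank below is a hypothetical stand-in for a learned convolution.

```python
import numpy as np

def dense_block(x, num_layers, growth_rate, rng):
    """Toy DenseNet-style block: each layer sees the concatenation of the
    input and all preceding layers' outputs along the channel axis, and
    adds `growth_rate` new channels to the running feature list."""
    features = [x]  # list of (channels, height, width) arrays
    for _ in range(num_layers):
        inp = np.concatenate(features, axis=0)  # all prior feature maps
        # Stand-in for a learned convolution: a random 1x1 "filter bank".
        w = rng.standard_normal((growth_rate, inp.shape[0]))
        out = np.maximum(np.einsum('oc,chw->ohw', w, inp), 0.0)  # linear + ReLU
        features.append(out)
    return np.concatenate(features, axis=0)

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8))  # e.g. a tiny RGB "image"
y = dense_block(x, num_layers=4, growth_rate=12, rng=rng)
# Output channels grow linearly: 3 input + 4 layers * 12 growth = 51.
```

Note how each layer reuses all earlier feature maps rather than recomputing them, which is why the channel count grows only linearly with depth; MSDNets additionally attach classifiers at intermediate stages, so prediction can stop early once an intermediate classifier is confident enough.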
Bio: Laurens van der Maaten is a research scientist at Facebook AI Research in New York. Previously, he worked as an Assistant Professor at Delft University of Technology (The Netherlands) and as a post-doctoral researcher at the University of California, San Diego. He received his PhD from Tilburg University in 2009. He is an editorial board member of IEEE Transactions on Pattern Analysis and Machine Intelligence and serves as an area chair for the NIPS and ICML conferences. Laurens is interested in a variety of topics in machine learning and computer vision. Specific research topics include learning embeddings for visualization and deep learning, time series classification, regularization, object tracking, and cost-sensitive learning.
Here is a link to his publications: https://lvdmaaten.github.io/publications/
He's done wonderful work recently on Visual Question Answering (https://arxiv.org/abs/1606.08390) and visual reasoning (http://cs.stanford.edu/people/jcjohns/clevr/), and in the past on t-SNE and many other topics. You can find some of his work on arXiv as well: https://arxiv.org/find/cs/1/au:+Maaten_L/0/1/0/all/0/1
