Züri ML #32: Frontiers of Recurrent Neural Networks

Name: Züri ML #32: Frontiers of Recurrent Neural Networks
Start: 2017-03-30T18:30:00+02:00
End: 2017-03-30T21:30:00+02:00
Location: ETH Zurich, Main building (HG Building), E3

Hosted by Julian Z. and 2 others

Zurich Machine Learning and Data Science

Details

Multi-dimensional LSTM Networks for Image Analysis

Wonmin Byeon, ETH Zürich

Abstract: Long Short-Term Memory (LSTM) recurrent neural networks were initially introduced for one-dimensional sequence learning tasks such as handwriting and speech recognition. Their extension, the Multi-dimensional LSTM (MD-LSTM) network, recurs along more than one dimension and can therefore learn long-range context in multi-dimensional data such as images and videos. These networks learn directly from raw pixel values and take the complex spatial dependencies of each pixel into account.

In this talk, I will present two MD-LSTM models for 2D and 3D data: 2D-LSTM and Pyramid-LSTM. 2D-LSTM networks are applied directly to scene images and show improvement over other state-of-the-art methods. Pyramid-LSTM, a variant of MD-LSTM, rearranges the original architecture, which makes it easier to parallelize on GPUs and reduces the overall amount of computation. At the end, I will show pixel-wise segmentation of 3D biomedical volumetric images as an application of Pyramid-LSTM.

Reference: https://arxiv.org/abs/1506.07452
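
To illustrate how an LSTM can recur along two dimensions, here is a minimal sketch of a single 2D scan, written for this page and not taken from the talk or the referenced paper. It assumes one scalar unit per pixel, one forget gate per spatial direction, and a hypothetical weight layout (`params`, `scan_2d_lstm`, and `_pre` are illustrative names). Practical MD-LSTM models use vector-valued states and typically combine scans started from several image corners.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _pre(p, pixel, h_up, h_left):
    # Pre-activation shared by all gates: p = (w_pixel, w_up, w_left, bias).
    w_pixel, w_up, w_left, bias = p
    return w_pixel * pixel + w_up * h_up + w_left * h_left + bias


def scan_2d_lstm(image, params):
    """Scan an image from the top-left corner to the bottom-right corner.

    image:  list of rows of floats (raw pixel values).
    params: dict with keys "i", "f_up", "f_left", "o", "g" (assumed layout).
    Returns the grid of hidden states, same shape as the image.
    """
    rows, cols = len(image), len(image[0])
    h = [[0.0] * cols for _ in range(rows)]
    c = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            # Neighbours outside the image contribute a zero state.
            h_up = h[y - 1][x] if y > 0 else 0.0
            c_up = c[y - 1][x] if y > 0 else 0.0
            h_left = h[y][x - 1] if x > 0 else 0.0
            c_left = c[y][x - 1] if x > 0 else 0.0
            pix = image[y][x]
            i = sigmoid(_pre(params["i"], pix, h_up, h_left))
            f_up = sigmoid(_pre(params["f_up"], pix, h_up, h_left))
            f_left = sigmoid(_pre(params["f_left"], pix, h_up, h_left))
            o = sigmoid(_pre(params["o"], pix, h_up, h_left))
            g = math.tanh(_pre(params["g"], pix, h_up, h_left))
            # The cell mixes new input with memory from both directions.
            c[y][x] = i * g + f_up * c_up + f_left * c_left
            h[y][x] = o * math.tanh(c[y][x])
    return h


if __name__ == "__main__":
    params = {k: (0.5, 0.5, 0.5, 0.0) for k in ("i", "f_up", "f_left", "o", "g")}
    image = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    for row in scan_2d_lstm(image, params):
        print(["%.4f" % v for v in row])
```

In this scan direction, the hidden state at a pixel depends on every pixel above and to the left of it, which is how context from distant image regions reaches a given location.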

Making RNNs deep per time-step - Recurrent Highway Networks

Julian Zilly, ETH Zürich

Training Recurrent Neural Networks (RNNs) has historically been challenging because of the vanishing and exploding gradient problem. Long Short-Term Memory (LSTM) networks provide a solution to this challenge. However, LSTMs have commonly been "deep" only across many time steps and do not make use of multiple neural network layers per time step.

To extend LSTMs to greater network depth per time step, I will discuss Recurrent Highway Networks (RHNs), which use Highway layers to create recurrent networks with multiple neural network layers per time step. I will then highlight the state-of-the-art performance of RHNs in language modeling and the trade-offs involved in using them. Finally, I will outline future directions for deep RNNs.

Reference: https://arxiv.org/abs/1607.03474

GitHub: https://github.com/julian121266/RecurrentHighwayNetworks
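
The idea of stacking several Highway layers inside one time step can be sketched as follows. This is a minimal illustration written for this page, not the code from the repository above. It assumes the coupled-gate variant in which the carry gate equals one minus the transform gate, feeds the input only to the first layer, and uses hypothetical names (`init_layers`, `rhn_step`) and a hypothetical parameter layout.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def matvec(matrix, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def init_layers(input_size, hidden_size, depth, seed=0, gate_bias=-2.0):
    """Random parameters for `depth` highway layers (illustrative layout)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    layers = []
    for d in range(depth):
        layer = {
            "R_H": mat(hidden_size, hidden_size),
            "R_T": mat(hidden_size, hidden_size),
            "b_H": [0.0] * hidden_size,
            # A negative gate bias makes layers carry the state at first.
            "b_T": [gate_bias] * hidden_size,
        }
        if d == 0:  # only the first layer sees the input x
            layer["W_H"] = mat(hidden_size, input_size)
            layer["W_T"] = mat(hidden_size, input_size)
        layers.append(layer)
    return layers


def rhn_step(x, state, layers):
    """Apply all highway layers once: one time step of the recurrence."""
    s = list(state)
    for d, p in enumerate(layers):
        pre_h = [a + b for a, b in zip(matvec(p["R_H"], s), p["b_H"])]
        pre_t = [a + b for a, b in zip(matvec(p["R_T"], s), p["b_T"])]
        if d == 0:
            pre_h = [a + b for a, b in zip(pre_h, matvec(p["W_H"], x))]
            pre_t = [a + b for a, b in zip(pre_t, matvec(p["W_T"], x))]
        h = [math.tanh(z) for z in pre_h]   # candidate update
        t = [sigmoid(z) for z in pre_t]     # transform gate
        # Coupled carry gate: keep (1 - t) of the previous state.
        s = [hi * ti + si * (1.0 - ti) for hi, ti, si in zip(h, t, s)]
    return s


if __name__ == "__main__":
    layers = init_layers(input_size=3, hidden_size=4, depth=5)
    state = [0.0] * 4
    for x in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]):
        state = rhn_step(x, state, layers)
    print(state)
```

Here the recurrence depth per time step is simply the length of `layers`. Because each layer outputs a gated mix of its candidate update and the incoming state, a layer whose transform gate is near zero passes the state through almost unchanged.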
