Long Short-Term Memory


Details
Welcome to the DC/NoVA Papers We Love meetup!
Papers We Love is an international organization centered around the appreciation of computer science research papers. There's so much we can learn from the landmark research that shaped the field and the current studies that are shaping our future. Our goal is to create a community of tech professionals passionate about learning and sharing knowledge. Come join us!
New to research papers? Watch The Refreshingly Rewarding Realm of Research Papers (https://www.youtube.com/watch?v=8eRx5Wo3xYA) by Sean Cribbs.
Ideas and suggestions are welcome–fill out our interest survey here (https://docs.google.com/forms/d/e/1FAIpQLSeJwLQhnmzWcuyodPrSmqHgqrvNxRbnNSbiWAuwzHwshhy_Sg/viewform) and let us know what motivates you!
// Tentative Schedule
• 7:00-7:15–Informal paper discussion
• 7:15-7:25–Introduction and announcements
• 7:25-8:40–Long Short-Term Memory
• 8:40-9:00–Informal paper discussion
// Directions
CustomInk Cafe (3rd Floor)
Mosaic District, 2910 District Ave #300
Fairfax, VA 22031
When you get here, you can come in via the patio. Don't be scared by the metal gate and sign. It's accessible via the outside stairs near True Food. There is a parking garage next door for those coming by vehicle, and there is a walkway to the patio on the 3rd floor of the garage nearest MOM's Organic Market.
Metro: The Dunn Loring metro station is about 0.7 miles from our meetup location. It’s very walkable, but if you’d prefer a bus, the 402 Southbound and 1A/1B/1C Westbound leave from Dunn Loring Station about every 5-10 minutes (see the schedule for a more detailed timetable).
If you're late, we totally understand–please still come! (Coming in via the patio is best.) Just be sure to slip in quietly if a speaker is presenting.
// Papers
This will be the second in my series of presentations on modern applications/theory of neural nets! This time around we'll focus on the original LSTM ("Long Short-Term Memory" neural nets) paper: http://www.bioinf.jku.at/publications/older/2604.pdf .
Last time ( https://www.meetup.com/Papers-We-Love-DC-NoVA/events/242020472/ ) we looked at embedding a form of rotational invariance directly into the parameter structure of image-specialized convolutional neural nets (CNNs), and took a quick look at some Python (based originally on this not-at-all-mine blog post: http://parneetk.github.io/blog/cnn-cifar10/ ) for training a CNN on an old but still informative data set of 50,000 or so images (CIFAR-10: https://www.cs.toronto.edu/~kriz/cifar.html) using Keras/Theano on a rented Amazon Web Services EC2 node.
This time we're switching application types and will be discussing the foundational paper on LSTMs (http://www.bioinf.jku.at/publications/older/2604.pdf). Where CNNs are specialized for the spatial patterns important in image classification, LSTMs are specialized to exploit structure in multivariate time series, and their design addresses a well-known problem in training deep and recurrent nets: the "vanishing gradient" problem. Prep time permitting, hopefully we (and I!) will learn a little more Keras/Theano/TensorFlow, too, and be able to look at another example of actually training an LSTM net!
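If you'd like a rough picture of what goes on inside an LSTM before the meetup, here's a minimal NumPy sketch (mine, not from the paper or any meetup materials) of a single LSTM step in the now-standard formulation; note the original 1997 paper uses input and output gates only, with the forget gate added in later work. The additive update to the cell state c is the trick that keeps error signals from vanishing across many time steps.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    # One time step of an LSTM cell with hidden size n and input size d.
    # W: (4n, d), U: (4n, n), b: (4n,) hold the stacked parameters for the
    # input gate, forget gate, output gate, and candidate cell update.
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b        # all four pre-activations at once
    i = sigmoid(z[0*n:1*n])           # input gate
    f = sigmoid(z[1*n:2*n])           # forget gate (not in the 1997 paper)
    o = sigmoid(z[2*n:3*n])           # output gate
    g = np.tanh(z[3*n:4*n])           # candidate cell update
    c = f * c_prev + i * g            # cell state: additive, so gradients persist
    h = o * np.tanh(c)                # hidden state / cell output
    return h, c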
For anyone interested in digging into the code we'll likely use a bit more, here's a free Keras blog-torial on building an LSTM: https://machinelearningmastery.com/multivariate-time-series-forecasting-lstms-keras/ . Most likely I'll start there and build an example we can look at together on an Amazon Web Services EC2 node with the Amazon machine image "Deep Learning AMI (Amazon Linux) Version 1.0 - ami-895adef3" and the p2.xlarge "GPU compute" instance type (so a GPU will be available; this sort of node costs about $0.90/hour if you want to do the same on your own). Happy to help folks get their own EC2 setup going with Keras if you'd like to dig your hands into some of the code before the discussion - just reply here or message me!
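As a rough preview of where we might start, here's a quick, hedged sketch of the kind of Keras model that tutorial builds up to - the layer sizes are guesses and the data below is random placeholder data rather than the tutorial's actual multivariate series, but the overall shape (a 3D array of samples x timesteps x features feeding an LSTM layer and a dense output) is the same.

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

timesteps, n_features = 10, 8                    # e.g. 10 past steps of 8 measurements
X = np.random.rand(1000, timesteps, n_features)  # placeholder multivariate series
y = np.random.rand(1000, 1)                      # placeholder target values

model = Sequential()
model.add(LSTM(50, input_shape=(timesteps, n_features)))  # 50 LSTM units
model.add(Dense(1))                                       # single regression output
model.compile(loss='mae', optimizer='adam')
model.fit(X, y, epochs=5, batch_size=72, validation_split=0.2, verbose=2)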
