Recurrent Neural Networks with Attention and why they are so hard to train
Dear Data Science Journal readers!
Please read the materials before the meetup. If you don't get a chance, feel free to come and join the discussion anyway.
Motivation: "Teaching Machines to Read and Comprehend" (http://arxiv.org/abs/1506.03340) by DeepMind.
We would like a math-and-code treatment of attention in neural networks and are looking for advice in the comments.
One idea: Recurrent Models of Visual Attention (http://gitxiv.com/posts/ZEobCXSh23DE8a8mo/recurrent-models-of-visual-attention)
- Pascanu et al., "On the difficulty of training Recurrent Neural Networks" (http://www.jmlr.org/proceedings/papers/v28/pascanu13.pdf)
Introductory blog posts, theses and IPython notebooks are very welcome in the comments.
Message to those on the waiting list: 20 more places will be released 2 days before the event.