Word Embedding: from theory to practice


Details
It is my pleasure to announce that we will have the next Stockholm NLP Meetup :-)
This meetup will be about "Word Embedding". The speakers will present a good mix of theory, including a touch of its history and existing methods, a specific focus on word2vec, applications of word embeddings, and plenty of hands-on code. There will be online streaming, and the event will be hosted by Gavagai.
Schedule:
• 18:00 - 18:20 Mingling
• 18:20 - 18:30 Welcoming, introduction
• 18:30 - 19:00 "Distributional semantics at Gavagai" by Magnus Sahlgren
• 19:00 - 19:10 Q&A
• 19:10 - 19:25 Break
• 19:25 - 19:45 "Word2vec from theory to practice" by Hendrik Heuer
• 19:45 - 19:50 Q&A
• 19:50 - 20:05 "Distributional semantic models in practice" by Jimmy Callin
• 20:05 - 20:10 Q&A
• 20:10 - 21:00 Mingling
Talk I:
Magnus Sahlgren will present a talk on "Distributional Semantics at Gavagai". Magnus is chief scientist and co-founder of Gavagai. His specialty is distributional semantics and word space models.
This talk gives a brief introduction to, and an overview of the history of, distributional semantics and word embeddings. Magnus will briefly cover models such as LSA, HAL, LDA, Random Indexing, word2vec, and GloVe. Moreover, he will go through some current research at Gavagai and show how Gavagai uses distributional semantic models in its production systems.
Talk II:
Hendrik Heuer will present a talk on "Word2vec from theory to practice". Hendrik is a master's student at KTH. His master's thesis focuses on word embeddings, specifically word2vec. In this presentation, Hendrik will share his experience and knowledge. The presentation will give an overview of how word2vec is implemented and how to use it in Python with gensim. word2vec is a neural-network-based tool released by Google for learning the relationships between words. When words are represented as points in space, the spatial distance between words reflects how similar those words are. Hendrik will discuss how to use this in practice and how to visualize the results using t-SNE. Moreover, he will go through the Python library spaCy for dependency-based word embeddings.
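To give a rough idea of the kind of workflow this talk covers, here is a minimal sketch of training word2vec with gensim and projecting the vectors with t-SNE. The toy corpus, hyperparameters, and plotted words are placeholders, not the examples Hendrik will use; parameter names follow gensim >= 4.0 (older versions use `size` instead of `vector_size`).

```python
# Minimal sketch: train word2vec with gensim, then visualize the vectors with t-SNE.
# The corpus and hyperparameters below are illustrative placeholders.
from gensim.models import Word2Vec
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# A toy corpus: in practice you would stream sentences from a real text collection.
sentences = [
    ["stockholm", "is", "a", "city", "in", "sweden"],
    ["gothenburg", "is", "a", "city", "in", "sweden"],
    ["word2vec", "learns", "vector", "representations", "of", "words"],
]

# Train a small model (gensim >= 4.0 parameter names).
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=50)

# Query the model: nearest neighbours in the embedding space.
print(model.wv.most_similar("stockholm", topn=3))

# Project all word vectors to 2D with t-SNE and plot them.
words = list(model.wv.index_to_key)
vectors = model.wv[words]
coords = TSNE(n_components=2, perplexity=5, random_state=0).fit_transform(vectors)

plt.scatter(coords[:, 0], coords[:, 1])
for word, (x, y) in zip(words, coords):
    plt.annotate(word, (x, y))
plt.show()
```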
Talk III:
Jimmy Callin will present "Distributional semantic models in practice". Jimmy has a background in Language Technology and works at Gavagai. He has published libraries related to distributional semantics on GitHub. In this short demo, Jimmy will present PyDSM, a small Python library for experimenting with distributional semantic models. Jimmy will go through different models and weighting functions to see how these affect the overall quality of the word representations, and perhaps take a closer look at a small case study.
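To give a flavour of what "weighting functions" means in this context, the sketch below builds a tiny count-based distributional model and applies PPMI weighting. It is a generic illustration of the technique, not PyDSM's API, and the toy corpus and window size are assumptions.

```python
# Illustrative sketch: a count-based distributional semantic model with PPMI weighting.
# Generic example only; it does not use PyDSM. The corpus is a placeholder.
import numpy as np

corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "the cat chased the dog".split(),
]

vocab = sorted({w for sent in corpus for w in sent})
index = {w: i for i, w in enumerate(vocab)}

# Co-occurrence counts within a symmetric window of 2 words.
counts = np.zeros((len(vocab), len(vocab)))
window = 2
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if i != j:
                counts[index[w], index[sent[j]]] += 1

# Positive pointwise mutual information: PPMI(w, c) = max(0, log p(w, c) / (p(w) p(c))).
total = counts.sum()
p_wc = counts / total
p_w = counts.sum(axis=1, keepdims=True) / total
p_c = counts.sum(axis=0, keepdims=True) / total
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log(p_wc / (p_w * p_c))
ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)

# Cosine similarity between two words under the PPMI-weighted model.
def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

print(cosine(ppmi[index["cat"]], ppmi[index["dog"]]))
```

Swapping PPMI for raw counts or another weighting scheme changes the similarity scores, which is exactly the kind of effect on representation quality the demo will explore.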
