Details

This is a hands-on workshop for total beginners in Natural Language Processing (NLP) who are already proficient in Python.

Potential audience: anyone interested in experimenting with the fascinating domain of NLP, whether you are a professional data scientist with no NLP experience or a non-developer (PM, bizdev, ...) who can code in Python.

Please register here: http://yarok-hok.com/events/nlp-workshop/

We will cover text classification in Python, using sklearn, spaCy, NLTK and pandas, through a challenge (and competition) on real data.

This event is hands-on. All attendees MUST come prepared with:

  1. A laptop with Python 3.6 installed (preferably Anaconda for Python 3)
  2. The data, downloaded from: http://goren.ml/pdnlp
  3. The repository, cloned from: https://github.com/urigoren/nlp_classification

Agenda (Short):

16:30 - 17:00 - Gathering
17:00 - 18:30 - Natural language processing introduction (3 short lectures)
18:30 - Hands-on workshop and competition start
21:00 - Competition end, 'and the winner is...'

Agenda (Long):

16:30-17:00 - Gathering
17:00 - 17:15 - Natural language processing intro

  • A brief overview of NLP tasks
  • Supervised tasks (Named entity recognition, Sentiment analysis, classification)
  • Unsupervised tasks (Text generation, machine translation, topic modeling)
  • Why is NLP harder in Hebrew
  • Overview of the data set

17:15 - 18:00 - From textual documents to vectors

  • Preprocessing (String manipulations and Defining word boundaries and tokens)
  • Word stems / lemmas
  • Identifying phrases
  • Generating custom vocabularies
  • Transforming a document to a vector
  • One-hot word encodings
  • Word vectors (word2vec, GloVe)
  • Combining word vectors into document vectors with the bag-of-words assumption

18:00 - 18:30 - Modelling in depth

  • Logistic regression for document classification
  • Naive Bayes modeling
  • Model evaluation
  • Training and testing
  • Metrics

18:30 - Hands-on workshop and competition start
20:30 - Competition end
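
For a taste of what the lectures lead up to, here is a minimal sketch, in plain Python with no third-party libraries, of turning documents into bag-of-words counts and classifying them with Naive Bayes. The toy sentences, labels and function names are made up for illustration and are not the workshop's data or code. In the workshop itself, libraries such as sklearn would take over the vectorizing and modelling steps.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace: a deliberately naive tokenizer."""
    return text.lower().split()


def train_naive_bayes(docs, labels):
    """Collect per-class word counts (bag of words) and per-class document counts."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    for doc, label in zip(docs, labels):
        word_counts[label].update(tokenize(doc))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocab


def predict(model, doc):
    """Return the class with the highest log-probability for the document."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # Log prior: share of training documents carrying this label.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(doc):
            if token not in vocab:
                continue  # skip words never seen during training
            # Add-one (Laplace) smoothing avoids log(0) for unseen class/word pairs.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data, invented for this sketch.
train_docs = [
    "great movie loved it",
    "wonderful acting great plot",
    "terrible movie hated it",
    "boring plot awful acting",
]
train_labels = ["pos", "pos", "neg", "neg"]

test_docs = ["loved the great acting", "awful boring movie"]
test_labels = ["pos", "neg"]

model = train_naive_bayes(train_docs, train_labels)
predictions = [predict(model, doc) for doc in test_docs]

# Accuracy: fraction of held-out documents classified correctly.
accuracy = sum(p == t for p, t in zip(predictions, test_labels)) / len(test_labels)
print(predictions, accuracy)
```

Keeping the test sentences separate from the training sentences mirrors the training and testing split listed in the agenda. Accuracy is only one of several possible metrics.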
