Past Meetup

In-depth Machine Learning (Natural Language Processing)


81 people went


About the Meet-up:

Wow. We're more than halfway through 2017 and, by popular demand, the R Users and Machine Learning Johannesburg Meet-up is back!

This meet-up will focus in depth on Machine Learning for Natural Language Processing. We'll make another announcement when the speaker slots are posted in the next 2 weeks.

You're not going to want to miss this one! Our previous meet-up drew 60+ RSVPs, so please RSVP early.

Note that the original date of the 9th had to be changed because it falls on National Women's Day.


Microsoft Campus in Bryanston


Primary Speakers Line Up (Announced 01 August!)

Nick Martin (ML @ Overscore AI)

An Introduction to Unstructured Text Retrieval

The majority of human knowledge is captured in unstructured text, e.g. books, blogs, tweets, research papers etc. The task of retrieving relevant documents given some information need is non-trivial. The goal of this talk is to start you on the path to building your own basic search engine.

Talking points:

An overview of Natural Language Processing and Information Retrieval

Basics of NLP: how do computers understand human language?

Basics of IR: how do machines order documents?

The data question: what do you have to feed your models?

State-of-the-art techniques (Machine Learning & Deep Learning)

If you have a text-only corpus of documents:

- Simple unstructured retrieval algorithms

- State-of-the-art techniques (Machine Learning & Deep Learning)

- Simple relevance feedback models
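As a rough illustration of what a "simple unstructured retrieval algorithm" over a text-only corpus can look like, here is a minimal Python sketch that ranks documents against a query using TF-IDF weights and cosine similarity. It is not taken from the talk: the function names, the smoothed IDF formula and the toy corpus are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Return (TF-IDF vector per document, IDF table) for a list of strings."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumed variant) so every term keeps a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(tokens).items()} for tokens in tokenized]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, docs, vectors, idf, top_k=3):
    """Return up to top_k (document, score) pairs with a non-zero score, best first."""
    # Query terms that never occur in the corpus are ignored.
    q_counts = Counter(t for t in tokenize(query) if t in idf)
    q_vec = {t: c * idf[t] for t, c in q_counts.items()}
    scored = [(cosine(q_vec, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(docs[i], score) for score, i in scored[:top_k] if score > 0.0]


# Toy corpus, purely illustrative.
corpus = [
    "the cat sat on the mat",
    "dogs chase cats in the park",
    "search engines rank documents by relevance",
]
doc_vectors, idf_table = build_index(corpus)
for doc, score in search("rank documents", corpus, doc_vectors, idf_table):
    print(round(score, 3), doc)
```

Note that there is no stemming here, so "cat" and "cats" are treated as different terms; a real search engine would normally add that kind of normalisation.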

Xander Horn (Tracker Connect)

MLR Package Introduction for Models and NLP

Embarking on a machine learning project can be a daunting task: with data extraction, accurate problem statements, data cleaning, feature engineering, model training and evaluation, there are a lot of things to keep in mind. mlr addresses the last two topics by offering a well-established machine learning environment in which models can be trained. In addition to that, it offers simplicity in terms of:

You will learn to do the following:

- Train a wide variety of ML models on the data at hand using parallel computing

- Access model hyperparameters and create a large tuning space for them

- Choose the appropriate performance measure for the problem

- Tune models using resampling strategies and tune control algorithms according to your performance metric of choice

- Select the best-performing model based on validation results (no overfitting on the training set) and select the best-performing variables in the dataset

- Use the trained model to predict on new data

- Get a gentle introduction to bagging and stacking (if time allows; otherwise theory only)
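To give a feel for the tuning workflow listed above, here is a small Python sketch of picking a hyperparameter by cross-validation. It is not mlr code and does not use the mlr API. It assumes a toy k-nearest-neighbours classifier, accuracy as the performance measure, and a plain grid search as the tuning strategy. All function names and the synthetic data are hypothetical.

```python
import random
from collections import Counter


def knn_predict(train, point, k):
    """Majority vote among the k nearest training rows (squared Euclidean distance)."""
    nearest = sorted(
        train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], point))
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_val_accuracy(data, k, folds=5):
    """Mean accuracy of k-NN over `folds` interleaved cross-validation splits."""
    scores = []
    for f in range(folds):
        # Every folds-th row (starting at f) is held out; the rest is used for training.
        test = data[f::folds]
        train = [row for i, row in enumerate(data) if i % folds != f]
        correct = sum(knn_predict(train, x, k) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / folds


def tune(data, grid, folds=5):
    """Grid search: score every candidate k and return (best k, all scores)."""
    results = {k: cross_val_accuracy(data, k, folds) for k in grid}
    best = max(results, key=results.get)  # ties go to the first candidate in the grid
    return best, results


# Synthetic two-class data (an assumption for this example), seeded for repeatability.
rng = random.Random(0)
data = [((rng.gauss(0, 1), rng.gauss(0, 1)), "a") for _ in range(30)]
data += [((rng.gauss(3, 1), rng.gauss(3, 1)), "b") for _ in range(30)]
rng.shuffle(data)

best_k, scores = tune(data, grid=[1, 3, 5, 7])
print("cross-validated accuracy per k:", scores)
print("selected k:", best_k)

# Refit on all the data with the selected k and predict a new point.
print("prediction for (2.5, 2.5):", knn_predict(data, (2.5, 2.5), best_k))
```

The key point is that the hyperparameter is chosen on held-out folds rather than on the rows the model was trained on, which is what guards against overfitting to the training set.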

Agenda and Format:

The following is the agenda for the event. Some changes might occur on the day, depending on the speakers and the finalization of the speaking schedule.

4:30pm: Arrival and networking

5:00pm: Lightning Talk + Introductions

5:15 / 5:30pm: Talk

6:30pm: Talk

7:30pm: Lightning Talks (1-2)

7:45pm: Announcements and News

8:00pm: Networking and Refreshments


The format will be similar to the previous meet-ups in that we will have 3-4 talks. There will be two main talks, which will also take a deep dive into code and examples, and two others focused more on business cases and interesting startups or apps being built around Natural Language Processing.

The meet-up runs from 4:30pm onwards, with talks starting at 5:30pm and networking afterwards as well.
