Past Meetup

DataTalks #2: Markov blankets for feature selection & app mentions in texts

150 people went


DataTalks #2: Markov Blankets and App Mentions in Texts

A rough schedule for DataHack's second meetup:

• 18:00 - 18:15 - Gathering, snacks, mingling

• 18:15 - 18:20 - Opening words

• 18:20 - 19:10 - First talk:
Avishay Meron, PayPal - On feature selection: Key ideas and utilization in fraud analysis

• 19:10 - 19:20 - A short break

• 19:20 - 20:10 - Second talk:
Doron Kukliansky, Facebook - App mentions in texts

==== Talk #1 ====

Speaker: Avishay Meron, PayPal
Title: On feature selection: Key ideas and utilization in fraud analysis
Abstract: Feature selection has been a fertile field of research since the 1970s and has been proven to increase efficiency and accuracy in learning tasks. In the past decade, data has grown in both number of instances and number of features. This enormity poses severe challenges with respect to scalability and learning performance. Since the task of feature selection is NP-hard, we are left to approximate a good solution using various heuristics. In this talk we review key ideas and try to sketch guidelines on which heuristic to follow for a given learning task. In addition, we present an application of Markov blanket feature selection to fraud analysis.

==== Talk #2 ====

Speaker: Doron Kukliansky, Facebook
Title: App mentions in texts
Abstract: As people move further away from desktop usage and spend more of their time on mobile devices, mobile apps are changing the way we interact with the Internet. But how can we identify which apps are really trending, and why? This technical talk will discuss the implementation details of a small engine that identifies when mobile apps are mentioned in Facebook posts and covered in the media. We will start from a simple idea and develop it, step by step, to reach our final algorithm. We will use only basic concepts from probability, statistics, machine learning and NLP, but dive deeper into their meaning and applications to gain additional insights into the problem.

DataHack is a data-driven community and annual hackathon for data-enthusiast programmers, researchers and designers. You can find out more about DataHack 2016 on our new website.

You can also find us on Facebook and Twitter, and join our monthly newsletter.