What we're about

The Haifa branch of Datahack is meant to facilitate learning and discussion on various topics in data science, machine learning and statistics, with a focus on the people and companies working and doing research in and around Haifa.

We aim to do this through several initiatives, starting with the Haifa flavor of the DataTalks meetup series, and continuing with a co-learning series and other future initiatives.

Upcoming events (1)

DataTalks HFA #1: Deep Dominance and Text Summarization

IBM Research Labs Haifa

Welcome to the first DataTalks HFA meetup! (:

DataTalks HFA will bring together local data scientists, researchers, statisticians and data enthusiasts around advanced data science, machine learning and AI topics - and of course, around some snacks (;

Our first meetup is hosted by IBM Research, who will share with us the fascinating document-shrinking research behind their new summarization engine. Next, in "Deep Dominance", Rotem Dror from the Technion will show us how to properly compare deep neural models.

Time: September 2nd, 18:30
Language: Hebrew (both talks)
Location: IBM Research Labs (type it into Waze), University of Haifa Campus, 165 Abba Hushi Rd., Haifa
Background: Basic knowledge of data science and machine learning is required - 'seminar level'

*** Free parking is available for DataTalks HFA attendees: follow the meetup signs and buzz the intercom ***

--- Please RSVP by August 30th due to the attendee limit ---

Agenda:
• 18:30 - 19:00 - Gathering, snacks & mingling 🍕
• 19:00 - 19:45 - First talk: Rotem Dror (Technion) - Deep Dominance: How to Properly Compare Deep Neural Models 💡
• 19:55 - 20:40 - Second talk: David Konopnicki (IBM Research) - Honey, I Shrunk the Docs - Getting to the Gist of Things with Summarization 💾

** Deep Dominance: How to Properly Compare Deep Neural Models - Rotem Dror **

Comparing Deep Neural Network (DNN) models based on their performance on unseen data is crucial for the progress of the NLP field. However, these models have a large number of hyper-parameters and, being non-convex, their convergence point depends on the random values chosen at initialization and during training. Proper DNN comparison hence requires comparing their empirical score distributions on unseen data, rather than single evaluation scores, as is standard for simpler, convex models.

In this talk, we present a way to adapt a recently proposed test for the Almost Stochastic Dominance relation between two distributions to this problem, and show, both theoretically and through analysis of extensive experimental results with leading DNN models on sequence tagging tasks, that the proposed test meets all the criteria, while previously proposed methods fail to do so.

** Honey, I Shrunk the Docs - Getting to the Gist of Things with Summarization - David Konopnicki **

Information overload is the plague of our time. We crumble under the mountain of documents we must read, process and digest every day. Search engines that return hundreds of "blue links" as results to our searches just add to the stress. What if there was a solution: a system able to analyze documents, understand the points of interest, extract them and combine them into a summary of just the information we need? In this talk, we will describe how it is possible to do just that: focused summarization of multiple documents. We will describe the technology, the challenges and potential use cases in the context of the new summarization engine for scientific articles we have just deployed at http://ibm.biz/sciencesum
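To get a feel for the idea behind the Deep Dominance talk: model A stochastically dominates model B when A's empirical score CDF lies at or below B's everywhere, and "almost" stochastic dominance asks how badly that inequality is violated. The following is a minimal, hypothetical sketch of that intuition over per-seed evaluation scores - a simplification for illustration, not the exact test presented in the talk (the function name and grid-based estimate are our own assumptions):

```python
import numpy as np

def dominance_violation_ratio(scores_a, scores_b, grid_size=1000):
    """Estimate how much model A's score distribution fails to
    stochastically dominate model B's.

    A dominates B when F_A(t) <= F_B(t) for every threshold t
    (A puts less mass on low scores). We measure the share of the
    total CDF gap where that inequality is violated: 0 means clean
    dominance of A over B, 1 means B dominates A.
    (Hypothetical simplification, not the authors' exact ASO test.)
    """
    scores_a = np.asarray(scores_a, dtype=float)
    scores_b = np.asarray(scores_b, dtype=float)
    lo = min(scores_a.min(), scores_b.min())
    hi = max(scores_a.max(), scores_b.max())
    ts = np.linspace(lo, hi, grid_size)
    # Empirical CDFs of both score samples on a shared grid
    cdf_a = np.searchsorted(np.sort(scores_a), ts, side="right") / len(scores_a)
    cdf_b = np.searchsorted(np.sort(scores_b), ts, side="right") / len(scores_b)
    violation = np.maximum(cdf_a - cdf_b, 0.0)  # where A fails to dominate B
    total = np.abs(cdf_a - cdf_b)
    if total.sum() == 0.0:
        return 0.5  # identical samples: no evidence either way
    return violation.sum() / total.sum()

# Example: model A's scores (e.g. F1 over 50 random seeds) sit above model B's
rng = np.random.default_rng(0)
a = rng.normal(0.85, 0.02, size=50)
b = rng.normal(0.80, 0.02, size=50)
print(dominance_violation_ratio(a, b))  # close to 0: A (almost) dominates B
```

The point the talk makes is exactly this shift of perspective: instead of comparing one score per model, compare the full distributions induced by random seeds and hyper-parameter draws.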

Past events (5)

Photos (6)