PyData @ Cato Networks

Name: PyData @ Cato Networks
Start: 2022-09-19T18:00:00+03:00
End: 2022-09-19T21:00:00+03:00
Location: Azrieli Sarona Tower

Hosted by Uri G. and Adi G.

PyData Tel Aviv

Details

PyData x Cato Networks meetup: Learn from data science experts in person!

September 19 @ 6:00 pm

This month’s meetup is hosted by Cato Networks, the world’s first and most robust single-vendor SASE platform.

Join us for a fun-filled evening of networking and the chance to learn from some of Israel’s top data scientists with PyData! PyData provides a forum to discuss best practices, management, analysis, and emerging technologies in data. IT professionals won’t want to miss this!

What’s on the Agenda?

18:00-18:30 – Food, fun, and photo booth (take your next LinkedIn or corporate headshot at our event for free!)
18:30-18:45 – Welcome | Elad Menahem, Director of Security at Cato Networks
18:45-19:15 – DBSCAN for large-scale datasets and network security applications | Avidan Avraham, Research Team Lead & Asaf Fried, Data Scientist, Cato Networks
19:15-19:45 – Tiny dataset–based topic modeling | Natanel Davidovits, Senior Manager of Data Science at DoubleVerify
19:45-20:00 – Short break
20:00-20:30 – From big data to small, high-quality data: fulfilling the promise of data-centric AI | Eitan Netzer, Chief Scientist at Data Heroes

Grow your knowledge in data science and learn from the experts.
RSVP now to secure your spot!
Cato Networks TLV Offices, Menachem Begin 121, 44th floor

## ============================================

DBSCAN for Large-Scale Datasets and Network Security Applications | Avidan Avraham, Research Team Lead & Asaf Fried, Data Scientist, Cato Networks.

You've probably heard of DBSCAN (Density-Based Spatial Clustering of Applications with Noise), one of the best-known density-based clustering algorithms. Since its introduction in 1996, it has been studied extensively in academia and applied successfully to many real-world industry problems.

However, because of its high computational complexity, applying DBSCAN to today's large-scale datasets is very challenging. In this talk, you will learn how DBSCAN can be parallelized and executed over distributed processing systems using PySpark. Finally, we will show how we apply these methods to solve network security problems here at Cato.
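For readers new to the algorithm, here is a minimal single-machine sketch of DBSCAN in plain Python. It is an illustration only, not the speakers' PySpark implementation; the function names (`region_query`, `dbscan`) and the sample points are made up for this example. The brute-force neighbor search compares every point with every other point, which gives a sense of why the cost grows quickly on large datasets.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1  # label used for points that belong to no cluster


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        # points[i] is a core point: start a new cluster and expand it
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point
    return labels
```

Calling `dbscan(points, eps=1.5, min_pts=3)` on two tight groups of three points plus one distant point should yield two clusters and one noise label.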

## ============================================

Tiny dataset–based topic modeling | Natanel Davidovits, Senior Manager of Data Science at DoubleVerify

** Topic modeling on pre-trained embedding space, with a tiny dataset

** The enormous advantages of 1D CNNs over RNNs in NLP tasks

** When and why character-level models outperform n-gram-level models, and how to design them correctly

** How to handle long text in a modern NLP pipeline

## ============================================

From big data to small, high-quality data: fulfilling the promise of data-centric AI | Eitan Netzer, Chief Scientist at Data Heroes.

The DataHeroes Chief Scientist will explain and demonstrate how to use a free library to reduce your big data to a much smaller dataset that maintains the statistical properties and corner cases of the full dataset. Data scientists can then use the smaller dataset to clean and label their data and train their models. Cleaning and labeling the smaller dataset yields a much higher-quality dataset, which can then be used to train the model in a fraction of the time. The result is a higher-quality model, shorter development time, and easier model maintenance.
