Spam detection with Kafka + Samza & “Your Data Isn't That Big”

Hosted By
Demi B. and shlomi h.

Details

18:00 - 18:30 - Mingling
18:30 - 19:30 - Near real time stream processing with Apache Kafka and Apache Samza – Spam detection use case - Michael Sklyar (https://www.linkedin.com/in/sklyarmichael), Infrastructure Team Lead @ Cyren
19:30 - 20:15 - Your Data Isn't That Big - Big data processing with bash scripting via command line - Boaz Menuhin (https://www.linkedin.com/in/boaz-menuhin-3481b413), Sr. Software Engineer @ Crosswise (Oracle)

“Near real time stream processing with Apache Kafka and Apache Samza – Spam detection use case”

Abstract:

At Cyren we deal with serious amounts of data. Our team's mission was to rewrite the legacy near-real-time (NRT) stream processing layer of our anti-spam detection. The system processes billions of transactions per day, and every second counts in protecting our (your!) mailboxes.

In this session I would like to present our use case, the technology decisions, the development experience, and the results (solid numbers!).

I aim to cover general stream processing concepts such as back-pressure, at-least-once and exactly-once processing, state management, windowing, and partitioning.

I will present how these concepts are handled in Apache Samza and, where appropriate, compare it with another stream processing framework, Apache Storm.

Bio:
Michael Sklyar (https://www.linkedin.com/in/sklyarmichael), Infrastructure Team Lead @ Cyren.

I have over 15 years of experience in software. After a few years of project management in the telecom industry, I am happy to be back building systems in R&D.

I am passionate about design and architecture, large-scale systems, and massive amounts of data.

“Your Data Isn't That Big - Big data processing with bash scripting via command line”

Abstract:

Bash scripting and command-line utilities can be powerful tools for many big-data tasks. In some cases, command-line tools run faster and more efficiently than a MapReduce job. In this talk I will cover the scenarios in which one should consider using the command line instead of Hadoop, along with the available tools and their recommended usage.

Bio:
Boaz (https://www.linkedin.com/in/boaz-menuhin-3481b413) is a software engineer with 10+ years of experience. He enjoys prototyping, cost reduction, and solving Big Data problems, but most of all he enjoys solving problems that require theoretical computer science knowledge.

He speaks fluent Python, and bash scripting is a friend of his. He was one of the first employees at Crosswise (acquired by Oracle) and has worked for several cyber security companies.

Big Things
Klarna Office
Yigal Alon 98, Tel Aviv (Electra building), floor 13 · Tel Aviv-Yafo