Data Processing @SCALE


Details
Almost every data-backed product faces the challenge of collecting, processing, and presenting data, and it has to do so @scale. In the previous meetup we talked about collecting data and its challenges. This time, come hear Iddo Rachlewski (Convertro’s VP of R&D) and Nader Ganayem (Senior DevOps Engineer), and share your own thoughts on processing data @scale.
This meetup will cover the following topics in depth:
- How to design a robust and scalable data processing architecture
- Why and how to use the Lambda architecture
- How to parse hundreds of data formats and integrate with #Hadoop
- How to deal with out-of-order and late-arriving data in the processing layer
- How to manage job scheduling, workflows, and dependencies
- Monitoring, tracking, and anomaly detection
- What to do where: collection, processing, and presentation layers
- Hadoop-R integration and how it saved us $1MM
- #DevOps Hadoop in the cloud: YARN vs. Mesos vs. MR1
- Ops automation: Chef/OpsWorks
- EMR vs. cluster
#Hadoop #Vertica #R #Kafka #Elasticsearch #ELK #Anomaly-Detection #AWS
#Opsworks #Chef #Puppet #Mesos #Docker
Presenters:
Iddo Rachlewski, VP of R&D at Convertro (AOL) - https://il.linkedin.com/in/iddor
Nader Ganayem, Senior DevOps Engineer at Convertro (AOL)
