May Meetup: Where to Deploy Hadoop, Recommenders on EC2, Becoming Data Driven

Hosted By
Alvin N. and John F.X. B.
Details

Three speakers this month:

Where to deploy your BigData stack: Bare-metal or Cloud?

Abstract:

Traditionally, data infrastructure has been built on inflexible, monolithic scale-up models that catered primarily to data-storage and data-access requirements. BigData technologies have changed this paradigm by integrating a layer of complex data processing into the traditional model. BigData infrastructure is also expected to support a multifold increase in data volume compared to traditional data infrastructure. With BigData technologies, the focus has shifted from inflexible scale-up infrastructure to highly flexible scale-out infrastructure. This flexibility and scale-out nature opens up the possibility of deploying the BigData stack in the cloud.

In this session, I will talk about our experience deploying our BigData stack in the cloud: the challenges we faced and the benefits we derived from this deployment. I will highlight the major differences between bare-metal and cloud deployments of the BigData stack, from the perspective of bare-metal Hadoop vs. Amazon EMR.

Sarang’s Bio:

Sarang has 11+ years of experience in software development of Java and BigData analytics systems. As a BigData architect, he has built robust BigData systems, including next-generation contextual communication systems and a data platform for IoT. He has filed 2 patents for his innovations in the BigData analytics space. He is currently at Autodesk, Singapore, as Tech Lead and Technical Product Owner of their BigData Platform.

Building Large Recommenders for Generation Me (and How That Will Change the World)

Abstract:

This is part I of a two-part series on large-scale recommenders. In this part, we will focus on the infrastructure stack (Spark, Hadoop, EC2) that will enable us to build and scale out recommender systems cheaply. We will also go through a simple movie recommender case study that highlights the challenge of building a recommender for Generation Me. Part II of the series, which covers the data science techniques, will be presented at the upcoming PyData meetup in June.

Link to slides: https://docs.google.com/presentation/d/1YMPJGo62hRPDz1iXQK2fITJAFOJn6H3oPD5vsrMwA1c/edit?usp=sharing

About the speaker:

Kai Xin is a data scientist at Lazada who specializes in behavioral analytics. He has been building behavioral models for 3 years in areas such as shopper insights, fraud models, social network analysis, population health segmentation, and demand forecasting, and ranks in the top 1% on Kaggle, an international data science competition portal.

Becoming Data Driven: The Bumpy, Twisted Road from Commitment to Competence

Abstract:

It takes more than hiring a Data Scientist to become data driven. It's a business transformation that requires rethinking how a company produces, consumes and thinks of data.

To become data driven, you need to align strategy, process and systems.

This is a lightweight overview of how Lazada, SE Asia's largest ecommerce company, is transforming its data strategy.

About the Speaker:

John Berns is co-founder and co-organizer of BigDataSG/HadoopSG. To pay the bills, in his spare time, he doubles as SVP, Head of Data Science at Lazada.

Forward League
IDA Labs @ NDC National Design Centre
111 Middle Road, #03-04 Singapore 188969 · Singapore