In this half-day workshop, I will introduce how data flows through a typical big data pipeline. I will discuss how queries, clicks and views are instrumented at various stages and later combined and cooked, eventually spawning multiple structured and unstructured data streams. I will also briefly mention how this data is used to produce different metrics, dashboards, predictions and segments. Most importantly, I will explain where, when and how data science and predictive analytics techniques are used in the pipeline.
Towards the end, I will discuss the technical and soft skills needed to become an effective data scientist.
• Understand how data science is used by Amazon, Bing, Google, FB and LinkedIn.
• Provide an end-to-end understanding of a big data pipeline.
• Provide a gentle introduction to the techniques that fall under the broader category of data mining, predictive modeling and data science. We will discuss examples from online search, advertising, retail, insurance, social networks, entertainment, education, healthcare, telecommunication and law enforcement.
• Provide an overview of some of the common data mining tasks like regression, classification, clustering, association analysis and outlier detection – without going into theoretical details of individual techniques.
• Discuss the technical and soft skills that are needed to become an effective data scientist.
About The Instructor:
Raja has worked in several research and development roles at Microsoft's Online Services Division. During his tenure, he worked on cutting-edge techniques for problems in the paid search marketplace, online advertising, relevance in online retrieval, large-scale data mining, predictive analytics and online experimentation.
At Microsoft, Raja was a regular speaker at tech talks and tutorials. He delivered a lecture series titled ‘Introduction to Machine Learning’ that has been a recommended resource for new Microsoft OSD employees for many years. He has also given talks on predictive modeling, R programming, online experimentation and A/B testing, relevance in online systems and online advertising. Raja has published his work on object detection, DNA classification, face detection and texture classification in peer-reviewed journals and conferences. He has also served as a reviewer for journals and conferences in machine learning, data mining, artificial intelligence and large-scale online systems.
In January 2013, Raja quit Microsoft after catching the entrepreneurship bug. He is currently working on his startup and his data science training/consulting company.
A genuine interest in big data and data science.
$25. [To cover the cost of reserving the facility and have a budget to host future events for our meetup.]
Bellevue College, right off exit 11B on I-90.
Goals of the workshop: