Workshop - Data Science In the Cloud Using Amazon


Details
Zaponet - data science solution is proud to host a half-day workshop on data science in the cloud.
Co-sponsored with the Israeli Statistics Association
Audience: data scientists, data engineers and sysadmins
Agenda:
09:30 – 10:30 AWS Overview:
Presenter: Guy Ernest
(*) The rationale behind Amazon Web Services (unified API, global infrastructure, pay for what you use, etc.)
(*) The many AWS services: S3, EC2, EMR, EBS, AMIs, DynamoDB, SQS
(*) The AWS pricing model (on demand, reserved instances and spot instances)
10:30 – 11:30 Data Analysis Workflow
Presenter: Jonathan Rosenblatt
(*) Workflow of analyzing data using R (or the software of your choice) on a remote cloud machine: connecting to the machine using RStudio Server, uploading data, downloading results, scaling up and down on the fly.
(*) Why a cloud machine?
(*) Why a cluster of cloud machines?
11:30 – 11:45 Coffee Break
11:45 – 12:45 Setting Up An Analysis Environment:
Presenter: Guy Ernest
(*) Gathering logs
(*) Analyzing sample logs
(*) Setting up a single machine: launching an instance, configuring the firewall (RStudio Server uses port 8787), connecting to the machine (remote desktop & SSH), creating an AMI, upgrading the machine, using EBS and S3, sharing AMIs
12:45 – 13:45 Setting Up The Analysis Environment – Advanced Topics:
Presenter: Guy Ernest
(*) Running a MapReduce cluster job on the logs: how to optimize your EMR cluster
(*) Setting up a Redshift cluster: copying clean log data into the cluster
(*) Connecting to the cluster from the RStudio instance set up earlier and from a BI tool (Tableau)
(*) Managing costs.
Bios:
Guy Ernest is part of the solutions architecture team of Amazon Web Services, where he helps customers with their first and advanced steps in the cloud.
Guy specializes in Big Data, Analytics and Machine Learning, drawing on his background in these fields prior to joining AWS. Guy founded a couple of startup companies in personalization, mobile search and big data analytics.
Jonathan Rosenblatt is a postdoctoral researcher in the Faculty of Mathematics and Computer Science at the Weizmann Institute of Science.
Jonathan uses the AWS cloud on a regular basis to analyze neuroimaging and genetic data, typically with R.