Spark coding dojo (Scala)


Details
Places are limited, so please register for this event only if you are sure you will come.
This session is hands-on. Bring your laptop!
Goals: set up a Spark installation on your laptop and run several Scala exercises on it.
IMPORTANT: to be able to follow all the exercises, please follow these instructions:
SETUP: The setup takes about 20 minutes and will download a sizeable chunk of the internet. Plan accordingly.
PREREQUISITES: JVM 1.6 or later (1.8 recommended), Maven 3.x or later
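A quick way to confirm the prerequisites are in place before starting is sketched below. The `have_tool` helper is a hypothetical name introduced here for illustration, not part of Spark or Maven; it only checks that a command is on the PATH, so the version numbers still need to be read by eye.

```shell
# Hypothetical helper (not part of any tool): report whether a command is on the PATH.
have_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Example checks for this dojo; uncomment and compare the printed versions
# against the prerequisites (JVM 1.6 or later, Maven 3.x or later):
# have_tool java && java -version
# have_tool mvn && mvn -version
```

If either tool is reported missing, install it before moving on to the Spark build.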
- Install Spark
Download and untar spark-1.2.0.tgz:
$ wget http://d3kbcqa49mib13.cloudfront.net/spark-1.2.0.tgz
$ tar xvfz spark-1.2.0.tgz
Build Spark locally:
$ cd spark-1.2.0
- Obtain the dataset for the dojo
We'll be using a Wikimedia dataset that records the traffic of several projects under Wikimedia's umbrella. Download the first three files for Jan 1st 2015:
$ wget http://dumps.wikimedia.org/other/pagecounts-raw/2015/2015-01/pagecounts-20150101-000000.gz
$ wget http://dumps.wikimedia.org/other/pagecounts-raw/2015/2015-01/pagecounts-20150101-010000.gz
$ wget http://dumps.wikimedia.org/other/pagecounts-raw/2015/2015-01/pagecounts-20150101-020000.gz
Note: Check the MD5 checksums of the downloaded files to make sure they downloaded correctly:
0a0c9fbff017b58fb6b1aa83edafbbf1 pagecounts[masked]-000000.gz
b6f9351fe2c0bc746d[masked]ce2 pagecounts[masked]-010000.gz
2dc1bd0bbde58f954bc[masked]c8d6 pagecounts[masked]-020000.gz
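The comparison can be scripted rather than done by eye. The sketch below assumes GNU coreutils `md5sum` is available (on macOS, `md5 -q` is the usual equivalent); `check_md5` is a hypothetical helper name introduced here for illustration.

```shell
# Hypothetical helper: compare a file's MD5 checksum against an expected value.
# Assumes GNU coreutils `md5sum` is installed.
check_md5() {
  file="$1"
  expected="$2"
  # md5sum prints "<hash>  <filename>"; keep only the hash.
  actual=$(md5sum "$file" | cut -d ' ' -f 1)
  if [ "$actual" = "$expected" ]; then
    echo "OK $file"
  else
    echo "MISMATCH $file"
    return 1
  fi
}

# Example, using the first checksum listed above:
# check_md5 pagecounts-20150101-000000.gz 0a0c9fbff017b58fb6b1aa83edafbbf1
```

A mismatch usually means a truncated download, so fetch the file again before unzipping it.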
Gunzip all files:
$ gunzip pagecount*
About the organiser:
Ignasi is a regular at several meetups in Barcelona. He focuses on Scala and Java (both the platform and the language) and is also interested in the technical aspects of agile methodologies.
Ignasi has been facilitating coding dojos since 2012 and has been working in Scala full time since early 2013.
