Intro to Apache Spark

Hosted By
Portia B.

Details
Apache Spark is a popular distributed framework for data exploration, analysis, and building big data applications. It aims to be useful for real-time applications, complex analysis, interactive queries, and batch processing. In this talk we'll provide a high-level overview of Spark, compare it to similar systems, discuss its fundamental features, show a few examples using the Spark Python shell, and provide an overview of popular packages and algorithms. No prior knowledge of Spark or Hadoop is required, though basic Python (or Scala) knowledge would be useful.

Julio is an independent programmer in Portland who loves working with data in Clojure and Python.

Portland Data Science Group
Little Bird
506 SW 6th #200 · Portland, OR