Intro to Apache Spark

Hosted By
Portia B.

Details

Apache Spark is a popular distributed framework for data exploration, analysis, and building big data applications. It aims to support real-time applications, complex analysis, interactive queries, and batch processing. In this talk we'll provide a high-level overview of Spark, compare it to similar systems, discuss its fundamental features, show a few examples using the Spark Python shell, and survey popular packages and algorithms. No prior knowledge of Spark or Hadoop is required, though basic Python (or Scala) knowledge would be useful.

Julio is an independent programmer in Portland who loves working with data in Clojure and Python.

Portland Data Science Group
Little Bird
506 SW 6th #200 · Portland, OR