This meetup group is for anyone interested in learning about, sharing, and exploring Apache Spark and PySpark.
Apache Spark is a lightning-fast engine for large-scale data processing and the leading successor to MapReduce. Spark is a general-purpose in-memory cluster computing framework. It lazily evaluates and optimizes execution plans and handles memory usage intelligently. It can run on Hadoop's resource manager and read any existing Hadoop data, and it provides rich APIs in Scala, Java, and Python.
PySpark is the Python API for Spark.
Come join us and start learning about Spark and PySpark.