1) How does that PySpark thing work? And why Arrow makes it faster? by Rubén Berenguel (@berenguel) (English, 45 min)
Back in ye olde days of Spark, using Python with Spark was an exercise in patience. Data moved back and forth between Python and Scala, being serialised constantly. Leveraging SparkSQL and avoiding UDFs made things better, as did the constant improvement of the optimisers (Catalyst and Tungsten). But with Spark 2.3, PySpark has sped up tremendously thanks to the (still experimental) addition of the Arrow serialisers.
In this talk we will learn how PySpark has improved its performance in Apache Spark 2.3 by using Apache Arrow. To do this, we will travel through the internals of Spark to find out how Python interacts with the Scala core, and through some of the internals of Pandas to see how data moves from Python to Scala via Arrow.
2) Frozen Python: satice ice dataloggers by Oriol Sánchez (English, 35 min)
Project website: https://headingnorthweb.wordpress.com/tag/en-us/
In this talk we will talk about sea ice buoys and embedded scientific observation platforms in extreme environments.
We will learn how to use Python to write drivers, custom data protocols, and sensor and GPS integrations. We will also take a look at data visualisation and other painful engineering stories about bringing data from the field to your server.
# Access Control
- The security access control requires an attendee list. Provide your full name in your profile, otherwise you will be REMOVED from the list.
- Seats are limited, so please be responsible when signing up. If you can't make it, please free up your seat so someone else can attend. We'll notice repeat offenders ;-)
- We need talk proposals! Send yours: http://pybcn.org/call-for-proposals/
- Wanna publish a job offer? https://www.meetup.com/python-185/pages/24606847/Job_offers/
- Follow @PyBCN for pictures, slides and more: https://twitter.com/pybcn