This is our first Apache Spark Meetup in Munich. In this Meetup we will explain some Spark basics and show a small live demo in the first talk by Danny. The second talk will be about implementing an ML-based answer scoring with Spark and MLlib.
Between and after the talks there will be time for drinks, food, and conversation.
! The talks will be in German !
"Spark Basics - RDD, SQL, Mllib, GraphX" - Danny Linden
"With the rapid adoption of Apache Spark—one of the most active Apache projects today—and the need for programs to solve the world's greatest problems, distributed computing has resurfaced as a hot commodity that can take your career to the next level. More importantly, Spark opens the door to some really cool and impactful applications. Spark is a leap forward in distributed computing, allowing you to perform faster and more complex analyses on your Hadoop cluster and in the cloud. This presentation will give a short introduction to basic Spark concepts such as RDDs, transformations, actions, and executors. We will also cover recent developments in the Spark community with DataFrames, SQL on Spark, GraphX."
"Speed Up Your Spark Job" - Christian Dedié
gutefrage.net uses Spark extensively for machine learning, BI, and real-time processing of user behavior. This talk covers our learnings and the pitfalls we encountered when implementing an ML-based answer scoring (ordering) for all questions on gutefrage.net, for example how to improve the reliability and throughput of Spark jobs with read/write access to relational data sources, or how to optimize HDFS-based data structures for best performance. Having started with DataFrames and Spark SQL, we saw some major improvements when we implemented the same functionality based on RDDs.
About Christian Dedié:
Christian Dedié has 20 years of experience as a software engineer. He's a passionate Scala developer and Continuous Delivery advocate. In recent years he has focused on big data projects using polyglot persistence and machine learning. He is a co-founder of the open source project "Flyway - Database Migrations Made Easy".
Hi Spark-Munich members,
We are happy to announce our first Meetup date. The "Big Data Munich" Meetup Group ( http://www.meetup.com/de/Big-Data-Munich/events/226100767/ ) has invited us as a guest group to the Big Data Munich Meetup.
None other than Sean Owen, one of the leading Spark developers at Cloudera, the first and one of the leading providers and supporters of the Apache Hadoop stack, will join the Meetup and talk about "A taste of random decision forests on Apache Spark".
I am pleased that Cloudera is sending one of their best Spark developers, and I'm looking forward to great upcoming Meetups that pair informative talks with community exchange.
Additionally, there are going to be two more great talks. The whole agenda is as follows:
7:00 - 7:15 PM: Drinks & Networking
7:15 - 7:35 PM: Christian Löhnert, Pre-Sales Consultant at ConSol* Consulting & Solutions Software GmbH
"Where are the users? - A (simple) story about getting started with Big Data"
7:35 - 8:05 PM: Sean Owen, Director Data Science at Cloudera
"A taste of random decision forests on Apache Spark"
8:05 - 8:15 PM: Matthias Korn, Technical Consultant at Data Virtuality
"Beyond the Data Lake"
Thanks to all of you so far, and I'm looking forward to seeing you all on the 12th of November!
Our first dedicated "Spark-Munich Kick-Off" Meetup will be held in the first week of December. The exact date will be announced as soon as possible.