• pandas on Apache Spark (ft Databricks cofounder) / AI-powered musical creativity

    Our next event features Reynold Xin, co-founder of Databricks, telling us how we can finally make pandas-based projects scalable using Databricks' new open source "Koalas" project. We'll follow that up with Google's Hanoi Hantrakul telling us about "Magenta", Google's initiative to use AI to enhance human creativity (with special live musical demos).

    Summer is here and the SF PyData meetup group is heating up! This one's looking like a scorcher. As always, we'll also have time for you to meet, mingle, and discuss all things data with other Bay Area Python and data enthusiasts!

    ## Agenda:

    6:00 - 6:45pm: Mingling
    6:45 - 6:50pm: Opening remarks
    6:50 - 7:35pm: Tech-Talk-1: Koalas: pandas APIs on Apache Spark
    7:35 - 8:05pm: Tech-Talk-2: Magenta: Exploring the role of machine learning in creativity
    8:05 - 8:30pm: Mingling
    8:30: Event over!

    Many thanks to Microsoft for volunteering to host and Databricks for sponsoring. Doors will close at 7:15 so please arrive before then.

    ## Koalas: pandas APIs on Apache Spark

    #### Abstract

    In this talk, Reynold will present Koalas, a new open source project that was announced at the Spark + AI Summit in April. Koalas is a Python package that implements the pandas API on top of Apache Spark, to make the pandas API scalable to big data. Using Koalas, data scientists can make the transition from a single machine to a distributed environment without needing to learn a new framework. Reynold will demonstrate the functionality added to Koalas since its initial release, discuss its roadmap, and explain how he envisions Koalas becoming the standard API for large-scale data science.

    #### Speaker Bio

    Reynold Xin is a cofounder and Chief Architect at Databricks. In the open source community, Reynold is known as a top contributor to the Apache Spark project, having designed many of its core user-facing APIs and execution engine features. Reynold received a PhD in Computer Science from UC Berkeley, where he worked on large-scale data processing systems.

    ## Magenta: Exploring the role of machine learning in creativity

    #### Abstract:

    Hanoi will give an overview of Magenta, an open source team at Google led by Doug Eck that explores the role of machine learning in the creative process. The talk will cover exciting recent developments in the world of AI-generated art as well as live musical demos!

    #### Speaker Bio:

    Lamtharn (Hanoi) Hantrakul - AI Resident, Google Brain

    Hanoi - like the towers? No. Like the city? Yes! Hanoi's Thai parents fell in love there and nicknamed him after the charming city. Born and raised in Bangkok, Thailand, he is currently an AI Resident with Google Brain. His research focuses on real-time Neural Audio Synthesis: rendering sound directly using deep neural networks. Hanoi strives for technologies that are transcultural at heart; diversifying software, hardware and AI to encompass underrepresented cultures such as those from his home region of Thailand and beyond. He is most proud of fidular, a modular fiddle system he designed and engineered that enables components like resonators and strings to be swapped across cultures. The system is currently on display at the Musical Instruments Museum in Phoenix, AZ, and has been recognized internationally by the A’ Design Award and Core77 Design Award.

    Hanoi holds degrees in Applied Physics and Music Composition, both with Distinction, from Yale University. In his MSc thesis at Georgia Tech, he developed machine learning models for an ultrasound sensor that enables amputees to perform high-dexterity tasks like playing piano, an impossible feat using today’s sensors on conventional prosthetics. He also writes music under the moniker "yaboi hanoi". Find his tunes on Instagram and Spotify!

  • Troubleshoot Deep Learning Models / Simple multi-user Jupyter Notebooks

    The SF PyData meetup group is kicking off our 2019 season with a bang!

    1. Jupyter notebooks have become an essential tool for all types of data analyses. Jupyter project contributor Yuvi Panda will teach us the simplest way to jump straight into using and sharing Jupyter notebooks, without the pain of wrestling with environments and setup.
    2. Deep learning continues to surprise and delight with its ability to solve thorny problems and achieve impressive results. But neural networks aren't exactly known for their transparency and ease of use. OpenAI research scientist (and UC Berkeley PhD student) Josh Tobin will teach us the simplest way to debug misbehaving deep learning models and get your analysis back on track.

    As always, we'll also have time for you to meet, mingle, and discuss all things data with other Bay Area Python and data enthusiasts!

    ## Agenda:

    6:00 - 6:45pm: Mingling
    6:45 - 6:50pm: Opening remarks
    6:50 - 7:35pm: Tech-Talk-1: Simple multi-user Jupyter Notebooks with The Littlest JupyterHub
    7:35 - 8:05pm: Tech-Talk-2: How to Troubleshoot Your Deep Learning Models
    8:05 - 8:30pm: Mingling
    8:30: Event over!

    Many thanks to Cloudflare for volunteering to host. Doors will close at 7:15 so please arrive before then.

    ## Simple multi-user Jupyter Notebooks with "The Littlest JupyterHub"

    #### Abstract

    JupyterHub is a multi-user notebook server, allowing any number of users to access their own Jupyter-based data science environment without having to download or install anything. This is extremely useful when teaching classes or conducting trainings - you can skip straight to your content without having to get people to install Anaconda or mess around in a cloud console. Analytics teams also benefit from this, since they can easily access the data they are analyzing and share work between themselves. The Littlest JupyterHub (tljh.jupyter.org) is a JupyterHub distribution that makes it easy to set up and maintain a JupyterHub with minimal long-term effort. This talk will demo a quick setup and various use cases for a small JupyterHub.

    #### Speaker Bio

    Yuvi Panda is a contributor to Project Jupyter, and currently works at UC Berkeley building & running JupyterHubs for their data science courses. He is also part of the team that runs the open infrastructure behind mybinder.org. He primarily works on removing accidental complexities from people's workflows so they can focus on the tasks they wanna do rather than fiddle with unnecessary 'computer stuff'. Prior to this he was part of the Wikimedia community, working on enabling Wikimedia volunteers to build bots and run analysis as easily as possible.

    ## How to Troubleshoot Your Deep Learning Models

    #### Abstract:

    Deep learning practitioners spend most of their time troubleshooting & debugging. Troubleshooting models is notoriously difficult because the same performance problem can be attributed to many different sources, and performance can be extremely sensitive to small changes in architecture and hyperparameters. In this talk, I will attempt to demystify the troubleshooting process by presenting a decision tree for improving your model's performance.

    #### Speaker Bio:

    Josh Tobin is a Research Scientist at OpenAI and a PhD student in Computer Science at UC Berkeley working with Professor Pieter Abbeel. Josh's research focuses on applying deep learning to problems in robotic perception and control, with a particular concentration on robotic manipulation, deep reinforcement learning, generative models, and synthetic data. Prior to Berkeley and OpenAI, Josh was a management consultant at McKinsey & Co. in New York. Josh has a BA in Mathematics from Columbia University.

  • Manage the Machine Learning Lifecycle with MLflow / Effective Data Visualization

    After a long hiatus, the SF PyData meetup group is BACK! Starting Sept 6th, we're going to return to having regular, in-person meetups in San Francisco. The usage of Python and PyData tools has grown explosively over the last few years, and we're very excited to start building a community where data lovers from all across the Bay Area can meet, connect, and learn from each other. To kick off the new series we've got a great lineup of speakers who'll teach us about 1) managing the complete machine learning lifecycle and 2) producing clear, effective visualizations of scientific data.

    ## Agenda:

    6:00 - 6:45pm: Mingling
    6:45 - 6:50pm: Opening remarks
    6:50 - 7:35pm: Tech-Talk-1: MLflow: Infrastructure for a Complete Machine Learning Life Cycle
    7:35 - 8:05pm: Tech-Talk-2: Data Visualization for Scientific Discovery
    8:05 - 8:30pm: Mingling
    8:30: Event over!

    Many thanks to Cloudflare for volunteering to host. Doors will close at 7:15 so please arrive before then.

    ## MLflow: Infrastructure for a Complete Machine Learning Life Cycle

    Abstract:

    ML development brings many new complexities beyond the traditional software development lifecycle. Unlike in traditional software development, ML developers want to try multiple algorithms, tools and parameters to get the best results, and they need to track this information to reproduce work. In addition, developers need to use many distinct systems to productionize models. To address these problems, many companies are building custom “ML platforms” that automate this lifecycle, but these platforms are limited to each company’s internal infrastructure. In this talk, we will present MLflow, a new open source project from Databricks that aims to design an open ML platform where organizations can use any ML library and development tool of their choice to reliably build and share ML applications. MLflow introduces simple abstractions to package reproducible projects, track results, and encapsulate models that can be used with many existing tools, accelerating the ML lifecycle for organizations of any size.

    Speaker Bios:

    Mani Parkhe is an ML/AI Platform Engineer at Databricks, working on customer-facing and open source platform initiatives to enable data discovery, training, experimentation, and deployment of ML models on the cloud. He has also worked on various data-intensive batch and stream processing problems at LinkedIn and Uber.

    Andrew Chen is a software engineer at Databricks and an MLflow committer. Andrew is working on tools to simplify the end-to-end experience of machine learning, all the way from data ETL to model training and deployment. Before working at Databricks, Andrew received his BS in EECS from UC Berkeley in 2016.

    ## Data Visualization for Scientific Discovery

    Abstract:

    Choosing the visual form for a visualization is a decision about what aspects of the data matter most. Highlight or ignore outliers? Look at values, differences, or changes? In data analysis we risk missing discoveries by failing to notice important features of our data, yet we often use default parameters and charts without realizing what we might miss. I will demonstrate how to translate questions about your data into chart parameters, taking into account your context, goals, and constraints. Using Python examples, I'll illustrate powerful techniques like using color intentionally, creating 'small multiples' of charts that vary visual form or data, and optimizing for your time, energy, and attention.

    Speaker Bio:

    Zan Armstrong is a data visualization engineer and designer. With a background in data analysis, she is especially fascinated by identifying what characteristics of the data might be most important and then creating ways to reveal those characteristics visually. She has also won an Information is Beautiful award for work published in Scientific American, and a tool she worked on was part of SFMOMA's Designed in California exhibit. Zan's primary tools include JavaScript, R, and Python.

  • Live Webinar: HOW TO BECOME AN EXPERT ON APACHE SPARK AND SCALA

    Hello. Find out what enables developers to start programming in Scala with confidence. Join us to learn more about Apache Spark and Scala (http://unbouncepages.com/live-apachespark-webinar/).

    Timings:
    Thursday, 20th October 2016, 09:00 PM - 10:00 PM IST
    Thursday, 20th October 2016, 11:30 AM - 12:30 PM EDT

    Focus On:
    * INTRODUCTION TO BIG DATA
    * INTRODUCTION TO SPARK
    * WHY SPARK
    * SPARK ECOSYSTEM
    * INTRODUCTION TO SCALA
    * PRACTICALS ON SPARK

    This promises to be an extremely enriching session and we hope you can make it - Register Now (http://unbouncepages.com/live-apachespark-webinar/). In case you can't make it, sign up anyway and we'll send you the recording. [ http://unbouncepages.com/live-apachespark-webinar/ ]

    Cheers!

  • Attend Free Live Webinar on Pig vs MapReduce by Industry Expert

    Hello,

    Please join us as industry experts from KRATOES (http://www.kratoes.com/) share their experience and provide a live demonstration on the topic "PIG VS MAPREDUCE (http://unbouncepages.com/pigvsmapreduce/)". We'd like to invite you to an expert live webinar on 'Pig vs. MapReduce (http://unbouncepages.com/pigvsmapreduce/)' scheduled for Thursday, 20th Oct 2016, 9 PM to 10 PM IST.

    TOPICS
    * INTRODUCTION TO BIG DATA AND ITS CHALLENGES
    * INTRODUCTION TO HADOOP AND ITS CHARACTERISTICS
    * HADOOP ECOSYSTEM
    * HADOOP IMPORTANCE AND ITS COMPONENTS
    * HDFS ARCHITECTURE
    * HISTORY OF PIG
    * WHAT IS PIG AND WHY PIG
    * PIG VS. MAPREDUCE
    * FEATURES OF PIG AND ITS APPLICATION
    * PIG DATA MODEL AND PIG OPERATIONS

    This promises to be an extremely enriching session and we hope you can make it - Register Now (http://unbouncepages.com/pigvsmapreduce/). In case you can't make it, sign up anyway and we'll send you the recording. [ http://unbouncepages.com/pigvsmapreduce/ ]

    Cheers

  • Free Live Webinar On AngularJS with Industrial Expert

    Hello,

    Webinar Invitation: Here's Everything You Need to Know About Working with AngularJS (http://unbouncepages.com/angularjs/)! Boost your skills with this completely free AngularJS webinar.

    TOPICS:
    • HISTORY OF ANGULARJS
    • INTRODUCTION TO ANGULARJS
    • FEATURES OF ANGULARJS
    • SINGLE PAGE APPLICATION AND ITS CHALLENGES
    • WHAT ARE DATA BINDING, DIRECTIVES AND FILTERS
    • WHAT IS MVC AND MVVM
    • PRACTICALS

    In case you can't make it, sign up anyway and we'll send you the recording. This promises to be an extremely enriching session and we hope you can make it - click Register Now (http://unbouncepages.com/angularjs/) to find the link to join the webinar.

    If you have any additional questions or require further clarification, please do not hesitate to call me on [masked] or send me an email.

    Cheers!

  • Free Live Webinar On R Programming with Industrial Expert

    Hello,

    Webinar Invitation: Here's Everything You Need to Know About Working with R Programming (http://unbouncepages.com/r-programming/). Boost your skills with this completely free R programming webinar.

    TOPICS:
    • Application of R through use cases and examples
    • Predictive Analytics and its process
    • Three pillars of Predictive Analytics
    • Applications of Predictive Analytics
    • Doubt session

    In case you can't make it, sign up anyway and we'll send you the recording. This promises to be an extremely enriching session and we hope you can make it - click Register Now (http://unbouncepages.com/r-programming/) to find the link to join the webinar.

    If you have any additional questions or require further clarification, please do not hesitate to call me on [masked] or send me an email.

    Cheers!

  • Attend Free Live Webinar on Apache Spark & Scala by Industry Expert

    Hello,

    We would like to cordially invite you to the free live webinar on 'Apache Spark & Scala (http://unbouncepages.com/apachespark-scala/)', scheduled for Wednesday, 5th October 2016, 07:00 AM to 08:00 AM IST. Boost your skills with this completely free Apache Spark and Scala webinar.

    TOPICS:
    • INTRODUCTION TO BIG DATA
    • INTRODUCTION TO SPARK
    • WHY SPARK
    • SPARK ECOSYSTEM
    • INTRODUCTION TO SCALA
    • PRACTICALS ON SPARK

    KEY FEATURES OF SPARK:
    • FAST ANALYTICS
    • REAL-TIME STREAM PROCESSING
    • FAULT TOLERANT
    • POWERFUL AND INTEGRATED DATA PROCESSING
    • EASY TO USE

    In case you can't make it, sign up anyway and we'll send you the recording. This promises to be an extremely enriching session and we hope you can make it - Register Now (http://unbouncepages.com/apachespark-scala/).

    If you have any additional questions or require further clarification, please do not hesitate to call me on [masked] or send me an email.

    Cheers!

  • Attend Free Live Webinar on Hadoop Introduction via Hive by Industry Expert

    Hello,

    We would like to cordially invite you to INTRODUCTION TO BIG DATA AND HADOOP VIA HIVE (http://unbouncepages.com/webinar-haoop-hive/).

    What you'll learn:
    • INTRODUCTION TO BIG DATA
    • CHALLENGES OF BIG DATA AND INTRODUCTION TO HADOOP
    • INTRODUCTION TO HADOOP AND ITS CHARACTERISTICS
    • INTRODUCTION TO HIVE AND ITS FEATURES
    • HIVE COMPONENTS AND ITS ARCHITECTURE
    • HIVE OPERATIONS

    KEY FEATURES OF SPARK:
    • FAST ANALYTICS
    • REAL-TIME STREAM PROCESSING
    • FAULT TOLERANT
    • POWERFUL AND INTEGRATED DATA PROCESSING
    • EASY TO USE

    It is scheduled for Wednesday, 28th September 2016, 09:30 PM to 11:00 PM EDT. (http://unbouncepages.com/webinar-haoop-hive/)

    In case you can't make it, sign up anyway and we'll send you the recording. This promises to be an extremely enriching session and we hope you can make it - Register Now (http://unbouncepages.com/webinar-haoop-hive/).

    If you have any additional questions or require further clarification, please do not hesitate to call me on [masked] or send me an email ([masked]).

    Cheers!

  • Live Classes on Apache Spark and Scala(Upgraded by industrial experts)

    Hello,

    We would like to cordially invite you to Apache Spark and Scala Classes (http://unbouncepages.com/apachespark-and-scala-class/), featuring 30 hours of live instructor-led online training, advanced concepts, practical tutorials, hands-on practice, installation manuals, class recordings, data sets, real-time industry projects, a virtual machine kit, a Hadoop practical course, 24/7 support 365 days a year, and a certificate!

    The following topics will be covered:
    · Introduction To Spark & Scala
    · Scala - Essentials And Deep Dive
    · Introducing Traits And OOP In Scala
    · Functional Programming In Scala
    · Spark And Big Data
    · Understanding RDDs
    · Spark SQL
    · Advanced Spark Concepts And Project Discussion

    It is scheduled for 24th September - 30th October,[masked]:30 AM to 01:30 PM (EDT), Saturday & Sunday (weekend) batch. (http://unbouncepages.com/apachespark-and-scala-class/)

    In case you can't make it, sign up anyway and we'll send you the recording. This promises to be an extremely enriching session and we hope you can make it - Register Now (http://unbouncepages.com/apachespark-and-scala-class/).

    If you have any additional questions or require further clarification, please do not hesitate to call me on [masked] or send me an email ([masked]).

    Cheers!