#SDBigData Meetup #26


Details
Join us again for your favorite Big Data meetup! Food and drink will be provided by our executive meetup sponsor - DataStax.
We hope that you can join us - bring a colleague!
Agenda as follows:
5:00-5:45pm Socializing over food and adult beverages
5:45-6:00pm Welcome
6:00-6:45pm Classification and Clustering Algorithms with Wine and Chocolate
6:45-7:00pm Break
7:00-7:45pm Ingest Data from Relational Databases to Cassandra with StreamSets
7:45-8:00pm Wrap-up
Featured discussions for meetup #26 include:
• Can machine learning determine a wine’s region and quality? Can machine learning determine what makes chocolate delicious? Short answer: yes, it can! This talk will focus on using classification and clustering algorithms to do analytics at scale using Apache Cassandra and Apache Spark. Publicly available datasets, Jupyter notebooks, PySpark, and DataStax Analytics will power this talk and live demo.
Our speaker will be Amanda Moran -
Amanda is a Developer Advocate for DataStax. Her passion is bridging the gap between customers and engineering! Amanda graduated from Santa Clara University in 2012 with a Master’s in Computer Science; she also has a Bachelor of Science in Biology from the University of Washington. She is based in the Bay Area and has worked for HP, Lockheed Martin, Teradata, and Esgyn, an Apache Trafodion startup. Amanda is an Apache Committer and member of the PMC for Apache Trafodion. She has worked on customer POCs, executive demos, AWS deployments, Python coding, data science workshops, conferences, Linux/Hadoop administration, and scripting -- a little bit of everything! In her spare time, she loves running, hanging out with her dog, and finding reasons to go to Disneyland.
LinkedIn: https://www.linkedin.com/in/amanda-kay-moran/
Twitter: https://twitter.com/AmandaDataStax?lang=en
• "Ingest Data from Relational Databases to Cassandra with StreamSets" by Pat Patterson
So you need to migrate some data from an existing relational database (RDBMS) to Apache Cassandra™. But how? Or how about ingesting from an RDBMS to a Kerberized DataStax cluster? What about a one-time batch load of historical data vs. streaming changes? You could write and deploy some custom code using a framework like Apache Spark. Sometimes that makes sense, but often it requires significant time and resources. So what are the alternatives? In this talk, we'll explore how you can use the open source StreamSets Data Collector to migrate from an existing RDBMS to DataStax or Apache Cassandra™.
Our speaker will be Pat Patterson -
Pat Patterson has been working with Internet technologies since 1997, building software and communities at Sun Microsystems, Huawei, Salesforce, and StreamSets. At Sun, Pat was the community lead for the OpenSSO open source project, while at Huawei he developed cloud storage infrastructure software. As a developer evangelist at Salesforce, Pat focused on identity, integration and the Internet of Things. Now Director of Evangelism at StreamSets, Pat helps enterprises unlock the value in their data.
