Cassandra & The Multi-Cloud / Ingest Data from RDBMS to Cassandra w/ StreamSets


Details
Apache Cassandra and The Multi-Cloud by Amanda Moran
Distributed databases, and more specifically cloud-native databases, were created to address many of the issues with traditional relational databases. A low-latency, highly available database is key to preventing a multitude of problems. This talk will focus on what distributed databases provide and why that matters. It will also cover why cloud-native databases like Apache Cassandra are a strong match for multi-cloud architectures, and why multi-cloud is important.
Amanda is a Developer Advocate for DataStax. Her passion is bridging the gap between customers and engineering! Amanda graduated from Santa Clara University in 2012 with a Master's in Computer Science, and she also has a Bachelor of Science in Biology from the University of Washington. She is based in the Bay Area and has worked for HP, Lockheed Martin, Teradata, and Esgyn, an Apache Trafodion startup. Amanda is an Apache Committer and a member of the PMC for Apache Trafodion. She has worked on customer POCs, executive demos, AWS deployments, Python coding, data science workshops, conferences, Linux/Hadoop administration, and scripting. A little bit of everything! In her spare time, she loves running, hanging out with her dog, and finding reasons to go to Disneyland.
Ingest Data from Relational Databases to Cassandra with StreamSets by Pat Patterson
So you need to migrate some data from an existing relational database (RDBMS) to Apache Cassandra™. But how? Or how about ingesting from an RDBMS into a Kerberized DataStax cluster? What about a one-time batch load of historical data versus streaming changes? You could write and deploy custom code using a framework like Apache Spark. Sometimes that makes sense, but it often requires significant time and resources. So what are the alternatives? In this talk, we'll explore how you can use the open source StreamSets Data Collector to migrate from an existing RDBMS to DataStax or Apache Cassandra™.
Pat Patterson has been working with Internet technologies since 1997, building software and communities at Sun Microsystems, Huawei, Salesforce, and StreamSets. At Sun, Pat was the community lead for the OpenSSO open source project, while at Huawei he developed cloud storage infrastructure software. As a developer evangelist at Salesforce, Pat focused on identity, integration and the Internet of Things. Now Director of Evangelism at StreamSets, Pat helps enterprises unlock the value in their data.
