Cassandra Data Maintenance with Spark


Details
For this meetup we are excited to be joined by Kiyu Gabriel, Managing Principal Architect at DataStax.
What You'll Learn At This Meetup:
Most people hear "Spark" and think "Analytics". But Spark's ability to efficiently distribute and manage a full-table traversal while functionally transforming the data makes it well suited to executing "Big Data" maintenance jobs. For example:
• Remove records that meet certain complex criteria, along with their related records
• Synchronize/move content between tables
• Remove duplicates
• Roll cold records to alternate storage or slower media
• Perform complex backup and recovery (e.g. multi-tenant or complex criteria)
• Perform bulk changes
• Verify Data Integrity (validate that data meets specified signatures)
• Calculate metadata statistics
• Move / extract / transform data incrementally
We'll show some examples and discuss implementation caveats.
About Kiyu Gabriel:
Kiyu Gabriel has over 20 years of experience in development and operations, working with Fortune 100 companies, government, and startups. He is currently a Managing Principal Architect at DataStax, where he and his team provide consulting services to many of the largest users of Cassandra.
