Building a Memory-Centric Data Pipeline with Apache Spark

Hosted by Nic Raboy

Details

For an operational database, Spark is like Batman's utility belt: it handles a variety of important tasks, from data cleanup and migration to analytics and machine learning, that make the operational database much more powerful than it would be on its own. In this talk, we describe the Couchbase Spark Connector, which lets you easily integrate Spark with Couchbase Server, an open-source distributed NoSQL document database that provides low-latency data management for large-scale, interactive online applications. We'll start with common use cases for Spark and Couchbase, then cover the basics of creating, persisting, and consuming RDDs and DataFrames from Couchbase's key/value and SQL interfaces.

Couchbase Silicon Valley
Couchbase
2440 W El Camino Real 101 · Mountain View, CA