Data movement & transformation between heterogeneous datastores (SQL & NoSQL)


Details
The HomeAway Datatools team built and open-sourced a product called "DataPull" (https://github.com/homeaway/datapull). The driving force behind this product is to enable developers to move data across technologies without having to write code or understand the nitty-gritty details.
It is built on top of Apache Spark and supports data movement and transformation across heterogeneous data stores (SQL & NoSQL) as well as streaming technology (Kafka). Capabilities include:
- Self-service
- No code, or bring your own JAR
- Heterogeneous data stores (SQL Server, MySQL, PostgreSQL, Cassandra, MongoDB, Elasticsearch, InfluxDB and Kafka)
- Schedule data transfers
- Parallel executions
- SparkSQL transformations
- Joins across heterogeneous databases as source
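
To make the last capability concrete, here is a minimal sketch of what joining heterogeneous sources means. It is not DataPull's implementation (DataPull does this on Apache Spark); it uses only the Python standard library, with an in-memory SQLite table standing in for a SQL source and a list of dictionaries standing in for a document store. All table, field, and function names are hypothetical.

```python
import sqlite3


def load_sql_rows():
    """Read rows from a SQL source (in-memory SQLite as a stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [(1, "Ana"), (2, "Raj"), (3, "Lee")],
    )
    rows = conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()
    conn.close()
    return [{"id": r[0], "name": r[1]} for r in rows]


def load_documents():
    """Hypothetical stand-in for documents read from a NoSQL store."""
    return [
        {"user_id": 1, "city": "Austin"},
        {"user_id": 3, "city": "Madrid"},
    ]


def join_sources(sql_rows, documents):
    """Inner join on users.id == documents.user_id via a hash lookup."""
    by_user = {d["user_id"]: d for d in documents}
    return [
        {"id": r["id"], "name": r["name"], "city": by_user[r["id"]]["city"]}
        for r in sql_rows
        if r["id"] in by_user
    ]


joined = join_sources(load_sql_rows(), load_documents())
print(joined)
```

Once both sources are reduced to a common row shape, the join itself no longer depends on where the data came from; a Spark-based tool applies the same idea at scale with DataFrames.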
Agenda:
5:30 pm: Food, Drinks and Networking
6:00 pm: Session 1 - Jenkins, Managed and Maintained via AWS
7:00 pm: Session 2 - Data movement & transformation between heterogeneous datastores (SQL & NoSQL)
8:00 - 8:30 pm: Food, Drinks and Networking
Speaker Bio:
Srinivas - Staff Database Engineer at VRBO. He is passionate about distributed systems and big data. He is the lead developer of and a contributor to DataPull, which is available for download under an Apache open source license. In his free time he plays cricket and travels around the world.
Please check out https://github.com/homeaway/datapull
