High availability Hadoop and Apache Cassandra


Details
This month we look at recent enhancements that make Hadoop more available, and we have the chance to hear from Patrick McFadin, who is visiting from the Bay Area to give us an introduction to Apache Cassandra. We are really pleased to have WANdisco, EMC and DataStax sponsoring the beer and pizza.
Availability improvements to Hadoop: Paul Scott-Murphy (WANdisco) and Danny Elmarji & David Lloyd (EMC)
Current and forthcoming improvements to the open-source Apache Hadoop platform: simplifying and speeding up Hadoop cluster creation and management; HDFS-6469, coordinated replication of the HDFS namespace; breaking the geographic and availability constraints of Hadoop with high availability and DR capabilities; and making data services and cluster models flexible by separating storage from the datanodes. All of these will be shown in a live demonstration.
Introduction to Apache Cassandra: Patrick McFadin (DataStax)
Apache Cassandra is quickly becoming the choice for always-online operational databases. We will cover why and how this database works. Topics covered: the programming API and query language (CQL); how reads and writes work on a node; how data is replicated in one or more data centers; popular use cases for Apache Cassandra; and integration with Apache Spark.
