Continuous Data Management for Hadoop and Spark – On-Premise or in the Cloud


Details
Please join us and our awesome presenters at Boston's Liquid Art House on January 12th at 5:30 pm. Refreshments will be provided.
Topic: Continuous Data Management for Hadoop and Spark – On-Premise or in the Cloud.
KEYNOTE SPEAKER:
James Campigli, Chief Product Officer and Co-Founder, WANdisco
Jim has over 25 years of software industry experience at both early-stage and public companies. In his current role he is responsible for overseeing WANdisco's product strategy. In his previous role as a founder and chief technology officer (CTO) of Librados, an application integration software provider, Jim was responsible for overall product strategy and product messaging. He was also a member of the management team that led the company’s acquisition by NetManage, Inc. Following its acquisition, Jim joined NetManage as CTO for the Librados products group.
Prior to Librados, he was the vice president of product management for Insevo, a middleware company specializing in enterprise application integration. Jim also held senior product management, product marketing and consulting management positions at BEA Systems and SAP AG.
• Big SQL: Making all of your big data (http://www.ibm.com/software/data/bigdata/) SQL-accessible using an optimal execution strategy. Presenters: Rick Tarro and Jeffrey Carlson, IBM Analytics
Big Data makes it possible to inexpensively store and process petabytes of structured, unstructured and semi-structured data generated at incredible speeds. However, the ultimate benefits of big data are lost if fresh, fast-moving data is not analyzed as it happens. Fast data is about data in motion—immediate response and action.
The collection process for data in motion is essentially the same as for data at rest; the key difference is that the analysis occurs in real time, as data is generated and captured. To be meaningful, however, this analysis has to include the historical context provided by data at rest. This requires an enterprise-ready architecture that efficiently handles both data at rest and data in motion, with the following components:
- An enterprise-grade Big Data platform to support real-time analytics applications without downtime or data loss
- A flexible and agile cloud environment for cost-effective burst-out processing
- A data migration/replication engine that exceeds the most demanding application SLAs
This meetup will provide an overview of the “best in class” architecture required to harness the benefits of Big Data with “Continuous Data Management for Hadoop and Spark”.
Big SQL 4.1. Presenters: Rick Tarro and Jeffrey Carlson, IBM
Big SQL provides ANSI SQL access, via JDBC or ODBC, to data across any system, seamlessly, whether that data resides in Hadoop or in a relational database. This means that developers familiar with SQL can access data in Hadoop without having to learn new languages or skills. Big SQL also sets a new bar for performance: benchmark tests indicate that it executes queries 20 times faster, on average, than Apache Hive 12, with improvements ranging up to 70 times faster. It can query and combine data from many data sources, including (but not limited to) DB2 for Linux, UNIX and Windows database software, IBM PureData System for Analytics, Teradata, and Oracle.
With Big SQL, all of your big data is SQL accessible. It presents a structured view of your existing data, using an optimal execution strategy, given your available resources.
Parking Information
- Revere Hotel at 200 Stuart Street
$10 for 3 hours
- The outdoor Laz parking lot across the street, behind Smith & Wollensky
$7 ($5 off)
