December edition of MOHUG - early this month!

Hosted By
Jeff G.

Details

Note that we're meeting early this month! Our usual schedule (the last Tuesday of every even month) isn't a good fit for the holiday season, so we're moving the meeting up to December 8th.

We're still meeting at Fuse as normal.

Topics

Knowledge from Noise: Geospatial Analytics at Progressive - Brian Durkin from Progressive

How do you visually analyze trillions of records using only millions of pixels? Progressive needed to solve this big data challenge with Snapshot, its industry-leading usage-based insurance offering. Learn how data scientists on Progressive’s product research and development team integrated Hadoop, D3, and Tableau into the company's technology stack to enable quick data exploration and rapid hypothesis testing. See how noisy vehicle telemetry data can lead to unexpected results... and new insights.

Brian Durkin is an innovation strategist in Progressive's Enterprise Architecture Organization. Throughout his eleven years at Progressive he has played many roles, ranging from application developer to enterprise architecture consultant, the common thread being a passion for making data more useful. He is currently part of the product research and development team focusing on geospatial analytics for usage-based insurance, where he uses technology to power data exploration, ideation, and rapid hypothesis testing on big datasets.

Overview of how running Hadoop in AWS differs from running traditional on-premises clusters - Erik Swensson from Amazon Web Services

With all the press around AWS building new data centers here in central Ohio, I thought it would be great if we could get it straight from the source. If you've thought about running Hadoop in the cloud, but have a bunch of questions, this is a talk you don't want to miss.

Erik is an experienced cloud solution architect who has been helping companies use the cloud to drive their business for 5+ years. He is the author of the Big Data Analytics Options on AWS whitepaper (https://d0.awsstatic.com/whitepapers/Big_Data_Analytics_Options_on_AWS.pdf) and a few big-data blog posts, which can be found here (https://blogs.aws.amazon.com/bigdata/blog/author/Erik+Swensson). He is currently a Solution Architect & Manager at AWS.

Kudu: A new storage layer for Hadoop - Brandon Freeman from Cloudera

Storing data in Hadoop generally means a choice between HDFS and Apache HBase. The former is great for high-speed writes and scans; the latter is ideal for random-access queries -- but you can't get both behaviors at once. Kudu, the new storage engine built by Cloudera, will combine the best of both HDFS and HBase in a single package and could make Hadoop into a general-purpose data store with uses far beyond analytics.

This presentation will give an overview of Kudu, the motivations behind creating it, and what’s available in beta today. After the presentation, a demonstration of Kudu will showcase fast analytics on fast data. Additionally, Kudu and Impala have been submitted to the Apache Incubator.

Brandon recently joined Cloudera from Explorys in Cleveland, OH, where he was the Infrastructure Architect for hundreds of Hadoop nodes in production and non-production environments. As one of the architects, he was responsible for scalability, stability, performance, hardware selection, and assessing various technologies for adoption within Explorys. Learn more about Kudu here (http://getkudu.io/).

MODUG: Mid-Ohio Data User Group
Cardinal Health FUSE
4305 W. Dublin Granville Rd. · Dublin, OH