Good day to everyone. Hope you are enjoying the summer!
Our upcoming June 10th Meetup will be hosted by Carbonite. A big thank you to Meetup member Jamie Cooley for arranging this.
Food and drink for this event will be sponsored by Extreme Reach.
Please note the address for this event, which is at the Christian Science Plaza, near the Prudential. There's a variety of parking options available, and the Prudential T on the Green Line is directly across the street. Hynes, Back Bay Station, and Copley Square are all within 3-4 blocks. (Carbonite is using Wayfair Inc. meeting space.)
Here is the location: http://goo.gl/maps/D9bl2
Security will be in place, so an RSVP is required for attendance.
The main topic is Big Data, and the event includes the following presenters:
* Jaspersoft Corporation
Carbonite has developed a robust platform for online backup that has scaled to support our rapid growth. Our sophisticated cloud storage solution comprises thousands of servers responsible for managing more than 100 petabytes of our customers’ critical data. To date, we have backed up nearly 300 billion files and continue to add approximately 350,000,000 new files per day. We have now embarked on developing our next-generation cloud storage service platform, supporting a broad range of new data services and applications, with a plan to scale to more than 500 petabytes.
This session will tell the story of how Carbonite utilized the Amazon Web Services platform to quickly build and deploy a cloud-based ‘Big Data’ solution to answer questions for our Product team regarding the types of information we manage for our customers.
About the presenter: Matt Drayer is an Architect with Carbonite’s Office of the CTO. He specializes in the design and development of highly-scalable distributed systems and service-oriented architectures.
If you're deploying 'Big Data' you are almost guaranteed to use, or at least seriously evaluate, cloud-based deployments. When scaling up (or down), the cloud is the natural choice for Big Data solutions due to its ease of deployment, scalability, flexibility, and cost effectiveness. Getting the data into Hadoop, for example, is pretty easy. The big question is how to make meaningful and timely sense of all that data to drive business decisions.
This session shows how to get data out of various Hadoop implementations and into analyst hands. The session is focused on live demonstration using Amazon EC2, Cloudera CDH (HBase and Hive), Cloudera Impala, and Jaspersoft BI for AWS.
- A brief comparison of Hadoop data retrieval methods, including programmatic approaches
- Demonstrations of reporting, analytics, and dashboarding, both batch and near-real time
- Jaspersoft BI for AWS pricing model to help you determine if a cloud-based BI solution makes sense for your organization
- References and suggestions for free training and tutorials to learn more
About the presenter: Mary Flynn is a long-time Jaspersoft veteran. She has fulfilled various technical marketing roles and was the director of the worldwide sales engineering team. She is currently a technology evangelist.
The Stackdriver Intelligent Monitoring Service can collect more than 10M measurements per day from a customer's application environment. To transform this large set of raw data into meaningful and insightful information, Stackdriver performs a number of different types of data analysis. All of these analyses use AWS services extensively. Raw data is archived in S3 for accessibility and persistence. The bulk of the analysis is performed with Hadoop delivered via AWS Elastic MapReduce (EMR). Some results are recorded in our EC2-hosted Cassandra cluster for the customer to consume via the web app. Other results are pushed to the customer proactively using the AWS Simple Notification Service (SNS) and the Simple Email Service (SES). This talk will describe how we combine the AWS services with our Python-based code base to provide our customers with intelligent analysis of their application infrastructure. We will help others understand how to use similar approaches for analysis in their applications.
Presenter: Patrick Eaton with Stackdriver.
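For readers who want a feel for the map/reduce style of analysis mentioned in the Stackdriver abstract before attending, here is a minimal sketch in plain Python. It is not Stackdriver's code and does not touch any AWS service: the sample records, the hourly bucketing, the function names, and the alert threshold are all assumptions made up for illustration. It only shows the general shape of the idea, which is to map raw measurements to keys, reduce each group to a summary, and pick out the results worth pushing to someone.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw measurements: (metric name, Unix timestamp in seconds, value).
RAW = [
    ("cpu", 0, 10.0),
    ("cpu", 60, 30.0),
    ("cpu", 3600, 50.0),
    ("disk", 30, 5.0),
]


def map_phase(records):
    """Emit ((metric, hour), value) pairs, one per raw measurement."""
    for metric, ts, value in records:
        yield (metric, ts // 3600), value


def reduce_phase(pairs):
    """Group values by key and reduce each group to its average."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: mean(values) for key, values in groups.items()}


def hourly_averages(records):
    """Run the map and reduce steps in-process on a small list."""
    return reduce_phase(map_phase(records))


def alerts(averages, threshold):
    """Return the (metric, hour) keys whose average exceeds the threshold."""
    return sorted(key for key, avg in averages.items() if avg > threshold)


if __name__ == "__main__":
    averages = hourly_averages(RAW)
    print(averages)
    print(alerts(averages, threshold=25.0))
```

In a real deployment of this kind, the map and reduce steps would run as distributed jobs over archived data rather than over an in-memory list, and the alert step would hand off to a notification service; the sketch keeps everything local so it can be run as-is.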