Cloud Control: Efficient Hadoop ETL Processing with 85% Spot Utilization

Details

Optimizing your use of AWS Spot Instances can save a ton on infrastructure costs, but how much work is it to constantly monitor fluctuating prices? And how do you deal with losing Spot nodes in your cluster without impacting your running jobs? At BloomReach, aggressive Spot utilization saves up to 85% within their Big Data ETL environment. And no one at BloomReach is actively monitoring or bidding: it’s all automated.

In this session, Jorge Rodriguez, Tech Lead on BloomReach’s data platform team, will explain how they set up their ETL environment to take full advantage of Spot Instances, and how they leverage autoscaling with Qubole to get the most out of their Spot usage while eliminating the drawbacks that make Spot Instance use so complex. Mike Ruiz, Solution Architect at AWS, will also review the best practices and patterns that make the most sense for Spot: monitoring utilization to find good Spot candidates, picking instance types, and understanding how availability affects pricing.

AWS San Francisco | Official Events
AWS Pop-up Loft
925 Market Street · San Francisco, CA