
Making Big Data Easy - Building a Self-Service Data Platform in the Cloud

Hosted By
mark

Details

Agenda:
Welcome & Introductions: How Sonic Drive-In uses Qubole

The Qubole Story
Getting Started with Big Data Platforms in the Cloud
Architecture and Data Flow
Hadoop, Hive, Spark, Presto, Airflow and more: choosing the right tool for the right job
Deep Learning Applications: Example Deep Learning Models for Different Use Cases
Q&A

[Event Overview]

Big Data is all the buzz these days. Data Science is a rapidly growing, high-demand profession. Deep Learning and AI are the new hot topics in tech. Business Intelligence Developers are becoming better known today as Data Engineers. A vast majority of organizations are standardizing around Open Source technologies (and Open Source moves fast!).

Every organization has data that, if analyzed, could yield significant business value. You may already have a Data Warehouse, Data Lake, some large RDBMS or NoSQL instances, or even a ton of massive CSV files. That is a great foundation to start from, but then the question becomes: how do we transition to more advanced analytics techniques to satisfy new use case requirements for the business?

So what’s the best way to get started with Big Data (without a big headache and the common pitfalls)? What if you don’t come from a Big Data background? Maybe you’re further along on the spectrum and have a Big Data deployment, but now need to scale and make data accessible to various data consumers across your organization (technical and non-technical). And last but not least, what’s all this hype about Big Data in the Cloud?

In this talk, Qubole will cover why a cloud-native, self-service Big Data platform is not only the easiest way to get started with big data, but also the most agile and scalable way to evolve into a data-driven organization. Big Data and Data Science experts from Qubole will also share and demonstrate how leveraging the automation and intelligence capabilities of the public clouds is a key best practice for empowering data analysts, data engineers, data scientists and DataOps/DevOps practitioners.

The presentation will flow from getting started, through best practices and common pitfalls to avoid, all the way to very advanced analytics, with an overview and demo of Deep Learning from Qubole’s Resident Data Scientist.

About the Speakers:

Horia Margarit

Horia is a Resident Data Scientist at Qubole, where his work spans data science product development, engineering, and helping customers build and deploy cutting-edge data science use cases such as NLP, Segmentation, Text Analytics and Deep Learning. Horia has over 5 years of experience building and deploying machine learning systems to production. He has worked for large corporations in consumer search and for leading FinTech startups, and he co-founded and ran his own enterprise cloud startup. Horia earned dual bachelor's degrees in Cognitive and Computer Science from UC Berkeley, as well as a master's degree in Statistics from Stanford University.

Suraj Bang

Suraj is a Solutions Architect at Qubole, where he brings over 13 years of experience in data analytics and engineering to help customers on their big data journey. He has subject matter expertise in building big data applications with Apache Spark and other Open Source technologies such as Apache Zeppelin. Prior to Qubole, Suraj worked as a Data Engineering Lead, building various big data applications for financial, retail and insurance organizations.

Logan Spangler

Logan is the Account Manager and Big Data Community Programs Lead for the Central & East US at Qubole. He works with customers including Turner Broadcasting, Sonic Drive-In, LL Bean and Gannett to help their organizations and data teams be more successful and ambitious with their data strategy and use cases.

About Qubole:

Qubole is the leading cloud-agnostic big-data-as-a-service company and pioneer of the industry’s first Autonomous Data Platform. Processing an Exabyte of data every month, the Qubole Data Service (QDS) provides a single platform for ETL, reporting, ad-hoc analysis, stream processing and machine learning, helping data teams at companies such as Lyft, Pinterest, Oracle, Adobe, Expedia and Under Armour be more productive while reducing the costs of their data initiatives. QDS runs on AWS, Microsoft Azure and Oracle Bare Metal Cloud, taking full advantage of the elasticity and scale of the cloud. It also supports the leading open-source engines, including Apache Spark, Hadoop, Presto, Hive and others – all optimized for the cloud.

Qubole was co-founded in 2011 by Ashish Thusoo and Joydeep Sen Sarma, creators of Facebook’s original self-service big data platform and Apache Hive.

Big Data in Oklahoma City
Star Space 46
1141 W Sheridan Ave · Oklahoma City, OK