Enriching Data using PySpark and Hive in a Cloud Environment

Name: Enriching Data using PySpark and Hive in a Cloud Environment
Start: 2021-03-09T12:00:00-06:00
End: 2021-03-09T13:30:00-06:00

Hosted by Future of D. and Nicolas P.

Future of Data: Austin

Details

In this meetup, we’ll put ourselves in the shoes of an electric car manufacturer that is rolling out a recently developed electric motor in its new cars. To track and analyze this expensive new motor, we’ll show how to use PySpark to pull data from multiple locations within the company’s data warehouse, stitch it together, and create an enriched dataset that can help solve both engineering and business challenges. As a bonus, the data engineering platform we’ll use lets us monitor our data processes from one centralized location, all within a cloud-native environment on the Cloudera Data Platform.
Come join us to see how we’ve linked these concepts together, and hopefully come away inspired to build similar solutions of your own!

For a preview of the content we'll be covering, we've got the following resources:

Video:
https://youtu.be/dXu4hZAeI8E

Blog:
https://blog.cloudera.com/next-stop-building-a-data-pipeline-from-edge-to-insight

Tutorial:
https://www.cloudera.com/tutorials/enrich-data-using-cloudera-data-engineering.html?utm_source=mktg-community&utm_medium=meetup

Cloudera Users Page:
https://www.cloudera.com/users.html

Because of the ongoing COVID-19 pandemic, this will be an online event. Use the link provided to participants upon registration to view and interact with the live stream.

Sponsors

Cloudera
