WEBINAR: Data Lakes and Enforcing Data Quality on AWS

Hosted By
John R.

Details

I am excited to announce that Ippon Technologies will host the first AWS User Group - New Jersey Meetup on Wednesday, April 29th, 2020, starting at 5:30pm.

We're holding this inaugural Meetup online. It is the rescheduled event that was originally supposed to be held on March 11th.

The event starts at 5:30pm.

Talk #1 - Data Lakes on AWS

Speaker - Shashi Raina - Partner Solution Architect at Amazon Web Services

Abstract:

Data lakes have become a staple for companies looking to empower their business with data. Depending on the amount of data it holds, a data lake can be very cumbersome and difficult to manage. AWS has several services that aim to simplify the process of data lake development and hydration. During this talk, AWS Solutions Architect Shashi Raina will discuss the ins and outs of data lakes on AWS. His talk will cover S3, the Lake Formation service, and how you can start building a data lake on AWS quickly to empower your business with data-driven decisions.

Talk #2 - Enforcing Data Quality Management with AWS

Speaker - Dan Ferguson, AWS Pro Solution Architect at Ippon Technologies

Abstract:

Data Quality Management is the process of auditing, balancing, and controlling the data moved across an ETL pipeline. Proper data quality management, or DQM, ensures the data in your data lake or your data warehouse matches the data in your source systems. Without quality data, you cannot derive relevant conclusions or build accurate models to help your business. The stateful nature of ETL pipelines makes DQM a tedious and time-consuming endeavor that usually gets deprioritized. It is very easy for teams to focus on the completion of their ETL tasks over the integrity of the data.

Event-driven architectures are the best way to circumvent the shortcomings of a stateful system. An event-driven architecture automatically updates relevant parties with information that would otherwise be stateful. Event-driven architectures are also well-suited to decoupling application components. In this talk, we will discuss the key components of an ETL pipeline, the corresponding DQM harness over that pipeline, and how an event-driven architecture can decouple these moving pieces while still enforcing data quality. We will explore these concepts using AWS cloud services and discuss the pros and cons of the sample architectures.

AWS User Group - New Jersey