Online Data Council Event with Dremio & IBM


Details
To kick off our series of online events, we have 2 talks lined up, covering topics which should be most relevant for any Machine Learning and Data Engineering practice today.
First, Romeo Kienzler will introduce techniques for preserving privacy in Machine Learning, such as federated learning and homomorphic encryption.
Romeo's talk will be followed by Ryan Murray, who'll address the challenges (and best practices) involved in building data lakes in the cloud using storage layers such as S3 and ADLS.
Should not be missed!
1700-1715: Introduction
1715-1800: Talk 1 + Q&A
1800-1845: Talk 2 + Q&A
--------------------------------------------------------------------------------------
Talk 1: Privacy-preserving Machine Learning
Speaker: Romeo Kienzler, Chief Data Scientist at IBM Center for Open Source Data and AI Technologies (CODAIT)
Abstract:
Data privacy is a huge concern and often prevents ML and AI projects from flourishing. In this talk we’ll introduce you to federated learning and homomorphic encryption. After we’ve covered the theoretical aspects, we’ll see how they can be used in practice. We conclude with an outlook on the future of these technologies.
Speaker bio:
Romeo Kienzler is Chief Data Scientist at the IBM Center for Open Source Data and AI Technologies (CODAIT) in San Francisco. https://github.com/romeokienzler/me
--------------------------------------------------------------------------------------
Talk 2: Challenges when building a cloud-based data lake
Speaker: Ryan Murray, Principal Consulting Engineer at Dremio
Abstract:
Cloud data lakes have become a key ingredient in the data architecture and their adoption is on the rise. With countless vendors and providers joining the arena, it’s important to consider what the future implications are for each option. And most importantly, how can you set up your organization for long term success and avoid restrictive frameworks that many are trying to migrate away from today?
In this session, Ryan Murray shares how you can architect a sustainable, scalable cloud-based infrastructure for your analytics. He’ll talk about the latest open-source technologies, such as Apache Arrow and Apache Iceberg, and how they contribute to an open and flexible data lake platform. At the end, he’ll present a short demo of Dremio, a next-generation data lake query engine.
Speaker bio:
Ryan Murray has been a Principal Consulting Engineer in the professional services organization at Dremio since July 2019. He previously worked in the financial services industry, doing everything from bond trading to leading data engineering. Ryan holds a PhD in Theoretical Physics and is an active open source contributor who dislikes when data isn't accessible in an organisation. He is passionate about making customers successful and self-sufficient, and still dreams of one day winning the Stanley Cup.
