What we're about

Data Engineers, Data Scientists, Data Architects - if you build pipelines, engineer features, perform exploratory data analysis or data profiling, apply Data Science models, or implement Machine Learning workflows - let's share knowledge and experience.

Upcoming events (1)

Computing distance using Geohash through Spark + Intro to Kedro

Details:
QuantumBlack has agreed to sponsor our April Meetup. There will be a couple of talks and lots of networking, along with pizza and beer! Please bring an official ID to get into the building.
-------------------------------------------------------------------------------------------------
Agenda:
6:00pm - 6:45pm Meet and greet - Welcome
6:45pm - 7:45pm
- Computing distance using Geohash through Spark - Talk by Yuhao Zhu & Mayur Chougule
- Intro to Kedro - A workflow development tool
7:45pm - 8:15pm Networking
-------------------------------------------------------------------------------------------------
Title: Computing distance using Geohash through Spark + Intro to Kedro
-------------------------------------------------------------------------------------------------
Speakers: Yuhao Zhu, Mayur Chougule & Yetunde Dada
-------------------------------------------------------------------------------------------------
Abstract:

Computing distance using Geohash through Spark:
Geohashing is a method for encoding latitude and longitude and grouping nearby points on the globe at varying resolutions. With Geohash, it is easy to find data in a given region or run a k-nearest-neighbor query. In this meetup, we will look at the Spark Geohash pipeline to solve some real-world problems.

Intro to Kedro:
Spending long hours hunting down data sources, cleaning data and building ML models only makes sense if your work can be deployed and deliver value for users. You'll learn how easy it is to apply software engineering principles to your data pipeline using a Python library called Kedro. Kedro is a workflow development tool that helps you build data pipelines that are robust, scalable, deployable, reproducible and versioned.

We provide a standard approach so that you can:
- Spend more time building your data pipeline
- Worry less about how to write production-ready code
- Standardize the way that your team collaborates across your project
- Work more efficiently
-------------------------------------------------------------------------------------------------
Speaker Bio:

Yuhao Zhu is a Data Engineer at QuantumBlack. He works on building architecture and pipelines for analytics and machine learning projects. Prior to QuantumBlack, Yuhao worked at Nasdaq. He earned his Master’s Degree in Computational Science from Harvard University and his B.S. from the University of Illinois at Urbana-Champaign.

Mayur Chougule is a Data Engineer at QuantumBlack. He is experienced in architecting and building reactive software systems as well as data platforms on AWS and Azure. Prior to QuantumBlack, Mayur worked at Enviance Inc., a compliance and risk management startup, and founded a parking management startup.

Yetunde Dada is a Product Manager at QuantumBlack. She works with an incredible team to use software to solve problems for data engineers and data scientists. Prior to QuantumBlack, Yetunde worked as a Data Product Manager at Barclays, where she worked on a variety of analytics products and projects. She has an MBA from the Said Business School, University of Oxford, and a BEng in Mechanical Engineering from the University of Pretoria.
-------------------------------------------------------------------------------------------------
Sponsors:
This event is sponsored by QuantumBlack, a McKinsey Company.
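
For readers new to the idea in the Geohash abstract above, here is a minimal pure-Python sketch of the standard geohash encoding and of grouping points by cell. It is not the speakers' Spark pipeline, which is not shown on this page. The function names `geohash_encode` and `group_by_cell` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Standard geohash base-32 alphabet (the letters a, i, l and o are not used).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=9):
    """Encode a latitude/longitude pair as a geohash string.

    Hypothetical helper for illustration. Each bit halves the current
    longitude or latitude interval, starting with longitude, and every
    5 bits become one base-32 character.
    """
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate: longitude, latitude, longitude, ...
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def group_by_cell(points, precision):
    """Bucket (lat, lon) points by geohash cell at the given precision."""
    cells = defaultdict(list)
    for lat, lon in points:
        cells[geohash_encode(lat, lon, precision)].append((lat, lon))
    return dict(cells)


points = [(42.6, -5.6), (42.605, -5.603), (57.64911, 10.40744)]
print(geohash_encode(42.6, -5.6, precision=5))  # ezs42
print(group_by_cell(points, precision=5))
```

A shorter precision gives a larger cell, so truncating a geohash widens the search region. Points in the same cell share a prefix, though two close points can still fall on opposite sides of a cell boundary, which is why nearest-neighbor lookups usually check adjacent cells as well.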
