Building Data Orchestration for Big Data Analytics in the Cloud


Details
# RSVP: Co-host: Seattle Spark AI
Happy 2023! We're kicking off the year with our first meetup back in downtown Seattle, at the Common Room offices in Pioneer Square. Come for great technical content, discussions, and food!
The PyData Seattle meetup is accessible and community-driven, with novice- to advanced-level presentations in Data Science/ML/AI/DL.
Raffle!
- 2 tickets to PyData Seattle 2023: 1000 in-person attendees, 3-day conference, April 26-28, hosted by Microsoft. 1 ticket for a woman and 1 ticket for a man [ticket price > $500]. Sponsor: NumFOCUS
- 5 books! Learning Spark. Sponsor: Databricks, Denny Lee!
Agenda
- 6pm: Doors open, food, and networking
- 6:30pm-7:10pm: Building Data Orchestration for Big Data Analytics in the Cloud by Jasmine Wang and Shouwei Chen from Alluxio
- 7:15pm-7:55pm: Koushik Krishnan - Talk: Notebooks as Functions
- 8:15pm: Close up
Session 1: Building Data Orchestration for Big Data Analytics in the Cloud
Abstract:
Originally developed at UC Berkeley's AMPLab as the research project "Tachyon", Alluxio (www.alluxio.io) implements the world's first open-source data orchestration system in the cloud. Alluxio creates a unified access layer for data-driven applications in big data and ML, enabling Spark, Presto, TensorFlow, and other frameworks to transparently access different external storage systems while actively leveraging an in-memory cache to accelerate data access.
In this talk, the speakers will present:
- New trends and challenges in the data ecosystem in the cloud era
- Effective data engineering in the cloud world with data orchestration
- Production use cases of popular stacks like Presto/Alluxio/S3
Speakers
Jasmine Wang is the Head of Community and DevRel at Alluxio. She is a former national debate champion who turned into a traveling yoga teacher, with a strong passion for building teams and being the bridge at early startups in Silicon Valley. Previously, she worked as the Head of Global Talent Acquisition and Operations. Currently she is building the Alluxio open source community, responsible for community, developer relations, developer experience, and cross-community collaborations at Alluxio.
Dr. Shouwei Chen is a core maintainer and product manager of open-source Alluxio. Before joining Alluxio, Shouwei received a Ph.D. degree from Rutgers University. Shouwei's research focuses on the co-design of memory-centric computing frameworks with in-memory distributed file systems in large-scale environments.
Koushik Krishnan is a Site Reliability Engineer at Yugabyte.
Talk: Notebooks as Functions
Jupyter notebooks are a wonderful environment to write code for both beginners and experienced individuals. The hard part comes when you want to take your notebook and productionize it. That's where Jupyrest comes to the rescue. Jupyrest is a tool that can turn Jupyter notebooks into HTTP functions. It's a serverless platform for Jupyter notebooks. I created Jupyrest at Microsoft and open sourced it earlier this year. In this talk I'll demonstrate how to use Jupyrest to productionize your Jupyter notebooks.
PyData Seattle is looking for speakers. Many of our members are doing amazing data science with Python tools. We want to hear what you are up to! If you have a presentation of between 10 minutes and 1 hour that you would like to share with our group, please submit a short proposal.
You can propose a talk, workshop or lightning talk for our monthly meetups and TalkNights hosted in Seattle and Bellevue.
Fill in the form and let everyone know about the cool work you are doing: Here
Sponsor PyData Seattle
Host an event or provide some delicious food and snacks for our attendees.
Email us at pydataseattle@gmail.com or fill out the form [here](https://forms.gle/FYrxFdCQcM3SrQ9V9)
Thank you for your support of @NumFOCUS. Your participation helps us bring awareness to NumFOCUS, a 501(c)(3) nonprofit that supports and promotes world-class, innovative, open source scientific computing projects for data science, including pandas, NumPy, SymPy, IPython, Jupyter, Matplotlib, Julia, and many other cool open source data science projects.
Become a NumFOCUS Member!
Help sustain the open source data stack by becoming a NumFOCUS member.
NumFOCUS envisions an inclusive scientific and research community that utilizes actively supported open source software to make impactful discoveries for a better world.
