What we're about
This is a meetup for Seattle / Eastside users of Spark (www.spark-project.org), the high-speed Scala-based cluster programming framework. We'll be rotating among locations in Seattle and Bellevue.
We'll also discuss other Spark and AI projects including spark-packages (http://spark-packages.org/), MLflow, TensorFlow, Keras, etc.
We will include introductions to the various Spark+AI features, case studies from current users, best practices for deployment and tuning, and future development plans.
Follow us on Twitter at @SparkAISeattle
Upcoming events (4)
Event Program:
✅ 6:00 - 6:30 - Eat, drink, and network!
✅ 6:30 - 7:15 - Talk: Irena Kushner - Migrating Bespoke ML to Horovod & Databricks
✅ 7:15 - 8:00 - Simon Whiteley + Denny Lee: Ask Us Anything
✅ 8:00 - 8:30 - Eat, drink, and network some more!
Part 1️⃣
Scaling machine learning pipelines can be challenging, especially for bespoke models that don't fit neatly into distributed frameworks like SparkML. Last year, our team faced these challenges with a highly complex transformer neural network, which struggled to operate at scale. We migrated the model pipeline from a single-node instance to distributed tooling on Databricks and Horovod, unlocking new levels of scalability and observability.
This talk will dive into the challenges we faced, the tools and techniques we used to overcome these challenges, and the benefits of parallelizing machine learning workflows. You'll leave with an understanding of how to scale your own model pipelines, regardless of their complexity.
Part 2️⃣
Join Simon Whiteley and Denny Lee for a LIVE "Simon and Denny - Ask Us Anything!" where they will answer your data engineering questions, from building a data platform to ingestion, ETL, and analytics. Drawing on backgrounds that span SQL Server and BI through Apache Spark and Delta Lake, they want to show you how to build your own lakehouse.
As this session is interactive, come prepared to ask questions throughout! Expect another geeky, trans-Atlantic event from two data nerds.
⭐ Meet the Speakers ⭐
Irena Kushner is a machine learning engineer at Accolade. Over several years on the data science team she has collaborated with applied scientists, clinicians, and other engineering teams to bring ML models to production. Her work focuses on operationalizing ML pipelines at scale.
Simon Whiteley is a Microsoft Data Platform MVP, Databricks Nerd, Cloud Herder, and one of those “London People”. Simon runs Advancing Analytics, a UK data consultancy, as its Director of Engineering, hosts the Advancing Spark YouTube channel, and organises the Microsoft Data London meetup. He spends most of his time working with companies to apply "big data" thinking to traditional analytics problems, whether through cloud architecture, Spark development, or Azure wrangling. When not tinkering with tech, Simon is a death-dodging London cyclist, a sampler of craft beers, an avid foodie, and a lover of all things nerdy.
Denny Lee is a Developer Advocate at Databricks. He is a hands-on distributed systems and data science engineer with extensive experience developing internet-scale infrastructure, data platforms, and predictive analytics systems for both on-premises and cloud environments. He also has a Master's in Biomedical Informatics from Oregon Health & Science University and has architected and implemented data solutions for enterprise healthcare customers. His current technical focuses include Distributed Systems, Apache Spark, Deep Learning, Machine Learning, and Genomics.
Come and celebrate with us! The PyData 2023 Pre-Conference meetup PARTY!
🥳🧑🎤👩🎤 Cohost: PyLadies Seattle + Women Techmakers Seattle + Seattle Spark + AI
Event program:
6:00 - 6:30 - Eat, drink, and network. Sponsors: Databricks, Delta Lake, Anyscale 🎉
6:30 - 6:45 - Announcements
6:45 - 7:30 - Talk: Chengyin Eng - Introducing Spark Connect - The Power of Apache Spark, Everywhere
7:30 - 8:15 - Panel Discussion - The past, present and future of open source. Denny Lee and Jules Damji - Learning Spark book authors 🌎📚💕 Host: Eloisa Elias T 🤠 WTM Ambassador
8:15 - 8:30 - Raffle
8:30 - 8:40 - Pie bar! 💯🎉🥧
Chengyin Eng is a Senior Data Scientist on the Machine Learning Practice team at Databricks. She is experienced in developing and productionizing scalable machine learning solutions for cross-functional clients. She regularly collaborates with the product and engineering team to help shape the direction of MLOps products on Databricks.
She also teaches ML in production and deep learning courses. She has spoken at Open Data Science Conference, Data and AI Summit, Women in Data Science, etc. Outside of work, she enjoys connecting with friends and watching crime mystery films.
Jules S. Damji is a lead developer advocate with the Ray team at Anyscale Inc, an MLflow contributor, and co-author of Learning Spark, 2nd Edition. He is a hands-on developer with over 25 years of experience. He has worked at leading companies, such as Sun Microsystems, Netscape, @Home, Opsware/LoudCloud, VeriSign, ProQuest, Hortonworks, and Databricks, building large-scale distributed systems. He holds a B.Sc and M.Sc in computer science (from Oregon State University and Cal State, Chico, respectively) and an MA in political advocacy and communication (from Johns Hopkins University).
============================================================
💚💙 PyData Seattle 2023 3-day conference, April 26 - 28 Hosted by Microsoft
Join us as a volunteer at PyData Seattle conference 2023 🚀 Volunteers will receive free admission to the conference.
Everyone is welcome to apply. 🌷🎉 HERE 💟
Please complete the sign-up form by the 26th of March, 2023.
🌎 💕 World class speakers 🤠 We are excited that this large event is returning to Puget Sound
💖 🚀 Keynote Speakers
Katrina Riehl - NumFOCUS Board of Directors President & Head of Streamlit Data Team at Snowflake
Travis Oliphant - CEO at OpenTeams LLC and Quansight. Creator of NumPy, SciPy, and Numba.
Peter Wang - CEO at Anaconda, Inc.
Holden Karau - Co-author of O'Reilly's Learning Spark and High Performance Spark.
For more information about the conference and to find tickets, visit here 💚💙
10% member discount HERE: 💟
NumFOCUS is a 501(c)(3) nonprofit. All proceeds from the PyData Seattle 2023 Conference benefit the NumFOCUS public charity. Proceeds are used for the continued development of open-source tools used by data scientists and the advancement of the NumFOCUS mission to promote sustainable high-level programming languages, open code development, and reproducible scientific research. NumFOCUS supports and promotes world-class, innovative, open source scientific computing projects including: pandas, NumPy, SymPy, IPython, Jupyter, Matplotlib, R, and Julia.
💛 💜 Become a NumFOCUS Member!
Help sustain the open source data stack by becoming a NumFOCUS member
The team at NumFOCUS will be hosting the PyData Seattle 2023 conference April 26-28 at the Microsoft Conference Center in Redmond, for around 1,000 in-person attendees: https://pydata.org/seattle2023/ We are excited that this event is returning to Puget Sound after the pandemic caused a longer hiatus than any of us hoped for.
10% member discount HERE: 💟
Join us as a volunteer at PyData Seattle conference 2023 🚀 Volunteers will receive free admission to the conference.
Everyone is welcome to apply. 🌷🎉 HERE 💟
Please complete the sign-up form by the 26th of March, 2023.
🌼 PyData is organized by, and all proceeds benefit, NumFOCUS, a US 501(c)(3) public charity. Proceeds are used for the continued development of open-source tools used by data scientists and the advancement of the NumFOCUS mission to promote sustainable high-level programming languages, open code development, and reproducible scientific research. NumFOCUS supports and promotes world-class, innovative, open source scientific computing projects including: pandas, NumPy, SymPy, IPython, Jupyter, Matplotlib, R, and Julia.
For more information about the conference and to find tickets, visit https://pydata.org/seattle2023/.
We would love to see representation from the Open Source Seattle Spark+AI community and welcome you all to submit a proposal to the CFP or join the event as an attendee. If you have any questions, please email pydataseattle @ (gmail).com. Thank you for sharing this information and helping us spread the word! 🌎 💕
Everyone is welcome to apply. 🌷🎉 HERE 💟
Volunteers will receive free admission to the conference.
Example Volunteer Duties Include:
- Check-in at registration desk
- Introduce speakers
- Ensure talks start and end on time
- Moderate Q&A
- Answer or appropriately redirect questions from attendees and speakers
- Appropriately redirect any Code of Conduct complaints
Please complete the sign-up form by the 26th of March, 2023.
Note: No travel or accommodations are covered.
---------------------------------------------
💚💙 🎉PyData Seattle 2023 3-day conference, April 26 - 28 Hosted by Microsoft
10% member discount HERE: 💟
🌎 💕
PyData brings together users and developers of data analysis tools to share ideas and learn from each other. The goal is to give data science enthusiasts across various domains a place to discuss how best to apply languages and tools to the challenges of data management, processing, analytics, and visualization.
🌷 🌸 🌼 We are hosting a diversity panel event. Panelists will represent a diverse array of experience and areas of practice in the field of data science, ML and AI.
💚💙 Sponsorship
Contact us at [masked] or fill out this form here
NumFOCUS is a 501(c)(3) nonprofit that supports and promotes world-class, innovative, open source scientific computing projects including: pandas, NumPy, SymPy, IPython, Jupyter, Matplotlib, and Julia.
The mission of NumFOCUS is to promote sustainable high-level programming languages, open code development, and reproducible scientific research. We accomplish this mission through our educational programs and events as well as through fiscal sponsorship of open source scientific computing projects. We aim to increase collaboration and communication within the data science and scientific computing community.
💛 💜 Become a NumFOCUS Member!
Help sustain the open source data stack by becoming a NumFOCUS member
NumFOCUS envisions an inclusive scientific and research community that utilizes actively supported open source software to make impactful discoveries for a better world.
💖 🚀 PyData conference videos
Past events (99)