

What we’re about
PyData Pittsburgh is a community for data scientists, machine learning practitioners, and all professionals, students, researchers, and enthusiasts working with Python and data in Pittsburgh. Pittsburgh is an emerging tech hub, with world-class research universities, outposts of major technology companies, a dynamic ecosystem of homegrown startups, and a burgeoning robotics sector. Let's connect these dots to share ideas, learn from each other, and grow the local technology community.
Our members include researchers and tech professionals with decades of experience, novices who have yet to write their first line of code, and everyone in between. If you're interested in learning more about amazing, cutting-edge work happening with Python, data, and related technologies in Pittsburgh, you're in the right place, and you'll find a welcoming, supportive community of like-minded folks.
Have an idea for a future PyData Pittsburgh event? Fill out our Call for Proposals form and a member of our organizing team will get back to you!
Meetup is the primary place we post our events, but you can also find and connect with us on:
- Email & Web: https://news.pypgh.org
- Mastodon: https://pypgh.org/mastodon
- X/Twitter: https://pypgh.org/twitter
- LinkedIn: https://pypgh.org/linkedin
---
PyData Pittsburgh is also a node in the larger PyData network. PyData is an educational program of NumFOCUS, a 501(c)(3) non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
The PyData Code of Conduct governs this meetup. To discuss any issues or concerns relating to the code of conduct or the behavior of anyone at a PyData meetup, please contact the local group organizers (message us on the meetup page). Please also submit a report of any potential Code of Conduct violation directly to NumFOCUS. Thank you for helping us to maintain a welcoming and friendly PyData community!
---
Need to get in touch with the PyData Pittsburgh organizing team? You can reach us at organizers@pypgh.org.
Upcoming events (1)
PyCon 2025 Special Event: Hometown Heroes Hatchery Program
David L. Lawrence Convention Center, Pittsburgh, PA
PyCon US 2025 is coming to Pittsburgh this May 14–22, and PyData Pittsburgh is thrilled to be part of it! We’re hosting the Hometown Heroes Hatchery track on Saturday, May 17—a half-day event inside the conference celebrating the incredible work of Python developers, researchers, educators, and technologists from across our city. As part of PyCon’s Hatchery initiative, this track will feature presentations and lightning talks that highlight the creativity and impact of Pittsburgh’s Python community.
If you're attending PyCon US 2025, we invite the PyData Pittsburgh community to join us at the Hometown Heroes track—come connect, engage, and help showcase the strength of our local tech scene.
Please note: you must be registered for PyCon US 2025 to attend this event, and all attendees and speakers are responsible for securing their own tickets. You can find registration details for the conference here: https://us.pycon.org/2025/attend/information/.
HOMETOWN HEROES HATCHERY PROGRAM - May 17th
TALK SCHEDULE:
Decoding Spatial Biology with Python: Multi-Modal Insights into Breast Cancer Progression
Time: 01:45 PM - 02:15 PM
Speakers: Alex C. Chang, CMU-Pitt (Graduate Student PhD, Computational Biology) and Brent Schlegel, University of Pittsburgh School of Medicine (Graduate Student PhD, Integrative Systems Biology)
Python has rapidly become a cornerstone of scientific computing, computational biology, and bioinformatics due to its ease of use and scalability for handling large datasets, qualities that are critical in today’s “big data” era of clinical and translational research. As computational resources and data collection methods continue to expand, we are now empowered to ask larger and more clinically relevant questions that enable us to dissect complex biological systems with unprecedented detail. However, this surge in data complexity brings new challenges, from the integration of diverse data modalities to the need for sophisticated analytical methods capable of untangling intricate biological signals from background noise. In this talk, we describe how Python not only meets these challenges but also drives innovation through the development of novel bioinformatics tools like CITEgeist, a case study in harnessing Python’s capabilities for multi-modal spatial transcriptomics. Biological datasets often face challenges of high sparsity and noise. CITEgeist harnesses Python’s robust ecosystem to provide an efficient, scalable pipeline that deconvolutes messy spatial signals into actionable, clinically relevant features.
Exploring Energy Burden in Pittsburgh Neighborhoods with Python
Time: 02:30 PM - 03:00 PM
Speakers: Ling Almoubayyed, SmithGroup, Inc. (Project Manager) and Husni Almoubayyed, Carnegie Learning
National-level energy studies consistently find that energy burdens are a significant challenge, and that lower-income neighborhoods sometimes end up paying more for energy in cities including Pittsburgh. Using Python, we were able to extract and analyze data on energy consumption in the City of Pittsburgh, along with real-estate and geographic information system (GIS) data to compare trends in energy usage and burden across Pittsburgh neighborhoods, and across different housing types. We present statistical analyses and Python visualizations describing these trends across different features such as housing price, size, and neighborhood.
Bottling Tesla's Solar: A Solar Dashboard with Python
Time: 03:15 PM - 03:45 PM
Speaker: Christopher Pitstick (Sr. SWE)
Tesla's Powerwall/Inverter solar ecosystem is powerful yet notoriously opaque. For home labbers, extracting meaningful data can be daunting, but not impossible. In this talk, I'll share my journey of developing a custom solar dashboard using Grafana and PyPowerwall, navigating the quirks and closed nature of Tesla's ecosystem along the way. The backend is all Python, so I will demo my server code and dashboard to show how I was able to find hundreds of kilowatt hours in lost solar production. We'll also do a deep dive into the way I altered the Python server code to be able to query multiple inverters at the same time with complex iptables rules. This presentation may conclude with the value of installing solar on your home, and how self-monitoring is a critical component of every nerd's arsenal.
Strategies for Eliciting Structured Outputs from LLMs
Time: 03:50 PM - 03:55 PM
Speaker: Utkarsh Tripathi, Solventum (Machine Learning Engineer)
This lightning talk will provide a concise yet comprehensive overview of techniques for extracting structured, predictable outputs from Large Language Models. I will compare and demonstrate multiple state-of-the-art libraries (such as BAML, Instructor, LangChain, and SGLang, including how they work under the hood) and use pydantic, dataclasses, and similar tools to get structured outputs. We will explore practical examples of JSON schema enforcement, markdown formatting directives, and template-based approaches that dramatically improve downstream processing capabilities. The presentation will include code snippets and prompt templates that participants can immediately implement in their own projects.
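As a rough illustration of the dataclass-based validation the abstract mentions, here is a minimal sketch using only the standard library. It is not taken from the talk and does not use any of the libraries named above. The `Invoice` schema, the `parse_invoice` helper, and the `raw_reply` string are all hypothetical stand-ins for whatever a model actually returns.

```python
import json
from dataclasses import dataclass


@dataclass
class Invoice:
    """Hypothetical target schema for a model's JSON reply."""
    vendor: str
    total: float


def parse_invoice(raw: str) -> Invoice:
    """Parse a JSON string and check it against the Invoice schema.

    Raises ValueError if the text is not valid JSON or a field has the wrong type.
    """
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    vendor = data.get("vendor")
    total = data.get("total")
    if not isinstance(vendor, str):
        raise ValueError("'vendor' must be a string")
    # bool is a subclass of int, so reject it explicitly
    if isinstance(total, bool) or not isinstance(total, (int, float)):
        raise ValueError("'total' must be a number")
    return Invoice(vendor=vendor, total=float(total))


# Hypothetical model reply; a real pipeline would get this from an LLM call.
raw_reply = '{"vendor": "Acme", "total": 12.5}'
invoice = parse_invoice(raw_reply)
```

A caller could catch `ValueError` and re-prompt the model, which is one common way to use this kind of check.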
Does Generative AI Know Statistics?
Time: 03:55 PM - 04:00 PM
Speaker: Louis Luangkesorn, Highmark Health (Lead Data Scientist)
Generative AI has promise to impact many fields of endeavor. But experience has shown that it often has problems with nuance and context. This talk discusses some experiences using Generative AI as an aid in applied analytics and walks through an example that illustrates working around its weaknesses and taking advantage of its capabilities.
Demystifying How Animal Behavior Affects Disease Spread Using Python
Time: 04:00 PM - 04:05 PM
Speaker: Carolyn Tett, University of Pittsburgh (Research Technician)
Not all individuals contribute equally to disease spread. During COVID-19, social distancing reduced transmission for some, while high-contact individuals increased disease spread. Preventative measures for massive disease outbreaks, however, cannot rely solely on data from rare epidemic events. Instead, disease ecologists study animal models to understand how host behavior theoretically drives disease outbreaks. Tracking animal movement and interactions is essential for identifying transmission-relevant behaviors. In lab experiments, video recordings provide an abundance of behavioral data, now efficiently processed through automation, and coding languages like Python enable large-scale data analysis. The Stephenson Lab at the University of Pittsburgh uses Raspberry Pis to autonomously record guppies infected with an ectoparasite. These parasites transmit primarily through instances of close contact between hosts. Through autonomous video recordings, we generated 1,300 hours of footage, equivalent to 54 consecutive days of observation. Given that each video captures six guppies, manually tracking behavior would take tens of billions of days. Instead, animal tracking software reduces this processing time to a mere few months.
The Many-Colored Functions of Async Python
Time: 04:15 PM - 04:45 PM
Speaker: Bryan C. Mills, Duolingo (Senior Software Engineer)
You might think of functions in async Python in terms of “synchronous” and “async”, but the possibility of binding objects (such as Locks) to the asyncio event loop adds a whole new dimension to consider. We'll examine six vibrant kinds of functions and how they interact! This talk will examine code examples of how to adapt each kind of function to call other kinds, suggest design patterns that minimize the complexity of dealing with different kinds (such as non-blocking context managers), and examine patterns or libraries to safely synchronize concurrent calls involving multiple kinds of function.
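To make the basic "synchronous" versus "async" distinction concrete, here is a small sketch using only the standard library. It is not from the talk and does not attempt to reproduce its six kinds of functions. The function names are hypothetical, and `asyncio.to_thread` (Python 3.9+) is just one possible way to adapt a blocking function for use inside a coroutine.

```python
import asyncio
import time


def blocking_work(x: int) -> int:
    """A plain synchronous function that blocks its thread while it runs."""
    time.sleep(0.01)
    return x * 2


async def async_work(x: int) -> int:
    """A coroutine function; it must be awaited from a running event loop."""
    await asyncio.sleep(0.01)
    return x + 1


async def main() -> list[int]:
    # An asyncio.Lock is meant to be used from a single event loop,
    # so it is created here, inside the loop that uses it.
    lock = asyncio.Lock()
    async with lock:
        # Async -> sync: run the blocking call in a worker thread
        # so the event loop stays responsive.
        doubled = await asyncio.to_thread(blocking_work, 20)
    plus_one = await async_work(doubled)
    return [doubled, plus_one]


# Sync -> async: asyncio.run starts an event loop and runs the coroutine to completion.
result = asyncio.run(main())
```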
Automated Dependency Inference and its Applications
Time: 05:00 PM - 05:30 PM
Speaker: Jason R. Coombs, Microsoft (Principal Software Engineer)
Last summer, I launched the Coherent Software Development System (https://bit.ly/coherent-system) with the principle that one should not have to repeat themselves when developing more than one Python project. One of the key innovations of that system is coherent.deps, a system for deriving package dependencies from the imports that a project or script uses. I'll explore some of the background motivations from Google's monorepo, some prior art at Meta, and some of the approaches that failed (AI-based inference) before going into the details of the implementation (AST parsing, world-readable MongoDB database, Big Table query to PyPI downloads). I'll additionally talk about some of the applications of this generalized library (coherent.build, pip-run), some of the maintenance challenges (expensive query, refresh interval), and possible other applications (on-demand dependency loader).
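For readers unfamiliar with the AST-parsing step, the sketch below shows one way to collect the top-level module names a script imports, using the standard library `ast` module. It is an illustration only, not the coherent.deps implementation. The `top_level_imports` helper is hypothetical, and the harder step of mapping import names to PyPI distributions is not shown.

```python
import ast


def top_level_imports(source: str) -> set[str]:
    """Return the top-level module names imported by a piece of Python source."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import os.path" contributes "os"
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 marks a relative import (e.g. "from . import x"); skip those
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


example = (
    "import numpy as np\n"
    "from pandas.io import json\n"
    "from . import sibling\n"
    "import os.path\n"
)
found = top_level_imports(example)
```

Note that an import name does not always match the distribution name on PyPI, which is presumably why the abstract mentions a database lookup as a separate step.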
SPEAKER BIOS:
Alex C. Chang
Alexander Chih-Chieh Chang is a fourth-year MSTP student in the CMU-Pitt Computational Biology Ph.D. Program, mentored by Drs. Lee and Oesterreich. He earned a BS/BA in Chemical and Biomolecular Engineering/Sociology from Johns Hopkins University in 2021. Previously, during his undergraduate research in the lab of Rong Li, Ph.D., he conducted large-scale genomic screens to study proteomic dysregulation and spent a gap year in the lab of Manish Aghi, MD, PhD, studying breast cancer metastasis to the brain. Currently, as a computational biologist and medical student, he coordinates the Hope for OTHERS tissue donation program in the Lee-Oesterreich Lab and computational research projects in breast cancer metastasis and genomic evolution.
Brent Schlegel
Brent Schlegel is a first-year PhD student in Integrative Systems Biology at the University of Pittsburgh School of Medicine, co-mentored by Drs. Adrian Lee and Steffi Oesterreich. He earned his AS in Mathematics and Sciences from CCAC (2019) and a BS in Computational Biology from Pitt (2021). Most recently, he worked as a Bioinformatics Analyst at the UPMC Children’s Hospital of Pittsburgh, where he specialized in the integrative analysis of large, complex biomedical datasets. Now, Brent combines data science, computational modeling, and multi-omic integration to tackle the systems biology of invasive lobular breast cancer, using patient-derived organoid models and leveraging “big data” to uncover hidden patterns and drive innovation in diagnosis and treatment.
Ling Almoubayyed
Ling is an experienced architecture and urban designer with extensive project management expertise. Specializing in urban design, planning, community engagement, and spatial analysis, she has successfully led projects ranging from individual buildings to comprehensive urban districts. Ling uses evidence-based design with data gathered through stakeholder engagement to identify the best design solutions to create built environments. She is currently a Project Manager with SmithGroup.
Husni Almoubayyed
Husni Almoubayyed is the Director of AI at Pittsburgh-based education technology company Carnegie Learning. Husni uses machine learning and data science methods to conduct research in education, specifically in topics such as personalization, equity, and predictive analytics. Prior to his work in education technology, Husni acquired a Ph.D. in Astrophysics from Carnegie Mellon University, where he worked on mitigating biases in astronomical data to advance understanding of dark energy. Needless to say, Python is Husni's favorite programming language, and PyCon is one of his favorite events of the year!
Christopher Pitstick
Christopher, a passionate software engineer who installed solar panels on his home in 2024, quickly immersed himself in system analysis to optimize performance, expertise that directly inspired this presentation. His programming journey began at age 12 with QBasic, igniting a lifelong passion that led to roles at industry giants including Microsoft, Amazon, and Argo AI before his current position at Latitude. Throughout his career, Christopher has mastered multiple programming languages from C++ to Perl and Python, approaching coding both as a profession and personal passion. As a dedicated neurodiversity advocate, he regularly shares his experiences through public speaking engagements, raising awareness and empowering others in the tech community.
Utkarsh Tripathi
Utkarsh Tripathi is a Machine Learning Engineer at Solventum, Inc., where he works on Solventum™ Fluency Align™ and Solventum™ Fluency Direct™: AI-powered clinical documentation tools that leverage conversational and generative AI, along with ambient intelligence, to automate medical documentation. These solutions help reduce administrative work and physician burnout, while improving the overall patient care experience. Utkarsh holds degrees in Electrical Engineering, Chemistry, and Computer Science from BITS Pilani and the University of Chicago.
Louis Luangkesorn
Dr. Louis Luangkesorn is a Lead Data Scientist at Highmark Health where he works on projects applying statistical, predictive, operations research, and Generative AI models in use cases involving human resources and healthcare. He has contributed code to SciPy and a book appendix porting a simulation textbook's examples to SimPy.
Carolyn Tett
Carolyn is an ecologist who specializes in animal behavior and disease ecology. She works with guppies and their ectoparasites to better understand how host contact rate and physiological status impact disease spread. She captures guppy behaviors on video and uses Python to automate the video processing. Using these outputs, she quantifies guppy social metrics and runs statistical models to predict behavior-mediated parasite spread.
Bryan C. Mills
Bryan maintains Python core services at Duolingo, and was formerly a maintainer on the Go project at Google.
Jason R. Coombs
Jason has been a passionate contributor to Python and open source software since the '90s, is a core contributor to Python, and maintains hundreds of packages on PyPI.