
What we’re about
Meet other local Python Programming Language enthusiasts!
Upcoming events (2)
- NYC Python x PyData NYC: 🗲 Talk Night at Datadog! 🗲
Datadog, New York, NY
Join us for an evening of technical presentations featuring two exciting talks on simulation and database observability!
📅 Date: Tuesday, October 1, 2025
🕕 Time: 6:00 PM - 8:00 PM
📍 Venue: Datadog NYC Office
🎥 Recording: This event will be recorded, including Q&A
Agenda:
🎤 Talk 1: Observability-Based Index Recommendations - Alex Weisberger
Alex, a full stack software engineer at Datadog, will discuss database observability and index optimization techniques.
Speaker Bio - Alex Weisberger:
Alex is a full stack software engineer at Datadog with experience across diverse environments including trucking navigation, sheet music recognition, karaoke apps, payment processing, commercial real estate SaaS, and database observability. Alex is passionate about correctness and reliability, with expertise in TLA+, property-based testing, and discrete event simulation.
🎤 Talk 2: Persona Based Evaluation of Search Systems - Uri Goren
Search evaluation is notoriously complex, with even minor tweaks to hyperparameters or embeddings creating widespread ripple effects across retrieved results. Mitigating this inherent “instability” in search algorithm changes has long been a challenge. Traditional approaches, such as composing test cases, offer a degree of control and consistency. However, writing and maintaining test cases is a painstaking task, particularly in dynamic environments where catalogs are constantly updated with new items. Our proposed method introduces a groundbreaking solution: leveraging user modeling inspired by the “LLM as a judge” paradigm to automate query and result-set generation. This approach dynamically creates realistic query-result pairs by simulating diverse user personas, each designed to evaluate different modalities (image, video, audio) within the catalog under varying prompts.
The innovation lies in the adaptability and efficiency of the system:
- Dynamic Test Coverage: Personas adapt to catalog changes, ensuring that new items are immediately evaluated without manual intervention.
- Multi-Modality Testing: By supporting multiple content types, the system mirrors the diversity of modern search use cases.
- Fully Automated Pipeline: The process eliminates the need for manually written test cases, reducing overhead and accelerating iteration cycles.
This talk will provide an in-depth look at the methodology, the benefits of dynamic user modeling, and real-world results from applying this system. Attendees will leave with actionable insights into transforming their search evaluation strategies, unlocking new levels of stability and precision in their algorithms.
***
⚠️ Registration: Please register here by September 30th at 12:00 PM as we need to give the guest list to security in advance. Please make sure you are using your real name (check your ID) to speed up the process of getting in.
This event is open to all levels. Newcomers and beginners are welcome. All NYC Python events are governed by our Code of Conduct.
- Git for Data: How Table Formats Unify Software and Data Development
## Details
Join PyData NYC at 125 W 25th St (Cockroach Labs) on October 15th at 6:00 pm for a talk night with Jacopo Tagliabue and Ciro Greco from Bauplan. Please sign up with your full official name and bring a government-issued ID.
🍕 Pizza and drinks sponsored by Bauplan and venue hosted by Cockroach Labs - thank you!
Agenda:
Git for Data:
Distributed version control systems - such as Git - unlock software development in multi-player mode: devs can safely work over the same code base, with standard (albeit perhaps not user-friendly!) abstractions for snapshotting, time-travel, and branching. Data folks have rarely been so lucky, as their projects crucially depend on data, whose life-cycle management is often cumbersome and custom. In this talk, we present open formats - such as Apache Iceberg - to practitioners with limited to zero exposure to modern cloud infrastructure. In particular, we show how moving from datasets to tables unlocks a similar multi-player mode when building data pipelines, with equivalent abstractions for snapshotting, time-travel, branching, and a unified backbone for pipelines, data science, and AI use cases.
## Speaker
Jacopo Tagliabue/Ciro Greco
Jacopo Tagliabue is the co-founder and CTO of Bauplan. Educated in several acronyms across the globe (UNISR, SFI, MIT), Jacopo was co-founder and CTO of Tooso, an AI startup acquired by TSX: CVO in 2019. He led Coveo's AI from scale-up to IPO, and built out Coveo Labs, a prolific R&D practice whose libraries, models, and datasets have garnered tens of millions of downloads. When not busy building products, he teaches MLSys at NYU and explores topics at the intersection of data, infrastructure, and AI. In previous lives, he managed to get a Ph.D., do sciency things for a pro basketball team, and simulate a pre-Columbian civilization.
Ciro Greco is co-founder and CEO at Bauplan, a serverless computing platform for complex data workloads. Formerly, he was the founder of Tooso, an NLP startup based in San Francisco. Tooso was acquired by Coveo in 2019, and Ciro was in the management team that brought Coveo to IPO in 2021. In a previous life, he got a PhD in Neuroscience at Milan-Bicocca, a postdoctoral fellowship at Ghent University, and he was a visiting scientist at MIT.
⚠️ Registration: Please RSVP here if you would like to attend: https://www.meetup.com/pydatanyc/events/310919622
All PyData NYC events are governed by the NumFOCUS Code of Conduct.