Next Meetup

WEBINAR: ODSC West Online Warm-Up (Free)
We are very excited about ODSC West next month! As we get closer to the conference, we want to invite you to participate in ODSC West's Online Warm-Up. To access this webinar, please register using the link below:

We will feature 4 speakers from our upcoming ODSC West conference in San Francisco, each of whom will present a 30-minute session, including:
• Marc Fridson - Balancing ML Accuracy, Interpretability and Costs When Building a Model
• Sean Patrick Gorman, PhD & Steven Pousty - How to use Satellite Imagery to be a Machine Learning Mantis Shrimp
• Michael Mahoney, PhD - Matrix Algorithms at Scale: Randomization and using Alchemist to bridge the Spark-MPI gap
• 4th speaker TBD

Full Agenda Detail

Session 1 - Balancing ML Accuracy, Interpretability and Costs When Building a Model (30 Minutes)
Speaker: Marc Fridson
Bio: Marc Fridson is the Principal Data Scientist of Cross Brand Digital @ Carnival Cruise Line, a Part-Time Lecturer for the Applied Analytics Master's Program @ Columbia University and the founder of tech start-up Instant Analytics. He holds a B.S. in Industrial and Systems Engineering from Rutgers University.
Abstract: This workshop will use real-world coding examples in Python to demonstrate how to be mindful of accuracy, interpretability and cost constraints when developing your models.

Session 2 - How to use Satellite Imagery to be a Machine Learning Mantis Shrimp (30 Minutes)
Speaker: Sean Patrick Gorman, PhD & Steven Pousty
Bio: Sean is the Head of Technical Product Management at DigitalGlobe, helping build GBDX and next generation machine learning tools for satellite imagery. Sean received his PhD from George Mason University as the Provost's High Potential Research Candidate, Fisher Prize winner and an INFORMS Dissertation Prize recipient.
Steve is the Developer Relations lead for DigitalGlobe. He goes around and shows off all the great work the DigitalGlobe engineers do. Steve has a Ph.D. in Ecology from University of Connecticut. He likes building interesting applications and helping developers and data scientists do more with spatial data.
Abstract: In this session we are going to start by showing you how satellite imagery actually allows you to “see” in more bands of color than the mantis shrimp (how about 26 bands). Each band is a massive amount of data about the earth. Then we will show you how you can work with this data in Jupyter notebooks to extract all sorts of information about the world. Finally, we will wrap up with how to make ML models using this data, extract features we care about, and then run it through a cloud-based processing model.

Session 3 - Matrix Algorithms at Scale: Randomization and using Alchemist to bridge the Spark-MPI gap (30 Minutes)
Speaker: Michael Mahoney
Bio: Michael Mahoney is at the University of California at Berkeley in the Department of Statistics and at the International Computer Science Institute (ICSI). He works on algorithmic and statistical aspects of modern large-scale data analysis. He received his PhD from Yale University with a dissertation in computational statistical mechanics.
Abstract: In this session we will describe some of the underlying randomized linear algebra techniques. Finally, we'll describe Alchemist, a system for interfacing between Spark and existing MPI libraries that is designed to address this performance gap. The libraries can be called from a Spark application with little effort, and we illustrate how the resulting system leads to efficient and scalable performance on large datasets. We describe use cases from scientific data analysis that motivated the development of Alchemist and that benefit from this system. We'll also describe related work on communication-avoiding machine learning, optimization-based methods that can call these algorithms, and extending Alchemist to provide an IPython notebook <=> MPI interface.

What we're about

#ODSC brings together the open source and data science communities with the goal of helping its members learn, connect and grow.

The focus of this Meetup group is to allow #ODSC to work with Meetup groups, non-profits, and other organizations to present informative lectures, workshops, code sprints and networking events to help grow the use of open source languages and tools within the data science and data-centric community. As such, our specific goals are:

1. Build a collaborative group to work with other Meetup groups, non-profits, and other organizations.
2. Promote the use of open source languages and tools amongst data scientists and others.
3. Host educational workshops.
4. Spread awareness of new open source languages and tools that can be used in data science.
5. Contribute back to the open source community.

Who is this meetup for?
• Data engineers, analysts, scientists, and other practitioners
• R, Python and other software engineers who work with data or want to learn
• Data visualization developers and designers
• Non-technical team leads, executives, and other decision makers from data-centric startups and large companies looking to utilize open source tools

How can you get involved?
• Attend events, network and participate!
• Give a talk or workshop that meets our goals
• Volunteer to help the group (social media, website, blogging)
• Provide us with a venue
• Sponsor food and drinks
