This event was canceled

Imbalanced data in machine learning

Hosted By
Derek S.

Details

Techies, investors, and savvy journalists will espouse the value of BIG DATA. Data is the new oil, they say. Data will lead us to building smarter products and services, they say. Data is $money$, they say.

Data practitioners, however, often hum the tune of “more data, more problems.” This is where Data Science becomes an indispensable tool for understanding and using BIG DATA. One of the challenges in Big Data is that we are often looking for something rare, anomalous, or unusual, and these rare cases tend to be the ones we most want to discover. Finding fraudulent transactions, suspicious network behavior, or malignant tumors are all examples of finding rare but important events. In this talk we will discuss several popular approaches for finding rare events in highly imbalanced data sets. We will demonstrate many of the approaches used in practice (such as sampling techniques, SMOTE, and ADASYN) with the excellent imbalanced-learn Python package, which is one of the scikit-learn contributor libraries. Finally, we will discuss ideas from deep learning for adapting SMOTE with embedded feature representations.
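
For readers new to SMOTE, here is a minimal sketch of its core idea: a synthetic minority sample is created by interpolating between an existing minority point and one of its nearest minority neighbors. This is not the imbalanced-learn implementation and is not taken from the talk; the function name `smote_sketch`, the toy data, and the choice of k=3 are all assumptions made for illustration, and only the Python standard library is used.

```python
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority neighbors.

    Hypothetical helper for illustration only; not the imbalanced-learn API.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        # Pick one of the k nearest neighbors at random.
        neighbor = rng.choice(others[:k])
        # Place the new point somewhere on the segment from base to neighbor.
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbor)))
    return synthetic


# Toy data (assumed): 4 rare-class points and 40 common-class points.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority = [(float(i % 10), float(i // 10) + 3.0) for i in range(40)]

# Oversample the rare class until both classes have the same size.
new_points = smote_sketch(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
```

In practice, the imbalanced-learn package offers this kind of oversampling through classes such as `SMOTE` and `ADASYN` in `imblearn.over_sampling`, so a hand-rolled version like this is mainly useful for building intuition.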

About the speaker:
Mehrdad Yazdani is a machine learning scientist with over 10 years of experience developing data-driven applications.

Location:
The event is located at the Qualcomm AZ conference center. Parking can be found at the Visitor Parking lot and also along Pacific Heights Blvd.

San Diego Data Science & R Users Group

Qualcomm Bldg AZ Auditorium
10155 Pacific Heights Blvd · San Diego, CA