Imbalanced data in machine learning


Details
Techies, investors, and savvy journalists will espouse the value of BIG DATA. Data is the new oil, they say. Data will lead us to building smarter products and services, they say. Data is $money$, they say.
Data practitioners, however, often hum the tune of “more data, more problems.” This is where data science becomes an indispensable tool for understanding and using BIG DATA. One of the challenges in Big Data is that we often seek something rare, anomalous, or unusual, and these rare cases are often the ones we most want to discover. Finding fraudulent transactions, suspicious network behavior, or malignant tumors are all examples of finding rare but important events. In this talk we will discuss several popular approaches for finding rare events in highly imbalanced data sets. We will demonstrate many of the approaches used in practice (such as sampling techniques, SMOTE, and ADASYN) with the excellent imbalanced-learn Python package, which is part of the scikit-learn-contrib collection of projects. Finally, we will discuss ideas from deep learning for adapting SMOTE with embedded feature representations.
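For a taste of what SMOTE does, here is a minimal sketch of its core idea: create synthetic minority-class points by interpolating between a minority sample and one of its nearest minority neighbors. It uses only the Python standard library and is not the imbalanced-learn implementation; the function name `smote_sketch`, its parameters, and the toy data are all hypothetical, chosen for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Illustrative only: a hypothetical helper, not the imbalanced-learn API.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbors of the base point (excluding itself).
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        # Pick a random position on the segment from base to neighbor.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic


# Toy minority class with two features (made-up values).
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3), (1.1, 1.1)]
new_points = smote_sketch(minority, n_new=6)
```

Because every synthetic point lies on a line segment between two real minority samples, the new points stay inside the region the minority class already occupies. In practice the imbalanced-learn package handles neighbor search, class selection, and resampling ratios for you.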
About the speaker:
Mehrdad Yazdani is a machine learning scientist with over 10 years of experience developing data-driven applications.
Location:
The event will be held at the Qualcomm AZ conference center. Parking is available in the Visitor Parking lot and along Pacific Heights Blvd.

Canceled