Data Science Study Session @Dstillery

This is a past event

120 people went

Details

You are invited to join us for a data science study session. On Monday, February 1st, Susan Sun, data scientist at Thomson Reuters, will present "Feature Engineering: An In-Depth Tutorial".

Summary:

Feature engineering is the process of transforming raw data into features (input variables) for machine learning algorithms. Good feature engineering is widely recognized as having a direct influence on whether a model is successful and predictive. Models built on well-engineered features tend to be simpler, more robust to changes in the data, and more accurate (as seen, for example, in Kaggle competitions).

However, as Andrew Ng said:

"Coming up with features is difficult, time-consuming, requires expert knowledge. 'Applied machine learning' is basically feature engineering."

In this discussion, we will go through basic, well-defined procedures that have proven effective for feature engineering across disciplines. We will cover three basic steps: brainstorming features, deciding which features to create, and evaluating feature importance.
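As a rough illustration of those three steps (not the speaker's own material), the sketch below derives a few candidate features from raw records and ranks them by absolute Pearson correlation with the target. The records, the feature names, and the helpers `make_features` and `rank_features` are all hypothetical, and correlation is only one simple proxy for feature importance.

```python
from datetime import datetime
from math import sqrt

# Hypothetical raw records: (timestamp, unit_price, quantity, target)
RAW = [
    ("2016-01-02 14:30", 10.0, 3, 1),
    ("2016-01-03 11:00", 25.0, 2, 1),
    ("2016-01-04 09:15", 20.0, 1, 0),
    ("2016-01-05 18:45", 5.0, 4, 0),
    ("2016-01-09 16:20", 12.0, 5, 1),
    ("2016-01-11 08:05", 30.0, 1, 1),
]


def make_features(timestamp, unit_price, quantity):
    """Turn one raw record into a dict of candidate features."""
    t = datetime.strptime(timestamp, "%Y-%m-%d %H:%M")
    return {
        "hour": t.hour,                         # time of day
        "is_weekend": int(t.weekday() >= 5),    # Saturday=5, Sunday=6
        "total_spend": unit_price * quantity,   # interaction of two raw columns
    }


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def rank_features(rows):
    """Score each candidate feature by |correlation| with the target."""
    feats = [make_features(ts, price, qty) for ts, price, qty, _ in rows]
    target = [row[3] for row in rows]
    scores = {
        name: abs(pearson([f[name] for f in feats], target))
        for name in feats[0]
    }
    # Highest-scoring features first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


ranking = rank_features(RAW)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

On real data, a model-based measure such as permutation importance on a held-out set would usually be more informative than a single correlation score, since it can capture non-linear effects and interactions.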

Because feature engineering is as much an art as a science and varies across disciplines, we will, time permitting, lead roundtable discussions in which audience members can share their own tips on feature engineering.

Biography:

Susan is a Data Scientist at Thomson Reuters and a Data Science Instructor at General Assembly. Her academic background blends undergraduate-level business studies at the University of Pennsylvania with graduate-level statistics at Columbia University. She has worked in a variety of fields, from consulting to finance and from corporate to non-profit. She is currently active in mentoring and teaching data science, and she uses her analytics skills for pro bono consulting at DataKind. On the weekends, she can be found hacking away on her laptop in an East Village tea shop.