Feature engineering art and science

Hosted by Dave Ingram
Details

Feature engineering is the process of extracting and creating characteristics (features) from your data. It is often described as one of the most important parts of the Machine Learning (ML) process, because the effectiveness of your model depends on it. Specifically, a classification model relies on robust feature engineering to find the boundaries that separate the data into different groups. A forecasting model needs good features to predict the future. A language model is highly dependent on a good way of translating words, sentences, and contexts into features.

In this talk, I analyze the art and science of creating good characteristics to tell the story of your data and derive robust models. I split the features into numerical and categorical and show several examples of how to encode, convert, and summarize them to derive new features. I explain embeddings and how they contribute to the “magic” of language models such as ChatGPT and Llama. This talk revolves around practical examples from public datasets in the area of cybersecurity. I present and share a notebook so that attendees can actively participate in the process of creating features. Finally, I discuss the art of creating features based on expertise in a specific field.
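
For a flavor of what encoding, converting, and summarizing can look like, here is a minimal sketch in plain Python. It is not taken from the talk's notebook: the records, the field names (`src_ip`, `protocol`, `bytes_sent`), and the helper functions are all made up for illustration. It one-hot encodes a categorical field, log-scales a numerical one, and adds a per-source count as a summary feature.

```python
import math
from collections import Counter

# Toy, made-up connection records (hypothetical; not from any real dataset).
records = [
    {"src_ip": "10.0.0.1", "protocol": "tcp", "bytes_sent": 1200},
    {"src_ip": "10.0.0.1", "protocol": "udp", "bytes_sent": 80},
    {"src_ip": "10.0.0.2", "protocol": "tcp", "bytes_sent": 560000},
]


def one_hot(value, categories):
    """Encode a categorical value as a 0/1 vector over a fixed category list."""
    return [1 if value == c else 0 for c in categories]


def build_features(rows):
    """Return (categories, feature rows) for the toy records."""
    # Categorical: fix a sorted category order so every row uses the same columns.
    categories = sorted({r["protocol"] for r in rows})
    # Summary: how many records each source IP produced.
    per_source = Counter(r["src_ip"] for r in rows)
    features = []
    for r in rows:
        features.append(
            one_hot(r["protocol"], categories)   # encode
            + [math.log1p(r["bytes_sent"])]      # convert: compress a heavy-tailed value
            + [per_source[r["src_ip"]]]          # summarize
        )
    return categories, features


categories, features = build_features(records)
for row in features:
    print(row)
```

Each output row has one column per protocol, followed by the log-scaled byte count and the number of records seen from that source.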

The goal of this presentation is to demonstrate what makes ML models “tick” and perform tasks such as predicting the future, discovering bad actors, or generating interesting answers to your questions. Moreover, we delve into “featuristic” trends in the area of feature engineering to inspire creativity and innovation within the community of data enthusiasts.

Join the Charleston Data Science community at our monthly meetup to discuss all things data. We rotate between introductory topics and deeper dives, and like to discuss anything from visualization to specific domains to neural networks and LLMs.

We meet monthly on the second Thursday at the Charleston Learning Center. This is the default message for our monthly meetups, so if you're reading this and have some data science knowledge to share (or want to give yourself the excuse to dive into a new topic so you can teach the group), please reach out.

Agenda:

  • 5:30 - Snacks, drinks, conversation, and socializing
  • 6:00 - Main talk(s)
  • 7:00 - 7:30 - Wrap up and hang out some more
  • 7:30 - ? - After Party*

======================================

"Call for Papers"! .. if you have some knowledge to share with the community, please talk to Dave at the next meetup. We'd love to have 1 or 2 talks per meetup lined up for the coming months.

*After Party? .. The meetup is right next to Revelry Brewing. If folks are interested, we can clean up at the CLC and keep the conversation going over a nice local brew.

Charleston Data Science
CDC Learning Center
4 Conroy St Ste A · Charleston, SC