Most machine learning algorithms cannot operate on categorical data directly and require input and output variables to be numeric. Categorical features therefore have to be converted to numeric values before being fed into ML models. Two common ways to transform categorical variables are encoding them as binary features and embedding them as vectors of real numbers.
In this talk, Yashwanth Dannamaneni, data scientist at C2FO, will give a brief summary of different transformation techniques (label encoding, one-hot encoding, feature hashing). He will then introduce embedding approaches such as cat2Vec and embedding layers, which transform discrete features into useful continuous vectors.
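
For readers who have not met the three encodings named above, here is a minimal sketch in plain Python. It is not taken from the talk: the `colors` data, the `hash_bucket` helper, and the bucket count of 8 are assumptions made for illustration, and the hashing variant shown is a simplified, unsigned one.

```python
import hashlib

# Hypothetical toy feature; not data from the talk.
colors = ["red", "green", "blue", "green"]

# Label encoding: map each distinct category to an integer id.
categories = sorted(set(colors))
label_of = {c: i for i, c in enumerate(categories)}
label_encoded = [label_of[c] for c in colors]

# One-hot encoding: one binary column per distinct category.
one_hot = [
    [1 if label_of[c] == j else 0 for j in range(len(categories))]
    for c in colors
]

# Feature hashing (simplified): hash each category into a fixed number
# of buckets, so the output width does not grow with the vocabulary.
# Distinct categories may collide in the same bucket.
N_BUCKETS = 8  # assumed width, chosen only for this example


def hash_bucket(value, n_buckets=N_BUCKETS):
    """Return a deterministic bucket index for a category string."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % n_buckets


hashed = []
for c in colors:
    row = [0] * N_BUCKETS
    row[hash_bucket(c)] = 1
    hashed.append(row)

print(label_encoded)
print(one_hot)
print(hashed)
```

Label encoding yields one integer column, one-hot encoding yields one column per category, and feature hashing fixes the width in advance at the cost of possible collisions. In practice a library implementation would normally be used instead of hand-rolled code like this.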