In my previous post, I explained the importance of feature encoding and how to do it in python using scikit-learn. In this post, we are going to talk about another component of the preprocessing step in applying machine learning models which is feature scaling. Very rarely would you be dealing with features that share the same scale. What do I mean by that? For example, let’s look at the famous wine dataset which can be found here. This dataset contains several features such as alcohol content, malic acid and color intensity which describe a type of wine. Focusing on just these three features, we can see that they do not share same scale. Alcohol content is measured in alcohol/volume where as malic acid is measured in g/l.
Why is feature scaling important?
If we were to leave the features as they are and feed them to a machine learning algorithm, we may get incorrect predictions. This is because most algorithms such as SVM, K-nearest neighbors, and logistic regression expect features to be scaled. If the features are not scaled, your machine learning algorithm might assign increased weight to one feature compared to another solely based on its value.