Standardization vs Normalization

Swagata Ashwani
May 21, 2022

Feature scaling is an important step in the data preparation process. Before training, features should be brought to a comparable scale so that no single feature dominates the learning algorithm. This matters especially for distance-based algorithms such as k-nearest neighbors or k-means, where a feature with a large range can dominate the distance computation.

There are two common methods for feature scaling:

Standardization

Also known as z-score standardization, this method rescales a dataset so that it has a mean of 0 and a standard deviation of 1.

x_standard = (xₙ − x̄) / s

  • xₙ: The nth value in the dataset
  • x̄: The sample mean
  • s: The sample standard deviation

Standardization is less affected by outliers than min-max scaling because the transformed values are not squeezed into a predefined range. Extreme values do still influence the mean and standard deviation, though, so it is not completely immune to them.
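
As a concrete illustration, here is a minimal sketch in Python that applies the formula above, first by hand with NumPy and then with scikit-learn's StandardScaler. The array x is a made-up toy feature column, not data from any real dataset.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy feature column (made-up values for illustration)
x = np.array([[10.0], [20.0], [30.0], [40.0], [50.0]])

# Manual z-score: subtract the mean, divide by the standard deviation
x_standard = (x - x.mean()) / x.std()

# The same transform with scikit-learn. StandardScaler also divides by the
# population standard deviation (ddof=0), so the two results match.
x_sklearn = StandardScaler().fit_transform(x)

print(x_standard.ravel())  # transformed values have mean 0 and std 1
print(x_sklearn.ravel())
```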

Normalization

Also known as min-max scaling, this method rescales values into the range [0, 1]. This is useful when all features need to share the same positive scale. However, it is sensitive to outliers: a single extreme value sets the min or max and compresses the remaining values into a narrow band.

X_normal = (X − X_min) / (X_max − X_min)

  • X_min: The minimum value in the dataset
  • X_max: The maximum value in the dataset
  • X: The current value in the dataset
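
Here is the matching sketch for min-max scaling, again done by hand and with scikit-learn's MinMaxScaler, using the same made-up toy column as before.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Same toy feature column as above
x = np.array([[10.0], [20.0], [30.0], [40.0], [50.0]])

# Manual min-max scaling into [0, 1]
x_normal = (x - x.min()) / (x.max() - x.min())

# The same transform with scikit-learn (default feature_range=(0, 1))
x_sklearn = MinMaxScaler().fit_transform(x)

print(x_normal.ravel())   # [0.   0.25 0.5  0.75 1.  ]
print(x_sklearn.ravel())
```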
