Friday, January 25, 2019


Standard Deviation

The term "standard deviation" was first introduced by Karl Pearson in 1893.

Let’s first understand the significance of standard deviation:

When we look at the data for a population, often the first thing we do is look at the mean. But even if we know that the distribution is perfectly normal, the mean alone isn't enough to understand the population. We also need to know how the data is spread out around the mean, that is, how wide the bell curve is around the mean. This is where the basic measure of spread, the standard deviation, comes in.

Standard deviation is a widely used measure of variability, also called dispersion. It shows how much variation or "dispersion" exists around the mean or expected value.

A low standard deviation means that most of the numbers are close to the mean. A high standard deviation means that the numbers are spread out over a wider range.

In other words, a smaller standard deviation means the variation in the data is small, and a larger standard deviation means the variation in the data is large.
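To make the idea concrete, here is a small sketch in Python. The two data sets are made up for illustration (they are not from any real study), and the sketch uses `statistics.pstdev` from the standard library, which computes the population standard deviation. Both sets have the same mean, but very different spread.

```python
import statistics

# Two hypothetical data sets, chosen so that both have a mean of 50.
tight = [48, 49, 50, 51, 52]   # values sit close to the mean
wide = [10, 30, 50, 70, 90]    # values are spread far from the mean

mean_tight = statistics.mean(tight)
mean_wide = statistics.mean(wide)

# Population standard deviation of each data set.
sd_tight = statistics.pstdev(tight)
sd_wide = statistics.pstdev(wide)

print("means:", mean_tight, mean_wide)
print("standard deviations:", round(sd_tight, 2), round(sd_wide, 2))
```

The means are identical, so the mean alone cannot tell the two sets apart. The standard deviation can: it is much larger for the second set.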







Fig. Standard Deviation




A step-by-step method for calculating the standard deviation:

(1) Find the mean of the data set.
(2) Subtract this mean from each data point to find its deviation from the mean. A deviation can be positive or negative.
(3) Square each deviation to get the squared deviations. Squared deviations are never negative.
(4) Find the mean of the squared deviations. This is called the variance. (Dividing by the number of data points gives the population variance; for a sample, the sum is usually divided by one less than the number of data points.)
(5) Take the square root of the variance. That's the standard deviation.
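The five steps above can be sketched in Python as follows. This is a minimal illustration, not a library-grade implementation: the function name `population_std` and the sample numbers are made up for this example, and the function divides by the number of data points (the population form of the variance).

```python
import math

def population_std(data):
    """Standard deviation computed step by step (population form)."""
    n = len(data)
    # (1) Mean of the data set.
    mean = sum(data) / n
    # (2) Deviation of each data point from the mean.
    deviations = [x - mean for x in data]
    # (3) Squared deviations.
    squared = [d ** 2 for d in deviations]
    # (4) Variance: the mean of the squared deviations.
    variance = sum(squared) / n
    # (5) Standard deviation: the square root of the variance.
    return math.sqrt(variance)

# Hypothetical data set used only for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
result = population_std(scores)
print(result)
```

For this data set the mean is 5, the squared deviations add up to 32, the variance is 32 / 8 = 4, and the standard deviation is 2.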


Some examples in which standard deviation helps to understand the data:

1. A class of students took a math test. Their teacher found that the mean score on the test was 85%. The teacher then calculated the standard deviation of the test scores and found it to be very small, which suggested that most students scored very close to 85%.

2. A market researcher is analyzing the results of a recent customer survey. He wants some measure of the reliability of the answers received in the survey in order to predict how a larger group of people might answer the same questions. A low standard deviation shows that the respondents answered consistently, which suggests that the results are more likely to carry over to a larger group of people.

3. An employer wants to determine whether the salaries in one department seem fair for all employees, or whether there is a great disparity. He finds the average of the salaries in that department, then calculates the variance, and then the standard deviation. The standard deviation is slightly higher than he expected, so he examines the data further and finds that while most employees fall within a similar pay bracket, three loyal employees who have been in the department for 20 years or more, far longer than the others, are making far more due to their longevity with the company. Doing the analysis helped the employer to understand the spread of salaries in the department.
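The salary example can be illustrated with a short Python sketch. All salary figures below are hypothetical (in thousands, invented for this illustration), and `statistics.pstdev` from the standard library is used for the population standard deviation. It shows how a few much higher values raise the standard deviation of the whole group.

```python
import statistics

# Hypothetical salaries in thousands; not real data.
typical = [50, 52, 51, 49, 53, 50, 48]   # most employees, similar pay bracket
long_tenure = [95, 98, 102]              # three long-serving employees

# Spread of salaries without and with the long-serving employees.
sd_without = statistics.pstdev(typical)
sd_with = statistics.pstdev(typical + long_tenure)

print("without long-tenure salaries:", round(sd_without, 2))
print("with long-tenure salaries:", round(sd_with, 2))
```

A standard deviation that is higher than expected is a hint to look at the individual values, as the employer did.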


Let’s understand how to standardize data:

Data is standardized by subtracting the mean from each value and then dividing by the standard deviation. This ensures that every variable has a mean of zero and a standard deviation (and variance) of 1. Standardization is important because it lets us compare variables measured in different units on a similar scale.
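Here is a minimal sketch of standardization in Python. The function name `standardize` and the height values are made up for this example, and the sketch assumes the population standard deviation (`statistics.pstdev`) is the one used for scaling.

```python
import statistics

def standardize(values):
    """Return the values rescaled to mean 0 and standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        # All values are identical, so there is no spread to scale by.
        raise ValueError("cannot standardize constant data")
    return [(x - mean) / sd for x in values]

# Hypothetical heights in centimetres, used only for illustration.
heights_cm = [150, 160, 170, 180, 190]
z = standardize(heights_cm)

print([round(v, 2) for v in z])
print("mean:", round(statistics.mean(z), 2))
print("standard deviation:", round(statistics.pstdev(z), 2))
```

Each standardized value says how many standard deviations the original value lies above or below the mean, regardless of the original unit.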


I hope you enjoyed this post. This tutorial gives an overall idea of standard deviation and also shows how to standardize data. Good luck!