Why is the Normal Distribution Normal?

Damini Vadrevu
Nov 21, 2024


The Gaussian distribution definitely wasn’t the first distribution to be discovered, so that reason can be crossed out—it wouldn’t even be a good reason to begin with. In order to answer the question in the title, it’s imperative to know a little bit of history first. I’ll try my best not to make it sound as boring as all my history classes were.

The very first distribution to be discovered was the binomial distribution. It came into existence in the 17th century thanks to Jacob Bernoulli, or so he’s credited. And what was life in the 17th century like? A lot of gambling. Abraham de Moivre was a mathematician of that era, and much of his work in probability was inspired by gambling—so much so that he wrote a book called “The Doctrine of Chances,” which served, in part, as a guide for gamblers.

[Portrait: this dude is Abraham de Moivre]

Why the Binomial Distribution Was Found First

Because the normal distribution emerged from it. That’s true, but it’s not the reason the binomial came first.

The binomial was obviously just the beginning, since it was literally the first distribution, but it laid the groundwork for something even bigger — the discovery of the normal distribution. The binomial arose naturally because early probability problems often involved discrete events with two possible outcomes, such as flipping a coin, drawing cards, or rolling dice. These scenarios could be broken down into trials with binary outcomes (success/failure, heads/tails), making the binomial distribution a suitable model.

De Moivre’s Discovery

De Moivre observed that for a small number of trials, the binomial distribution looks discrete and jagged. For example, suppose we flip a coin 4 times. Then the number of trials is n = 4, and the probabilities for the different numbers of “heads” or “tails” are spread out like the image below:

But as the number of trials (coin flips) increases, the binomial distribution gets more complex because, well, there are just more possible outcomes to deal with. Think of it like flipping a coin over and over — when you start flipping hundreds or even thousands of times, the distribution of “successes” (like getting heads) starts to look, well, surprisingly orderly. It begins to take on a smooth, symmetrical shape, almost like a neat little bell curve. (Aha!)
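You can compute the four-flip example directly. Here’s a quick sketch in Python (`binomial_pmf` is my own helper name, not a standard function):

```python
from math import comb

def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Four fair coin flips: the distribution of "number of heads".
n = 4
for k in range(n + 1):
    print(f"P({k} heads) = {binomial_pmf(k, n):.4f}")
# → 0.0625, 0.2500, 0.3750, 0.2500, 0.0625
```

Five discrete spikes, peaked at 2 heads—the jagged shape De Moivre started from.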

As De Moivre played around with increasing trials, he noticed the familiar jagged lines of the binomial started to smooth out as the number of trials grew, transforming the distribution into a gentle curve resembling a normal distribution. So he figured that when you have a large number of trials, you can approximate the whole thing with a continuous curve—what we now call the normal distribution, or bell curve.

De Moivre realized that this wasn’t just a fluke.

This approximation made things easier—it allowed him to simplify complex binomial calculations by using a continuous, bell-shaped curve instead of individual binomial probabilities. This was a groundbreaking observation and one of the earliest documented cases of what we now understand as the central limit theorem. And well, obviously, there’s now a formula to this amazing find-out-ery; it’s called the normal approximation to the binomial: for large n, a Binomial(n, p) distribution is approximately Normal with mean np and variance np(1 − p).

[Figure: a binomial distribution with a large number of trials approximates a normal distribution]
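You can check De Moivre’s approximation numerically. A minimal sketch, assuming a fair coin and n = 1000 flips (`binomial_pmf` and `normal_pdf` are my own helper names):

```python
from math import comb, exp, pi, sqrt

def binomial_pmf(k, n, p=0.5):
    """Exact probability of k heads in n flips."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def normal_pdf(x, mu, sigma):
    """Density of the Normal(mu, sigma^2) curve at x."""
    return exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

# De Moivre's approximation: Binomial(n, p) ~ Normal(np, np(1-p)).
n, p = 1000, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))

for k in (470, 500, 530):
    exact = binomial_pmf(k, n, p)
    approx = normal_pdf(k, mu, sigma)
    print(f"k={k}: binomial={exact:.5f}, normal={approx:.5f}")
```

The two columns agree to several decimal places: the exact (and expensive) sum of huge binomial coefficients is replaced by a one-line density evaluation, which is exactly the shortcut De Moivre was after.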

Gaussian is literally everywhere

So why did we go through all that history from 1733? Because understanding the roots of the normal distribution—how it grew out of De Moivre’s discovery of the binomial approximation—helps explain why the normal distribution shows up everywhere. The binomial was only one of many cases where the normal distribution appeared. It seemed to pop up in so many other scenarios with other distributions in the same way. It almost felt intentional, like the universe had handpicked this bell-shaped curve as its favorite pattern. That’s what it was—a pattern. No one set out to call it “normal” in the sense of ordinary; instead, it was given that name because it just kept appearing in so many different areas.

So it’s popular, but is it important?

  1. It’s important because it’s popular. It matters in the real world—it's out there in the wild, shaping how we understand the world. Human heights are approximately normally distributed, which is why most people are near average height, with fewer being extremely tall or short. Many statistical methods are built on the assumption that data follows a normal distribution.
  2. It’s popular but it’s also easy. The normal distribution is incredibly well-behaved. It’s symmetrical, centered around the mean, and its entire shape is governed by just two parameters: the mean (where the curve peaks) and the standard deviation (how spread out it is). This simplicity makes it a dream to work with. Statisticians love it because it’s predictable, and it makes calculations for probabilities, confidence intervals, and hypothesis testing much easier.
  3. Even if it’s not normal, it’s still normal. Not everything in the world follows a normal distribution. Some things are skewed — like income, where most people cluster around the lower range, but a few earn astronomically high amounts. Others have heavy tails, like extreme weather events that are rare but impactful. These things aren’t “normal” in the classic bell curve sense — and that’s perfectly fine. Even when the data itself isn’t normal, it often still becomes normal. Thanks to the central limit theorem, when you sample enough data or take averages, those averages tend to follow the normal distribution, no matter what the original data looked like. This mathematical safety net is what makes the normal distribution so astonishingly universal.
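That third point is easy to see for yourself. A small sketch, assuming exponentially distributed data (heavily skewed, population mean 1.0) as the “not normal” source; the central limit theorem predicts that means of samples of size n cluster around 1.0 with spread 1/√n:

```python
import random
from statistics import mean, stdev

random.seed(42)  # fixed seed so the run is reproducible

def skewed_sample(n):
    """n draws from an exponential distribution (mean 1.0) -- very skewed."""
    return [random.expovariate(1.0) for _ in range(n)]

# Take many sample means; the CLT says these are approximately normal
# around the population mean (1.0) with standard deviation 1/sqrt(n).
n, trials = 100, 2000
sample_means = [mean(skewed_sample(n)) for _ in range(trials)]

print(f"mean of the means: {mean(sample_means):.3f}  (population mean: 1.0)")
print(f"std of the means:  {stdev(sample_means):.3f}  (CLT predicts ~{1 / n**0.5:.3f})")
```

The raw draws are wildly skewed, yet the 2,000 sample means land in a tight, symmetric cluster around 1.0 with spread close to 0.1—the “safety net” in action.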

So yeah, the normal distribution is totally normal.



Written by Damini Vadrevu

Humans are complex, and so is our data. I make data science easy to understand here. Welcome!
