Normal (Gaussian) Distribution

Maths: Statistics for machine learning

Published Oct 22 2025, updated Oct 23 2025

The Normal Distribution, also called the Gaussian Distribution, is a continuous probability distribution that describes data clustering symmetrically around a central mean. It is fully specified by two parameters: the mean and the standard deviation.

It is the classic “bell-shaped curve” that approximates many natural measurements, real-world datasets, and model errors.

In simple terms:

“Most observations are close to the mean, and the probability of extreme values decreases symmetrically on both sides.”

Probability Density Function (PDF)

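$$ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}} $$

Where:

x = the value at which the density is evaluated
μ = mean (centre of the distribution)
σ = standard deviation (spread/width of the curve)
π ≈ 3.14159
e ≈ 2.718 (Euler’s number)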

The total area under the curve = 1
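
As a rough check of the density and the unit-area property, the sketch below evaluates the Normal PDF with Python's standard library and approximates the area under the curve with a midpoint sum. The function name normal_pdf, the ±8σ integration range, and the step count are illustrative choices, not part of the original article.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a Normal distribution with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# Illustrative parameters (assumed): the standard Normal distribution
mu, sigma = 0.0, 1.0

# Midpoint-rule approximation of the area over mu +/- 8 sigma;
# the tails beyond that range contribute a negligible amount.
n_steps = 20000
lower, upper = mu - 8 * sigma, mu + 8 * sigma
dx = (upper - lower) / n_steps
area = sum(normal_pdf(lower + (i + 0.5) * dx, mu, sigma) for i in range(n_steps)) * dx

# The density is highest at the mean, where it equals 1 / (sigma * sqrt(2 * pi))
peak = normal_pdf(mu, mu, sigma)

print(f"area under curve ~ {area:.6f}")
print(f"density at the mean = {peak:.4f}")
```

Note that the PDF gives a density, not a probability: probabilities come from areas under the curve between two points.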

68–95–99.7 Rule (Empirical Rule)

For a normal distribution:

~68% of values lie within 1σ (1 standard deviation) of the mean
~95% within 2σ (2 standard deviations) of the mean
~99.7% within 3σ (3 standard deviations) of the mean
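
These percentages can be reproduced from the cumulative distribution function (CDF). The sketch below uses statistics.NormalDist from the Python standard library (Python 3.8+); the helper name prob_within is an illustrative choice.

```python
from statistics import NormalDist

# Standard Normal distribution: mean 0, standard deviation 1
std_normal = NormalDist(mu=0.0, sigma=1.0)

def prob_within(k):
    """Probability that a value lies within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Coverage for 1, 2 and 3 standard deviations
coverage = {k: prob_within(k) for k in (1, 2, 3)}

for k, p in coverage.items():
    print(f"within {k} sigma: {p:.4f}")
```

The printed values are approximately 0.6827, 0.9545 and 0.9973, which round to the 68%, 95% and 99.7% of the rule. Because the result depends only on the number of standard deviations, the same percentages hold for any mean and standard deviation.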

Examples

Human height (mean 170 cm) - Most people are close to 170 cm
IQ scores (mean 100, standard deviation 15) - About 68% of people score between 85 and 115
Measurement errors (mean 0) - Random noise around true value
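
The IQ figure can be checked both exactly and by simulation. The sketch below assumes a standard deviation of 15, which is what the 85–115 range implies for a mean of 100; the sample size and seed are arbitrary illustrative values.

```python
from statistics import NormalDist

# Assumed parameters: mean 100, standard deviation 15
iq = NormalDist(mu=100.0, sigma=15.0)

# Exact probability of a score between 85 and 115 (mean +/- 1 standard deviation)
exact = iq.cdf(115.0) - iq.cdf(85.0)

# Simulated check: draw random scores and count the share that fall in the range
samples = iq.samples(100_000, seed=0)
simulated = sum(85.0 <= s <= 115.0 for s in samples) / len(samples)

print(f"exact share between 85 and 115: {exact:.4f}")
print(f"simulated share: {simulated:.4f}")
```

The simulated share will differ slightly from the exact one because of sampling noise, and the gap shrinks as the sample size grows.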

Figure: a bell-shaped curve centred at μ = 0, with shaded regions for:

68% within ±1σ
95% within ±2σ
99.7% within ±3σ

The total area under the curve = 1
The red dashed line marks the mean (also the median and mode)

In Machine Learning

Model errors / residuals - Often assumed to follow a Normal distribution in linear regression
Feature normalisation - Some ML algorithms perform better on standardised or approximately normally distributed features
Gaussian Naive Bayes - Uses the Normal PDF to model continuous features
Statistical tests (Z-test, t-test) - Based on normality assumptions
Initialisation / noise models - Random weight initialisation, Gaussian noise injection, etc.
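
To make the Gaussian Naive Bayes item concrete, here is a minimal from-scratch sketch for a single continuous feature. It is not the implementation of any particular library: the training data, the class labels, and the names log_normal_pdf and predict are hypothetical, and the sample standard deviation is used for simplicity.

```python
import math
from statistics import mean, stdev

def log_normal_pdf(x, mu, sigma):
    """Logarithm of the Normal density; working in log space avoids underflow."""
    return -math.log(sigma * math.sqrt(2.0 * math.pi)) - ((x - mu) ** 2) / (2.0 * sigma ** 2)

# Hypothetical toy training data: heights in cm, grouped by class label
train = {
    "short": [150.0, 155.0, 160.0, 158.0, 152.0],
    "tall": [175.0, 180.0, 185.0, 178.0, 182.0],
}

# "Training": estimate a prior, mean and standard deviation for each class
n_total = sum(len(values) for values in train.values())
model = {
    label: (len(values) / n_total, mean(values), stdev(values))
    for label, values in train.items()
}

def predict(x):
    """Return the class with the highest log prior plus log likelihood."""
    scores = {
        label: math.log(prior) + log_normal_pdf(x, mu, sigma)
        for label, (prior, mu, sigma) in model.items()
    }
    return max(scores, key=scores.get)

print(predict(156.0))
print(predict(179.0))
```

With several features, the "naive" independence assumption means the per-feature log likelihoods are simply added together for each class.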