Normal (Gaussian) Distribution
Maths: Statistics for machine learning
Published Oct 22 2025, updated Oct 23 2025
The Normal distribution, also called the Gaussian distribution, is a continuous probability distribution that describes data clustered around a central mean.
It is the classic “bell-shaped curve” that appears in natural phenomena, measured data, and model errors.
In simple terms:
“Most observations are close to the mean, and the probability of extreme values decreases symmetrically on both sides.”
Probability Density Function (PDF)
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
Where:
- x = value at which the density is evaluated
- μ = mean (centre of the distribution)
- σ = standard deviation (spread/width of the curve)
- e ≈ 2.718 (Euler’s number)
The total area under the curve = 1
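As a rough sketch of how the density can be evaluated, the snippet below implements the Normal PDF using only the Python standard library. The function name normal_pdf and the integration range of ±8σ are illustrative choices, not part of any library. It also cross-checks the result against statistics.NormalDist and approximates the area under the curve numerically.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of a Normal(mu, sigma) distribution at x (illustrative helper)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)


# Height of the standard normal curve (mu = 0, sigma = 1) at its centre.
peak = normal_pdf(0.0)

# Cross-check against the standard library implementation.
reference = NormalDist(mu=0.0, sigma=1.0).pdf(0.0)

# Approximate the total area with a midpoint Riemann sum over [-8, 8].
step = 0.001
area = sum(normal_pdf(-8.0 + (i + 0.5) * step) * step for i in range(16000))

print(f"peak={peak:.4f}, reference={reference:.4f}, area={area:.6f}")
```

The area comes out very close to 1 because almost none of the probability mass lies more than 8 standard deviations from the mean.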
68–95–99.7 Rule (Empirical Rule)
For a normal distribution:
- ~68% of values lie within 1σ (1 standard deviation) of the mean
- ~95% lie within 2σ (2 standard deviations) of the mean
- ~99.7% lie within 3σ (3 standard deviations) of the mean
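These percentages can be checked numerically. For any Normal distribution, the probability of falling within k standard deviations of the mean equals erf(k / √2). The sketch below applies this with the standard library; coverage_within is a hypothetical helper name.

```python
import math


def coverage_within(k: float) -> float:
    """P(|X - mu| <= k * sigma) for any Normal distribution."""
    return math.erf(k / math.sqrt(2.0))


# Coverage for 1, 2 and 3 standard deviations: roughly 68.27%, 95.45%, 99.73%.
coverage = {k: coverage_within(k) for k in (1, 2, 3)}
for k, p in coverage.items():
    print(f"within {k} sigma: {p:.2%}")
```

The result does not depend on μ or σ, which is why the rule applies to every Normal distribution.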
Examples
- Human height (mean 170 cm) - Most people are near 170 cm
- IQ scores (mean 100, standard deviation 15) - About 68% of people score between 85 and 115
- Measurement errors (mean 0) - Random noise around true value
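The IQ example can be worked through with statistics.NormalDist. This is a minimal sketch that assumes IQ scores follow a Normal distribution with mean 100 and standard deviation 15; the standard deviation is implied by the 85 to 115 range rather than stated explicitly.

```python
from statistics import NormalDist

# Assumption: IQ ~ Normal(mean=100, sd=15).
iq = NormalDist(mu=100, sigma=15)

# Share of people within 1 standard deviation of the mean (85 to 115).
share_85_to_115 = iq.cdf(115) - iq.cdf(85)

# Share of people more than 2 standard deviations above the mean.
share_above_130 = 1 - iq.cdf(130)

# Score that only about 1% of people exceed (inverse CDF at 0.99).
top_1_percent_cutoff = iq.inv_cdf(0.99)

print(f"{share_85_to_115:.2%} between 85 and 115")
print(f"{share_above_130:.2%} above 130")
print(f"top 1% cutoff: {top_1_percent_cutoff:.1f}")
```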

Figure: a bell-shaped curve centred at μ = 0, with shaded regions covering 68% (within ±1σ), 95% (within ±2σ), and 99.7% (within ±3σ) of the area. The red dashed line marks the mean, which is also the median and mode.
In Machine Learning
- Model errors / residuals - Assumed to follow a Normal distribution in linear regression
- Feature normalisation - Many ML algorithms perform better when features are standardised or roughly normally distributed
- Gaussian Naive Bayes - Uses the Normal PDF to model continuous features
- Statistical tests (Z-test, t-test) - Based on normality assumptions
- Initialisation / noise models - Random weight initialisation, dropout noise, etc.
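To illustrate the Gaussian Naive Bayes item, here is a minimal pure-Python sketch, not the implementation of any particular library. It fits one Normal distribution per class and feature, then picks the class with the highest log posterior. The function names, the dataset, and the labels "A" and "B" are all made up for illustration.

```python
import math
from collections import defaultdict
from statistics import NormalDist, fmean, stdev


def fit_gaussian_nb(rows, labels):
    """Estimate a class prior and one Normal per (class, feature)."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        prior = len(members) / len(rows)
        # zip(*members) transposes the rows into per-feature columns.
        dists = [NormalDist(fmean(col), stdev(col)) for col in zip(*members)]
        model[label] = (prior, dists)
    return model


def predict(model, row):
    """Return the class with the highest log posterior (up to a constant)."""
    def log_posterior(label):
        prior, dists = model[label]
        # Features are treated as independent, so log densities are summed.
        return math.log(prior) + sum(math.log(d.pdf(x)) for d, x in zip(dists, row))
    return max(model, key=log_posterior)


# Made-up data: (height in cm, weight in kg) with hypothetical class labels.
rows = [(150, 50), (155, 54), (160, 58), (178, 80), (183, 85), (188, 92)]
labels = ["A", "A", "A", "B", "B", "B"]

model = fit_gaussian_nb(rows, labels)
prediction_small = predict(model, (152, 52))
prediction_large = predict(model, (185, 88))
print(prediction_small, prediction_large)
```

A production implementation would also guard against densities that underflow to zero for points very far from a class mean; this sketch omits that for brevity.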
Python code
pg.normality() from the pingouin library tests whether each numerical column in a DataFrame is normally distributed.
The arguments we pass are data (the DataFrame) and alpha=0.05 (the significance level).