Normality Tests
SciPy - Statistical Testing
Published Nov 17 2025
Normality tests help you answer a simple but crucial question:
“Does this data look like it came from a normal distribution?”
Why this matters:
- Parametric tests (t-tests, ANOVA) assume approximately normal data or residuals
- Non-parametric tests don’t
- Many statistical decisions begin here
SciPy provides several ways to test normality, each with its strengths.
Practical Interpretation of Normality Tests
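Most normality tests return:
- test statistic
- p-value
(The Anderson–Darling test in SciPy reports critical values instead of a p-value.)
In every case the null hypothesis is that the data come from a normal distribution, so p-values are read the same way: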
If p ≥ 0.05 (the conventional threshold)
- Data is consistent with a normal distribution (this is not proof of normality)
- You can reasonably use parametric tests
If p < 0.05
- There is evidence that the data is NOT normally distributed
- Consider non-parametric tests (Chapter 3)
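The decision rule above can be written down as a small function. This is only a sketch: `interpret_p_value` is a hypothetical helper name, and the default threshold of 0.05 is the conventional choice used in this guide, not a requirement of SciPy.

```python
def interpret_p_value(p_value: float, alpha: float = 0.05) -> str:
    """Hypothetical helper: turn a normality-test p-value into a cautious conclusion."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if p_value < alpha:
        # Evidence against the null hypothesis of normality
        return "reject normality: consider non-parametric tests"
    # Failing to reject is not proof that the data is normal
    return "no evidence against normality: parametric tests are reasonable"


print(interpret_p_value(0.32))
print(interpret_p_value(0.003))
```

Any p-value returned by the tests in this guide can be passed to it, ideally alongside a plot of the data.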
Important: Normality tests are very sensitive to sample size:
- Large samples → even tiny, practically irrelevant deviations become “significant”
- Small samples → tests often have low power, so real departures from normality can go undetected
Always pair normality tests with a plot (histogram, Q-Q plot).
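The sample-size effect described above can be seen with simulated data. The sketch below assumes a gamma distribution with shape 100 as a stand-in for "almost normal" data (its skewness is only 0.2); the distribution, seed, and sample sizes are illustrative choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Gamma with a large shape parameter is nearly symmetric:
# skewness = 2 / sqrt(shape) = 0.2
small = rng.gamma(shape=100.0, scale=1.0, size=50)
large = rng.gamma(shape=100.0, scale=1.0, size=20_000)

# Same underlying distribution, very different sample sizes
p_small = stats.normaltest(small).pvalue
p_large = stats.normaltest(large).pvalue

print(f"n = {small.size:>6}: p = {p_small:.4g}")
print(f"n = {large.size:>6}: p = {p_large:.4g}")
```

With the large sample, the mild skew is enough to push the p-value far below 0.05, even though a histogram of the same data would look close to bell-shaped.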
Shapiro–Wilk Test (Recommended)
The go-to normality test for small to medium samples.
Works best for:
- Sample sizes up to ~5000 (above that, SciPy's p-value may not be accurate)
- General-purpose normality checking
- Pre-checking assumptions for t-tests or ANOVA
Example
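A minimal sketch using `scipy.stats.shapiro`. The two samples are simulated here purely for illustration (one normal, one clearly skewed); in practice you would pass your own data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
samples = {
    "normal": rng.normal(loc=50, scale=5, size=200),
    "skewed": rng.exponential(scale=5, size=200),
}

# shapiro returns the W statistic (close to 1 for normal data) and a p-value
results = {name: stats.shapiro(data) for name, data in samples.items()}

for name, res in results.items():
    print(f"{name}: W = {res.statistic:.4f}, p = {res.pvalue:.4g}")
```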
Interpretation
p < 0.05 → NOT normal
p ≥ 0.05 → normal enough
Advantages
- Among the most powerful of the common normality tests
- Works well for small samples
Disadvantages
- Too sensitive for huge datasets: trivial deviations get flagged as significant
Kolmogorov–Smirnov (K–S Test)
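Compares the sample's empirical distribution to a fully specified reference distribution.
For normality, you must supply:
- mean
- standard deviation
These should be fixed in advance. If they are estimated from the same sample, the standard K–S p-value becomes too conservative and the test loses power.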
Example
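A minimal sketch using `scipy.stats.kstest` with the `"norm"` distribution. The hypothesised means and standard deviations below are assumed to be known in advance; the simulated samples are for illustration only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
normal_sample = rng.normal(loc=50, scale=5, size=200)
skewed_sample = rng.exponential(scale=5, size=200)

# args=(mean, sd) fully specifies the reference normal distribution
res_normal = stats.kstest(normal_sample, "norm", args=(50, 5))

# An exponential with scale 5 has mean 5 and sd 5, so the reference normal
# matches its location and spread but not its shape
res_skewed = stats.kstest(skewed_sample, "norm", args=(5, 5))

print(f"normal sample: D = {res_normal.statistic:.4f}, p = {res_normal.pvalue:.4g}")
print(f"skewed sample: D = {res_skewed.statistic:.4f}, p = {res_skewed.pvalue:.4g}")
```

The statistic D is the largest vertical gap between the empirical CDF and the reference CDF, so it always lies between 0 and 1.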
Interpretation
Same general rule:
p < 0.05 → not normal
p ≥ 0.05 → consistent with normality
Notes
- Less sensitive than Shapiro–Wilk, especially to deviations in the tails
- Not recommended for small samples
- Good for checking against any distribution, not just normal
Anderson–Darling Test
In SciPy (`scipy.stats.anderson`), this test reports the statistic together with critical values at several fixed significance levels, not a simple p-value.
Example
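A minimal sketch using `scipy.stats.anderson`. It assumes the classic return shape (a statistic plus `critical_values` and `significance_level` arrays); check the documentation for your installed SciPy version, since the interface may differ between releases. The sample is simulated for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
sample = rng.normal(loc=50, scale=5, size=300)

result = stats.anderson(sample, dist="norm")
print(f"A^2 statistic = {result.statistic:.4f}")

# Compare the statistic with the critical value at each significance level (in %)
decisions = []
for critical, level in zip(result.critical_values, result.significance_level):
    reject = result.statistic > critical
    decisions.append(bool(reject))
    verdict = "reject normality" if reject else "fail to reject"
    print(f"{level:>4.1f}% level: critical value = {critical:.3f} -> {verdict}")
```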
Interpretation
If the test statistic is:
> critical value → reject normality
<= critical value → fail to reject
Advantages
- More sensitive in the tails
- Good for moderate to large samples
Disadvantages
- Slightly more complicated interpretation
- No simple p-value
D’Agostino and Pearson’s Test (K2 Test)
Combines a skewness test and a kurtosis test into a single statistic to test normality (`scipy.stats.normaltest` in SciPy).
Example
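A minimal sketch using `scipy.stats.normaltest`, with `skewtest` and `kurtosistest` shown next to it to make the "combination" visible: the K2 statistic is the sum of the squared z-scores from those two tests. The samples are simulated for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
normal_sample = rng.normal(loc=50, scale=5, size=200)
skewed_sample = rng.exponential(scale=5, size=200)

k2, p_value = stats.normaltest(normal_sample)

# The two component tests each return a z-score and a p-value
z_skew, _ = stats.skewtest(normal_sample)
z_kurt, _ = stats.kurtosistest(normal_sample)

print(f"K2 = {k2:.4f}, p = {p_value:.4g}")
print(f"z_skew^2 + z_kurt^2 = {z_skew**2 + z_kurt**2:.4f}")

k2_skewed, p_skewed = stats.normaltest(skewed_sample)
print(f"skewed sample: K2 = {k2_skewed:.4f}, p = {p_skewed:.4g}")
```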
Use when:
- Sample size ≥ 20
- You want a test sensitive to deviations in skewness and kurtosis
Avoid when:
- Very small samples (< 20)
Visual Normality Checks (Highly Recommended)
Don’t rely solely on p-values; always look at the distribution.
Histogram
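One way to check the shape is to bin the data as densities and overlay a normal curve fitted to the sample mean and standard deviation. The sketch below is an assumption about how such a plot could be built: `plot_histogram` is a hypothetical helper, matplotlib is assumed to be installed for the drawing step, and the sample is simulated.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(loc=50, scale=5, size=500)

# density=True scales the bars so their total area is 1, like a PDF
density, edges = np.histogram(sample, bins=30, density=True)
centers = (edges[:-1] + edges[1:]) / 2
fitted_pdf = stats.norm.pdf(centers, loc=sample.mean(), scale=sample.std(ddof=1))


def plot_histogram(path="histogram.png"):
    """Hypothetical helper: draw the bars and the fitted normal curve."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing

    fig, ax = plt.subplots()
    ax.bar(centers, density, width=np.diff(edges), alpha=0.6, label="data")
    ax.plot(centers, fitted_pdf, color="black", label="fitted normal")
    ax.set_xlabel("value")
    ax.set_ylabel("density")
    ax.legend()
    fig.savefig(path)
    return fig
```

Calling `plot_histogram()` writes the figure to disk. Roughly bell-shaped bars that follow the curve suggest approximate normality; a long tail on one side or several peaks suggest otherwise.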
Q-Q Plot
- If the points fall close to a straight line → data is approximately normal
- If the points bend or curve away from the line → non-normal (skew or heavy tails)
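A Q-Q plot can be produced with `scipy.stats.probplot`, which also returns the plotted points and a least-squares fit. In this sketch `plot_qq` is a hypothetical helper, matplotlib is assumed to be installed for the drawing step, and the sample is simulated.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
sample = rng.normal(loc=50, scale=5, size=300)

# Theoretical normal quantiles vs. the sorted sample, plus a fitted line;
# r is the correlation between the two sets of quantiles
(theoretical, ordered), (slope, intercept, r) = stats.probplot(sample, dist="norm")
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, r = {r:.4f}")


def plot_qq(path="qq_plot.png"):
    """Hypothetical helper: draw the Q-Q plot with matplotlib."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing

    fig, ax = plt.subplots()
    stats.probplot(sample, dist="norm", plot=ax)
    fig.savefig(path)
    return fig
```

A value of `r` close to 1 corresponds to points lying near the straight line, but the plot itself shows where any deviations occur (for example, in the tails).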
Choosing the Right Normality Test
Test | Best For | Avoid When | Notes |
Shapiro–Wilk | Small–medium samples (<5000) | Very large samples | Most widely used |
K–S Test | Comparing to any distribution | Small samples | Requires mean & sd specified in advance |
Anderson–Darling | Detecting deviations in tails | Need simple p-value | Very sensitive |
D’Agostino K2 | Sample ≥ 20, skew/kurtosis detection | Small samples | Good for moderate sizes |
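The rules of thumb in the table can be combined into one function that runs only the tests suited to the sample size. This is a sketch under the guide's own thresholds: `run_normality_tests` is a hypothetical helper, and the Anderson–Darling part assumes the classic `critical_values` / `significance_level` return shape of `scipy.stats.anderson`.

```python
import numpy as np
from scipy import stats


def run_normality_tests(data):
    """Hypothetical helper: apply the table's sample-size guidance."""
    data = np.asarray(data, dtype=float)
    n = data.size
    results = {}

    if n <= 5000:  # Shapiro-Wilk: small to medium samples
        results["shapiro_p"] = float(stats.shapiro(data).pvalue)
    if n >= 20:  # D'Agostino K2: needs at least ~20 observations
        results["dagostino_k2_p"] = float(stats.normaltest(data).pvalue)

    # Anderson-Darling: compare the statistic with the 5% critical value
    ad = stats.anderson(data, dist="norm")
    at_5pct = np.isclose(ad.significance_level, 5.0)
    critical_5pct = ad.critical_values[at_5pct][0]
    results["anderson_rejects_at_5pct"] = bool(ad.statistic > critical_5pct)
    return results


rng = np.random.default_rng(9)
summary = run_normality_tests(rng.normal(loc=0, scale=1, size=100))
for name, value in summary.items():
    print(f"{name}: {value}")
```

The K–S test is left out because it needs a mean and standard deviation specified in advance, which a generic helper cannot know.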














