Shapiro–Wilk Test

Maths: Statistics for machine learning


Published Oct 22 2025, updated Oct 23 2025



The Shapiro–Wilk test is a statistical test for normality: it checks whether a given dataset is plausibly drawn from a normal (Gaussian) distribution.

It’s one of the most powerful and widely used normality tests, especially for small to medium sample sizes (n < 5000).

In simple terms:

“The Shapiro–Wilk test checks if your data follow a bell-shaped normal distribution.”

When to Use It

Continuous data - Works on numerical values
Small to medium samples - Best for n < 5000
Need to choose test type - Helps decide between parametric and non-parametric tests
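As a rough illustration of the last point, the sketch below uses the Shapiro–Wilk p-value to pick between a parametric and a non-parametric two-sample test. The helper name `compare_two_groups` is hypothetical, and the choice of Welch's t-test and the Mann–Whitney U test is an assumption of this example; the article does not prescribe specific tests.

```python
import numpy as np
from scipy import stats

def compare_two_groups(a, b, alpha=0.05):
    """Pick a two-sample test from a Shapiro-Wilk pre-check (hypothetical helper)."""
    _, p_a = stats.shapiro(a)
    _, p_b = stats.shapiro(b)
    if p_a > alpha and p_b > alpha:
        # No evidence against normality in either group: use a parametric test
        _, p_value = stats.ttest_ind(a, b, equal_var=False)
        return "Welch t-test", p_value
    # At least one group looks non-normal: fall back to a rank-based test
    _, p_value = stats.mannwhitneyu(a, b, alternative="two-sided")
    return "Mann-Whitney U", p_value

group_a = np.array([10, 12, 11, 14, 13, 15, 12, 11, 13, 14], dtype=float)
group_b = group_a + 2.0  # same shape as group_a, shifted mean
rng = np.random.default_rng(0)
group_c = rng.exponential(scale=2.0, size=200)  # strongly right-skewed

test_ab, p_ab = compare_two_groups(group_a, group_b)
test_ac, p_ac = compare_two_groups(group_a, group_c)
print(test_ab, f"p={p_ab:.4f}")
print(test_ac, f"p={p_ac:.4f}")
```

The two symmetric groups should be routed to the t-test, while the pair that includes the skewed group should be routed to the rank-based test.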

When Not to Use It

Categorical data - Not suitable
Large samples - Even tiny deviations appear significant (use visual + other tests too)
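The large-sample caveat can be seen in a small experiment. The sketch below draws a mildly skewed sample (a gamma distribution with shape 20, chosen here purely for illustration) and runs the test on the first 30 values and on all 4000. With this many observations the test has enough power to flag the slight skew, even though W itself stays close to 1.

```python
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(42)

# Mildly right-skewed data: close to a bell shape, but not exactly normal
mild = rng.gamma(shape=20.0, scale=1.0, size=4000)

w_small, p_small = shapiro(mild[:30])  # small subsample
w_large, p_large = shapiro(mild)       # full sample

print(f"n=30:   W={w_small:.3f}, p={p_small:.4f}")
print(f"n=4000: W={w_large:.3f}, p={p_large:.4g}")
```

For large samples, a W near 1 alongside a tiny p-value is a hint that the deviation may be statistically detectable but practically small, which is why a visual check is worth adding.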

Hypotheses

H₀ (Null Hypothesis) - The data are normally distributed
H₁ (Alternative Hypothesis) - The data are not normally distributed

Test Statistic

The Shapiro–Wilk test computes a W statistic, which measures how close your sample’s distribution is to a normal one.

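$$W = \frac{\left(\sum_{i=1}^{n} a_i \, x_{(i)}\right)^2}{\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2}$$

Where:

x_(i): the i-th smallest sample value (sample sorted smallest → largest)
a_i: constants derived from the expected values and covariances of the order statistics of a standard normal sample of size n
x̄: sample mean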

If W ≈ 1, data are close to normal.
If W is much smaller, data deviate from normality.
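To make the statistic concrete, the sketch below computes a W-like value by hand and compares it with SciPy's result. It is not the exact Shapiro–Wilk computation: it swaps in the simpler Shapiro–Francia weights, which normalize approximate expected normal order statistics (Blom's plotting positions) and ignore their covariances. The helper name `shapiro_francia_w` and the 0.375 / 0.25 constants belong to this sketch, not to `scipy.stats.shapiro`.

```python
import numpy as np
from scipy.stats import norm, shapiro

def shapiro_francia_w(sample):
    """Approximate W with Shapiro-Francia weights (not SciPy's exact algorithm)."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    # Blom's approximation to the expected normal order statistics m_i
    ranks = np.arange(1, n + 1)
    m = norm.ppf((ranks - 0.375) / (n + 0.25))
    # Normalise so the weight vector has unit length: a = m / ||m||
    a = m / np.sqrt(np.sum(m ** 2))
    numerator = np.sum(a * x) ** 2
    denominator = np.sum((x - x.mean()) ** 2)
    return numerator / denominator

data = np.array([10, 12, 11, 14, 13, 15, 12, 11, 13, 14])
w_approx = shapiro_francia_w(data)
w_scipy, _ = shapiro(data)
print(f"Hand-computed W' = {w_approx:.3f}, scipy W = {w_scipy:.3f}")
```

The two values should be close but not identical, because SciPy uses more refined weights. Either way, the numerator measures how well the sorted data line up with what a normal sample would look like, and the denominator is the usual sum of squared deviations from the mean.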

Decision Rule

p > 0.05 - Fail to reject H₀ → data look normal
p ≤ 0.05 - Reject H₀ → data not normal

Example in Python

Let’s check if a dataset follows a normal distribution.

import numpy as np
from scipy.stats import shapiro

# Example data
data = np.array([10, 12, 11, 14, 13, 15, 12, 11, 13, 14])

# Perform the Shapiro–Wilk test
stat, p = shapiro(data)
print(f"Shapiro–Wilk Statistic: {stat:.3f}")
print(f"P-value: {p:.4f}")

# Interpret at the 5% significance level
if p > 0.05:
    print("Data look normally distributed (fail to reject H₀).")
else:
    print("Data are not normally distributed (reject H₀).")

Example output:

Shapiro–Wilk Statistic: 0.967
P-value: 0.8234
Data look normally distributed (fail to reject H₀).

Example with Non-Normal Data

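import numpy as np
from scipy.stats import shapiro

# Create right-skewed data from an exponential distribution
skewed = np.random.exponential(scale=2, size=100)

stat, p = shapiro(skewed)
print(f"W={stat:.3f}, p={p:.4f}")

if p > 0.05:
    print("Normal")
else:
    print("Not normal")

Example output (exact values vary from run to run because the sample is random and unseeded):

W=0.812, p=0.0001
Not normal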

Visual Check:

[Figure: Histogram and KDE]

[Figure: Q–Q plot]

In the Q–Q plot, points that follow a straight line suggest the data are roughly normal; curved or S-shaped patterns suggest they are not.
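A minimal sketch of this visual check follows, assuming `scipy.stats.probplot` for the Q–Q coordinates and, optionally, matplotlib for drawing. The helper names `qq_summary` and `plot_visual_check` are made up for this example, and the histogram is drawn without a KDE overlay to keep the dependencies small.

```python
import numpy as np
from scipy import stats

def qq_summary(sample):
    """Return Q-Q coordinates and the correlation r of the fitted straight line."""
    (theoretical, ordered), (slope, intercept, r) = stats.probplot(sample, dist="norm")
    return theoretical, ordered, r

def plot_visual_check(sample, title="sample"):
    """Draw a histogram and a Q-Q plot side by side (requires matplotlib)."""
    import matplotlib.pyplot as plt  # imported lazily so qq_summary works without it
    fig, (ax_hist, ax_qq) = plt.subplots(1, 2, figsize=(10, 4))
    ax_hist.hist(sample, bins="auto", density=True)
    ax_hist.set_title(f"Histogram of {title}")
    stats.probplot(sample, dist="norm", plot=ax_qq)
    ax_qq.set_title(f"Q-Q plot of {title}")
    fig.tight_layout()
    return fig

data = np.array([10, 12, 11, 14, 13, 15, 12, 11, 13, 14])
rng = np.random.default_rng(0)
skewed = rng.exponential(scale=2.0, size=200)

_, _, r_data = qq_summary(data)
_, _, r_skewed = qq_summary(skewed)
print(f"Q-Q correlation, example data: {r_data:.3f}")
print(f"Q-Q correlation, skewed data:  {r_skewed:.3f}")
```

Calling `plot_visual_check(skewed, "skewed data")` and then `matplotlib.pyplot.show()` would display the figures. The correlation `r` is only a numeric stand-in for "how straight is the line"; the plot itself remains the better guide to where the deviation occurs.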

© 2025 SimpleSteps.guide
