Introduction to Statistical Testing with SciPy
Published Nov 17 2025
SciPy is one of the core scientific libraries in Python.
It provides a vast set of tools for numerical computing, optimisation, integration, interpolation, signal processing, and more.
This guide focuses specifically on SciPy’s statistical testing capabilities, which live inside the scipy.stats module.
SciPy can be used for:
- Optimisation (linear, nonlinear, constrained)
- Numerical integration (solving integrals and differential equations)
- Linear algebra (solvers, decompositions)
- Signal and image processing
- Interpolation and smoothing
- Probability distributions (PDFs, CDFs, sampling)
- Statistical tests and statistical analysis
In this guide, we concentrate on practical statistical testing:
- How to run each test
- How to interpret the results
- When to use each test
- Minimal, ready-to-use code examples
What Are Statistical Tests?
Statistical tests help answer questions such as:
- Do two groups differ significantly?
- Is the average of this sample different from a known value?
- Are these two variables correlated?
- Does this dataset follow a normal distribution?
- Are two categorical variables related?
Most tests follow the same pattern:
- Calculate a test statistic
- Calculate a p-value
- Compare p-value to a significance level (often α = 0.05)
- Decide whether to reject or fail to reject the null hypothesis
SciPy gives you concise functions for each of these.
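The four-step pattern above maps onto a single SciPy call. A minimal sketch with made-up samples (the group values are purely illustrative):

```python
from scipy import stats

# Two illustrative samples (invented data)
group_a = [88, 92, 94, 91, 87, 89, 90, 93]
group_b = [84, 86, 85, 83, 88, 82, 87, 85]

# Steps 1-2: SciPy computes the test statistic and p-value in one call
result = stats.ttest_ind(group_a, group_b)

# Steps 3-4: compare the p-value to the significance level and decide
alpha = 0.05
if result.pvalue < alpha:
    print(f"p = {result.pvalue:.4f} -> reject the null hypothesis")
else:
    print(f"p = {result.pvalue:.4f} -> fail to reject the null hypothesis")
```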
SciPy vs Statsmodels vs Scikit-learn
It’s useful to know where SciPy fits among other major Python libraries.
SciPy
- Focus: numerical and scientific tools
- Includes probability distributions and hypothesis testing
- Does not perform full statistical modelling (e.g., linear regression summaries)
Statsmodels
- Focus: classical statistics and econometrics
- Provides regression models, ANOVA tables, time-series models
- Includes z-tests and proportion tests (which SciPy does not)
Scikit-learn
- Focus: machine learning
- Predictive modelling, not inference
- No hypothesis tests or p-values
For statistical testing alone, SciPy + Statsmodels is the right toolkit.
The scipy.stats Module
All of SciPy's statistical tests live in the scipy.stats module, which contains:
Parametric tests
- One-sample t-test — stats.ttest_1samp
- Independent t-test — stats.ttest_ind
- Paired t-test — stats.ttest_rel
- ANOVA — stats.f_oneway
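A quick sketch of two of these parametric calls, using invented data:

```python
from scipy import stats

# One-sample t-test: is the mean of `scores` different from 5.0?
scores = [5.1, 4.9, 5.3, 5.0, 4.8, 5.2, 5.1, 4.9]
t1 = stats.ttest_1samp(scores, popmean=5.0)

# One-way ANOVA: do these three groups share a common mean?
g1, g2, g3 = [3, 4, 5, 4], [8, 9, 7, 8], [3, 4, 4, 3]
anova = stats.f_oneway(g1, g2, g3)

print(t1.pvalue, anova.pvalue)
```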
Non-parametric tests
- Mann–Whitney U — stats.mannwhitneyu
- Wilcoxon — stats.wilcoxon
- Kruskal–Wallis — stats.kruskal
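For example, the Mann–Whitney U test on illustrative values (no normality assumption needed):

```python
from scipy import stats

# Invented ordinal-like samples
treated = [7, 9, 8, 10, 9, 8]
control = [4, 3, 5, 4, 2, 3]

# Mann-Whitney U compares two independent samples by ranks
res = stats.mannwhitneyu(treated, control, alternative="two-sided")
print(res.statistic, res.pvalue)
```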
Normality tests
- Shapiro–Wilk — stats.shapiro
- Kolmogorov–Smirnov — stats.kstest
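A sketch of the Shapiro–Wilk test on synthetic normal data (seeded for reproducibility); its null hypothesis is that the data are normally distributed, so a large p-value means no evidence against normality:

```python
import numpy as np
from scipy import stats

# Synthetic data drawn from a normal distribution
rng = np.random.default_rng(0)
sample = rng.normal(loc=0, scale=1, size=200)

# Shapiro-Wilk normality test: returns a statistic and a p-value
stat, p = stats.shapiro(sample)
print(stat, p)
```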
Chi-square tests
- Goodness of fit — stats.chisquare
- Independence — stats.chi2_contingency
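A sketch of a chi-square test of independence on an invented 2×2 contingency table (rows = group, columns = outcome):

```python
from scipy import stats

# Illustrative counts: the two rows show clearly different outcome rates
table = [[30, 10],
         [10, 30]]

# chi2_contingency returns the statistic, p-value, degrees of freedom,
# and the table of expected counts under independence
chi2, p, dof, expected = stats.chi2_contingency(table)
print(chi2, p, dof)
```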
Correlation
- Pearson — stats.pearsonr
- Spearman — stats.spearmanr
- Kendall tau — stats.kendalltau
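A sketch comparing Pearson and Spearman on invented, roughly linear data; each call returns a correlation coefficient and a p-value:

```python
from scipy import stats

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]  # roughly 2 * x

r, p = stats.pearsonr(x, y)       # linear correlation
rho, p_s = stats.spearmanr(x, y)  # rank (monotonic) correlation
print(r, rho)
```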
Probability distributions
- stats.norm, stats.binom, stats.poisson, etc.
- PDFs, CDFs, sampling, percentiles
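For instance, the standard normal distribution exposes its density, CDF, and inverse CDF (percentile function) directly:

```python
from scipy import stats

density = stats.norm.pdf(0)       # density of N(0, 1) at x = 0
prob = stats.norm.cdf(1.96)       # P(X <= 1.96)
quantile = stats.norm.ppf(0.975)  # inverse CDF: the 97.5th percentile
print(density, prob, quantile)
```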
Random variables and sampling
- stats.norm.rvs, stats.uniform.rvs, stats.binom.rvs
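The .rvs methods draw random samples; passing random_state makes the draws reproducible. A small sketch:

```python
from scipy import stats

# 1000 reproducible draws from N(10, 2)
samples = stats.norm.rvs(loc=10, scale=2, size=1000, random_state=42)
print(samples.mean())  # sample mean lies close to loc=10
```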
In this guide, we focus on the practical usage of the tests most relevant to data analysis.
Data Structures Accepted by scipy.stats
Most SciPy tests accept:
- Python lists
- NumPy arrays
- Pandas Series
- Any array-like iterable
All tests return:
- a test statistic
- a p-value
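As an illustration (with made-up numbers), the same test accepts a list, a NumPy array, and a pandas Series interchangeably, and each call returns an object carrying the statistic and the p-value:

```python
import numpy as np
import pandas as pd
from scipy import stats

data_list = [1.2, 1.4, 1.1, 1.3, 1.5, 1.2]
data_array = np.array(data_list)
data_series = pd.Series(data_list)

# The same test works on all three input types
results = [stats.ttest_1samp(sample, popmean=1.0)
           for sample in (data_list, data_array, data_series)]

for res in results:
    print(res.statistic, res.pvalue)  # identical for every input type
```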
The p-Value and Interpretation (Practically)
We keep this simple and practical:
If p < 0.05:
- The result is statistically significant
- We reject the null hypothesis
- There is likely a real effect or difference
If p ≥ 0.05:
- Not significant
- We fail to reject the null hypothesis
- The evidence is insufficient to claim a difference