Introduction to Statistical Testing with SciPy
Published Nov 17 2025
SciPy is one of the core scientific libraries in Python.
It provides a vast set of tools for numerical computing, optimisation, integration, interpolation, signal processing, and more.
This guide focuses specifically on SciPy’s statistical testing capabilities, which live inside the scipy.stats module.
SciPy can be used for:
- Optimisation (linear, nonlinear, constrained)
- Numerical integration (solving integrals and differential equations)
- Linear algebra (solvers, decompositions)
- Signal and image processing
- Interpolation and smoothing
- Probability distributions (PDFs, CDFs, sampling)
- Statistical tests and statistical analysis
In this guide, we concentrate on practical statistical testing:
- How to run each test
- How to interpret the results
- When to use each test
- Minimal, ready-to-use code examples
What Are Statistical Tests?
Statistical tests help answer questions such as:
- Do two groups differ significantly?
- Is the average of this sample different from a known value?
- Are these two variables correlated?
- Does this dataset follow a normal distribution?
- Are two categorical variables related?
Most tests follow the same pattern:
- Calculate a test statistic
- Calculate a p-value
- Compare p-value to a significance level (often α = 0.05)
- Decide whether to reject or fail to reject the null hypothesis
SciPy gives you concise functions for each of these.
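The four-step pattern above maps onto a single SciPy call. A minimal sketch with made-up samples (the group values are purely illustrative):

```python
from scipy import stats

# Two illustrative samples (invented data)
group_a = [88, 92, 94, 91, 87, 89, 90, 93]
group_b = [84, 86, 85, 83, 88, 82, 87, 85]

# Steps 1-2: SciPy computes the test statistic and p-value in one call
result = stats.ttest_ind(group_a, group_b)

# Steps 3-4: compare the p-value to the significance level and decide
alpha = 0.05
if result.pvalue < alpha:
    print(f"p = {result.pvalue:.4f} -> reject the null hypothesis")
else:
    print(f"p = {result.pvalue:.4f} -> fail to reject the null hypothesis")
```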
SciPy vs Statsmodels vs Scikit-learn
It’s useful to know where SciPy fits among other major Python libraries.
SciPy
- Focus: numerical and scientific tools
- Includes probability distributions and hypothesis testing
- Does not perform full statistical modelling (e.g., linear regression summaries)
Statsmodels
- Focus: classical statistics and econometrics
- Provides regression models, ANOVA tables, time-series models
- Includes z-tests and proportion tests (which SciPy does not)
Scikit-learn
- Focus: machine learning
- Predictive modelling, not inference
- No hypothesis tests or p-values
For statistical testing alone, SciPy + Statsmodels is the right toolkit.
The scipy.stats Module
All of SciPy's statistical tests live in the scipy.stats module, which contains:
Parametric tests
- One-sample t-test — stats.ttest_1samp
- Independent t-test — stats.ttest_ind
- Paired t-test — stats.ttest_rel
- ANOVA — stats.f_oneway
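A quick sketch of two of these parametric calls, using invented data:

```python
from scipy import stats

# One-sample t-test: is the mean of `scores` different from 5.0?
scores = [5.1, 4.9, 5.3, 5.0, 4.8, 5.2, 5.1, 4.9]
t1 = stats.ttest_1samp(scores, popmean=5.0)

# One-way ANOVA: do these three groups share a common mean?
g1, g2, g3 = [3, 4, 5, 4], [8, 9, 7, 8], [3, 4, 4, 3]
anova = stats.f_oneway(g1, g2, g3)

print(t1.pvalue, anova.pvalue)
```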
Non-parametric tests
- Mann–Whitney U — stats.mannwhitneyu
- Wilcoxon — stats.wilcoxon
- Kruskal–Wallis — stats.kruskal
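For example, the Mann–Whitney U test on illustrative values (no normality assumption needed):

```python
from scipy import stats

# Invented ordinal-like samples
treated = [7, 9, 8, 10, 9, 8]
control = [4, 3, 5, 4, 2, 3]

# Mann-Whitney U compares two independent samples by ranks
res = stats.mannwhitneyu(treated, control, alternative="two-sided")
print(res.statistic, res.pvalue)
```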
Normality tests
- Shapiro–Wilk — stats.shapiro
- Kolmogorov–Smirnov — stats.kstest
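A sketch of the Shapiro–Wilk test on synthetic normal data (seeded for reproducibility); its null hypothesis is that the data are normally distributed, so a large p-value means no evidence against normality:

```python
import numpy as np
from scipy import stats

# Synthetic data drawn from a normal distribution
rng = np.random.default_rng(0)
sample = rng.normal(loc=0, scale=1, size=200)

# Shapiro-Wilk normality test: returns a statistic and a p-value
stat, p = stats.shapiro(sample)
print(stat, p)
```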
Chi-square tests
- Goodness of fit — stats.chisquare
- Independence — stats.chi2_contingency
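A sketch of a chi-square test of independence on an invented 2×2 contingency table (rows = group, columns = outcome):

```python
from scipy import stats

# Illustrative counts: the two rows show clearly different outcome rates
table = [[30, 10],
         [10, 30]]

# chi2_contingency returns the statistic, p-value, degrees of freedom,
# and the table of expected counts under independence
chi2, p, dof, expected = stats.chi2_contingency(table)
print(chi2, p, dof)
```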
Correlation
- Pearson — stats.pearsonr
- Spearman — stats.spearmanr
- Kendall tau — stats.kendalltau
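A sketch comparing Pearson and Spearman on invented, roughly linear data; each call returns a correlation coefficient and a p-value:

```python
from scipy import stats

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]  # roughly 2 * x

r, p = stats.pearsonr(x, y)       # linear correlation
rho, p_s = stats.spearmanr(x, y)  # rank (monotonic) correlation
print(r, rho)
```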
Probability distributions
- stats.norm, stats.binom, stats.poisson, etc.
- PDFs, CDFs, sampling, percentiles
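For instance, the standard normal distribution exposes its density, CDF, and inverse CDF (percentile function) directly:

```python
from scipy import stats

density = stats.norm.pdf(0)       # density of N(0, 1) at x = 0
prob = stats.norm.cdf(1.96)       # P(X <= 1.96)
quantile = stats.norm.ppf(0.975)  # inverse CDF: the 97.5th percentile
print(density, prob, quantile)
```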
Random variables and sampling
- stats.norm.rvs, stats.uniform.rvs, stats.binom.rvs
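The .rvs methods draw random samples; passing random_state makes the draws reproducible. A small sketch:

```python
from scipy import stats

# 1000 reproducible draws from N(10, 2)
samples = stats.norm.rvs(loc=10, scale=2, size=1000, random_state=42)
print(samples.mean())  # sample mean lies close to loc=10
```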
In this guide, we focus on the practical usage of the tests most relevant to data analysis.
Data Structures Accepted by scipy.stats
Most SciPy tests accept:
- Python lists
- NumPy arrays
- Pandas Series
- Any array-like iterable
All tests return:
- a test statistic
- a p-value
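As an illustration (with made-up numbers), the same test accepts a list, a NumPy array, and a pandas Series interchangeably, and each call returns an object carrying the statistic and the p-value:

```python
import numpy as np
import pandas as pd
from scipy import stats

data_list = [1.2, 1.4, 1.1, 1.3, 1.5, 1.2]
data_array = np.array(data_list)
data_series = pd.Series(data_list)

# The same test works on all three input types
results = [stats.ttest_1samp(sample, popmean=1.0)
           for sample in (data_list, data_array, data_series)]

for res in results:
    print(res.statistic, res.pvalue)  # identical for every input type
```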
The p-Value and Interpretation (Practically)
We keep this simple and practical:
If p < 0.05:
- The result is statistically significant
- We reject the null hypothesis
- There is likely a real effect or difference
If p ≥ 0.05:
- Not significant
- We fail to reject the null hypothesis
- The evidence is insufficient to claim a difference