Practical Statistical Scenarios

SciPy - Statistical Testing

Published Nov 17 2025

Python, SciPy, Statistics

This section shows how to pick the right statistical test and run a full workflow for common real-world situations:

A/B testing (web experiments)
Before/after tests (clinical and performance improvements)
Survey and Likert-scale analysis
Group comparisons (treatment groups, product versions)
Categorical comparisons and correlation testing

Each scenario follows the same test selection workflow: check assumptions → run the test → compute an effect size → interpret.

The focus is practical: what tests to run, how to run them, and how to interpret the results.
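
The selection logic used across the scenarios can be summarised as a small lookup function. This is only a sketch: `choose_test` and its parameter names are hypothetical helpers invented for illustration, and the returned strings are labels, not library calls.

```python
def choose_test(outcome, n_groups=2, paired=False, assumptions_ok=True):
    """Map a study design to a test name, following the scenarios in this guide.

    outcome: "binary", "continuous", "ordinal" or "categorical"
    assumptions_ok: normality (and equal variances for 3+ groups) look reasonable
    """
    if outcome not in ("binary", "continuous", "ordinal", "categorical"):
        raise ValueError(f"unknown outcome type: {outcome!r}")
    if outcome == "binary":
        return "two-proportion z-test"
    if outcome == "categorical":
        return "chi-square (Fisher's exact if expected counts < 5)"
    if outcome == "ordinal":
        # Ordinal data always takes the non-parametric route
        assumptions_ok = False
    if paired:
        return "paired t-test" if assumptions_ok else "Wilcoxon signed-rank"
    if n_groups >= 3:
        return "one-way ANOVA" if assumptions_ok else "Kruskal-Wallis"
    # Two independent groups; the parametric branch is not covered in this guide
    return "independent t-test" if assumptions_ok else "Mann-Whitney U"


print(choose_test("continuous", paired=True, assumptions_ok=False))
print(choose_test("ordinal", n_groups=3))
```

Treat the output as a starting point; the assumption checks in each scenario still need to be run on the actual data.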

Scenario 1 — A/B Test (Conversion Rates)

Binary outcome → Proportion test

Question:
Is version B’s conversion rate higher than A’s?

Data:

Group A: 200 conversions out of 2500 visits
Group B: 260 conversions out of 2480 visits

Step 1: Extract counts

import numpy as np

from statsmodels.stats.proportion import proportions_ztest

count = np.array([200, 260])

nobs = np.array([2500, 2480])

Step 2: Run the test

stat, p = proportions_ztest(count, nobs)

print(stat, p)

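Step 3: Interpret

p < 0.05 → conversion rates differ. The default test is two-sided; because the question is directional (is B higher than A?), you can pass alternative='smaller' to proportions_ztest to test whether A's rate is below B's.
Check the observed proportions to confirm the direction: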

p_A = 200/2500

p_B = 260/2480
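
To see what the test is doing, here is a minimal sketch of a pooled two-proportion z-test written with the standard library only. The function name `two_proportion_ztest` is hypothetical, and the pooled standard error is an assumption about how the comparison is set up (it should correspond to the usual null hypothesis of equal rates).

```python
import math


def two_proportion_ztest(x1, n1, x2, n2):
    """Pooled two-proportion z-test.

    Returns (z, two-sided p-value, one-sided p-value for H1: p1 < p2).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Standard normal CDF via the error function
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    p_two_sided = 2 * min(cdf, 1 - cdf)
    p_less = cdf  # P(Z <= z): small when group 1's rate is below group 2's
    return z, p_two_sided, p_less


# Group A: 200/2500, Group B: 260/2480
z, p_two, p_less = two_proportion_ztest(200, 2500, 260, 2480)
print(z, p_two, p_less)
```

A negative z here means A's observed rate is below B's. When the observed difference is in the hypothesised direction, the one-sided p-value is half the two-sided one.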

Step 4: Effect size (Cohen’s h)

def cohens_h(p1, p2):
    # Difference between the arcsine-transformed proportions
    return 2*np.arcsin(np.sqrt(p1)) - 2*np.arcsin(np.sqrt(p2))

# Negative when the second proportion (B) is the larger one
print(cohens_h(p_A, p_B))
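
A statistically significant result can still be a small effect. The sketch below attaches Cohen's conventional benchmarks (0.2 small, 0.5 medium, 0.8 large) to the value of h. The `label_effect` helper and its label strings are hypothetical, and the benchmarks are rules of thumb, not hard cut-offs.

```python
import math


def cohens_h(p1, p2):
    """Cohen's h: difference between arcsine-transformed proportions."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))


def label_effect(h):
    """Label |h| using Cohen's conventional benchmarks."""
    size = abs(h)
    if size < 0.2:
        return "below small"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


h = cohens_h(200 / 2500, 260 / 2480)
print(h, label_effect(h))
```

With large samples such as these, a difference can be clearly significant while h stays under the "small" benchmark, so report both the p-value and the effect size.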

Step 5: Confidence intervals

from statsmodels.stats.proportion import proportion_confint

# Defaults: alpha=0.05 (a 95% interval) and method='normal'; each call covers one group's rate
print(proportion_confint(200, 2500))

print(proportion_confint(260, 2480))
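
The per-group intervals above do not directly answer how large the lift is. One option is a normal-approximation (Wald) interval for the difference in rates, sketched here with the standard library. The `diff_confint` name is hypothetical, and the Wald interval is one choice among several.

```python
import math
from statistics import NormalDist


def diff_confint(x1, n1, x2, n2, alpha=0.05):
    """Wald confidence interval for p2 - p1 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    diff = p2 - p1
    return diff - z * se, diff + z * se


low, high = diff_confint(200, 2500, 260, 2480)
print(low, high)
```

If the interval excludes zero, it agrees with a significant two-sided test at the same alpha, and its width shows how precisely the lift is estimated.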

Scenario 2 — Before/After Measurements (Paired Data)

Continuous → Paired t-test or Wilcoxon

Question:
Did a new process improve response time?

Data:

before = [105, 110, 120, 100, 115]

after = [98, 100, 109, 95, 102]

Step 1: Check normality of differences

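from scipy import stats

diff = np.array(after) - np.array(before)

print(stats.shapiro(diff))

If:

p ≥ 0.05 → use paired t-test
p < 0.05 → use Wilcoxon signed-rank

With only five pairs, the Shapiro–Wilk test has little power to detect non-normality, so treat this check as a rough guide.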

Step 2 (Option A): Paired t-test

t, p = stats.ttest_rel(before, after)

print(t, p)

Step 2 (Option B): Wilcoxon (non-parametric)

w, p = stats.wilcoxon(before, after)

print(w, p)
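
With very small samples the signed-rank test has a floor on its p-value. The sketch below computes the exact two-sided p-value by enumerating every sign pattern. It is for illustration only: `wilcoxon_exact_p` is a hypothetical helper, and it assumes no zero differences and no tied absolute differences.

```python
from itertools import product


def wilcoxon_exact_p(x, y):
    """Exact two-sided Wilcoxon signed-rank p-value by full enumeration."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    # Rank the absolute differences from 1 (smallest) to n (largest)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Under H0 each of the 2**n sign patterns is equally likely
    totals = [
        sum(r for r, s in zip(ranks, signs) if s)
        for signs in product([0, 1], repeat=n)
    ]
    p_low = sum(t <= w_plus for t in totals) / len(totals)
    p_high = sum(t >= w_plus for t in totals) / len(totals)
    return min(1.0, 2 * min(p_low, p_high))


before = [105, 110, 120, 100, 115]
after = [98, 100, 109, 95, 102]
print(wilcoxon_exact_p(before, after))  # 0.0625
```

Every pair improved, yet the smallest two-sided p-value possible with five pairs is 2/32 = 0.0625. The Wilcoxon route therefore cannot reach p < 0.05 on this dataset, and more pairs are needed.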

Step 3: Effect size

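def cohens_d_paired(x, y):
    # d = y - x (after - before), so a negative value means response time dropped;
    # stats.ttest_rel(x, y) tests x - y and therefore reports the opposite sign
    d = np.array(y) - np.array(x)
    return np.mean(d) / np.std(d, ddof=1)

print(cohens_d_paired(before, after))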
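
The paired effect size and the paired t statistic are built from the same ingredients: t equals d times the square root of the number of pairs. Here is a standard-library sketch, with differences taken as before minus after (an assumed sign convention, so that positive means a reduction).

```python
import math
from statistics import mean, stdev

before = [105, 110, 120, 100, 115]
after = [98, 100, 109, 95, 102]

# Before minus after: positive values mean response time went down
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)

d_z = mean(diffs) / stdev(diffs)                   # paired Cohen's d
t = mean(diffs) / (stdev(diffs) / math.sqrt(n))    # paired t statistic

print(d_z, t, d_z * math.sqrt(n))
```

Because t grows with the square root of n while d does not, d is the better number to compare across studies with different sample sizes.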

Scenario 3 — Comparing 3+ Groups (ANOVA or Kruskal–Wallis)

Independent groups → ANOVA or Kruskal-Wallis

Question:
Do three marketing channels differ in average revenue per user?

Data:

group_A = [12, 15, 14, 13]

group_B = [18, 17, 19, 16]

group_C = [10, 11, 12, 9]

groups = [group_A, group_B, group_C]

Step 1: Check normality of each group

for g in groups:
    print(stats.shapiro(g))

Step 2: Check equal variances

print(stats.levene(group_A, group_B, group_C))

Choose:

ANOVA if normal & equal variances
Kruskal–Wallis if not

ANOVA

f, p = stats.f_oneway(group_A, group_B, group_C)

print(f, p)

Kruskal–Wallis

h, p = stats.kruskal(group_A, group_B, group_C)

print(h, p)

Effect size (eta-squared)

def eta_squared_anova(groups):
    all_data = np.concatenate(groups)
    grand_mean = np.mean(all_data)
    # Variation explained by group membership
    ss_between = sum(len(g)*(np.mean(g)-grand_mean)**2 for g in groups)
    # Total variation around the grand mean
    ss_total = sum((x-grand_mean)**2 for x in all_data)
    return ss_between / ss_total

print(eta_squared_anova(groups))
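
Eta-squared and the F statistic come from the same sums of squares. The sketch below works through the decomposition with the standard library and checks that eta-squared can be recovered from F and its degrees of freedom. Variable names are illustrative only.

```python
from statistics import mean

groups = [[12, 15, 14, 13], [18, 17, 19, 16], [10, 11, 12, 9]]
all_data = [x for g in groups for x in g]
grand = mean(all_data)

# Between-group and within-group sums of squares
ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

df_between = len(groups) - 1
df_within = len(all_data) - len(groups)

f_stat = (ss_between / df_between) / (ss_within / df_within)
eta_sq = ss_between / (ss_between + ss_within)
# Same quantity expressed through F and its degrees of freedom
eta_from_f = f_stat * df_between / (f_stat * df_between + df_within)

print(f_stat, eta_sq, eta_from_f)
```

The second formula is handy when a report gives only F and the degrees of freedom.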

If significant → run post-hoc tests

Example: Tukey’s HSD (via statsmodels; recent SciPy versions also provide scipy.stats.tukey_hsd)
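
A possible post-hoc step is sketched below using `scipy.stats.tukey_hsd`, used here in place of the statsmodels route mentioned above. It assumes SciPy 1.8 or later, where this function is available. The group labels are illustrative.

```python
from scipy import stats

group_A = [12, 15, 14, 13]
group_B = [18, 17, 19, 16]
group_C = [10, 11, 12, 9]

res = stats.tukey_hsd(group_A, group_B, group_C)

# res.statistic[i, j] is the mean difference between groups i and j,
# res.pvalue[i, j] the p-value adjusted for all pairwise comparisons
labels = ["A", "B", "C"]
for i in range(len(labels)):
    for j in range(i + 1, len(labels)):
        print(labels[i], "vs", labels[j], res.statistic[i, j], res.pvalue[i, j])
```

Pairs with an adjusted p-value below your alpha are the ones driving the overall ANOVA result.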

Scenario 4 — Survey / Likert-Scale Analysis

Ordinal data → non-parametric tests

Likert data (1–5 ratings) is ordinal, not interval.

Group example:

satisfaction_A = [4, 5, 4, 3, 5]

satisfaction_B = [3, 4, 3, 2, 4]

Recommended tests

2 groups → Mann–Whitney U
3+ groups → Kruskal–Wallis
Paired → Wilcoxon

Example

print(stats.mannwhitneyu(satisfaction_A, satisfaction_B))

Effect size

# rank-biserial correlation

u, p = stats.mannwhitneyu(satisfaction_A, satisfaction_B)

# SciPy 1.7+ returns U for the first sample, so this is positive when A tends to rate higher
rbc = (2*u)/(len(satisfaction_A)*len(satisfaction_B)) - 1

print(rbc)
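
The rank-biserial correlation has a direct reading: the share of A/B pairs where A rates higher minus the share where B rates higher. This pair-counting sketch (the `rank_biserial` helper is hypothetical) makes the sign convention explicit without relying on which U a library returns.

```python
def rank_biserial(x, y):
    """Share of (x, y) pairs with x > y minus share with x < y; ties count for neither."""
    favorable = sum(a > b for a in x for b in y)
    unfavorable = sum(a < b for a in x for b in y)
    return (favorable - unfavorable) / (len(x) * len(y))


satisfaction_A = [4, 5, 4, 3, 5]
satisfaction_B = [3, 4, 3, 2, 4]

print(rank_biserial(satisfaction_A, satisfaction_B))
```

Of the 25 pairs, A is higher in 17 and lower in 2, giving (17 - 2) / 25 = 0.6. Swapping the arguments flips the sign.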

Scenario 5 — Categorical Comparisons (Chi-Square)

Two categorical variables → Chi-square or Fisher's

Example:
Does product preference differ by age group?

Data:

table = np.array([

# Young: Prefer A / B

[30, 10],

# Older: Prefer A / B

[20, 40]

])

Chi-square test

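# For a 2×2 table, chi2_contingency applies Yates' continuity correction by default (correction=True)
chi2, p, dof, expected = stats.chi2_contingency(table)

print(chi2, p)

If any expected count is below 5 → Fisher’s exact test

# Returns the odds ratio and the p-value
print(stats.fisher_exact(table))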
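
The expected counts behind that rule are row total times column total divided by the grand total. Here is a standard-library sketch of the check. The `expected_counts` helper is hypothetical, and the threshold of 5 is the rule of thumb stated above.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


table = [[30, 10],   # Young: prefer A / B
         [20, 40]]   # Older: prefer A / B

expected = expected_counts(table)
smallest = min(min(row) for row in expected)
test_name = "chi-square" if smallest >= 5 else "Fisher's exact"

print(expected, test_name)
```

For this table the smallest expected count is 20, so the chi-square test is appropriate.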

Effect size (Cramér’s V)

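def cramers_v(table):
    # correction=False: use the uncorrected chi-square, because Yates' correction
    # (the default for 2×2 tables) would shrink V
    chi2, p, dof, expected = stats.chi2_contingency(table, correction=False)
    n = table.sum()
    k = min(table.shape) - 1
    return np.sqrt(chi2 / (n * k))

print(cramers_v(table))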

Scenario 6 — Correlation Testing

Question:
Is time spent on site related to revenue?

time_spent = [1,2,3,4,5]

revenue = [10,20,25,24,28]

Step 1: Check linearity

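Draw a scatter plot (see the Plot step at the end of this scenario) and check whether the points follow a roughly straight line.

Step 2: Choose test

Pearson → linear relationship
Spearman → monotonic relationship (not necessarily linear), or ordinal data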

Test

corr, p = stats.pearsonr(time_spent, revenue)

print(corr, p)
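
If the scatter plot looks monotonic but not linear, the Spearman route uses the same call pattern. This sketch runs both coefficients on the example data so they can be compared side by side.

```python
from scipy import stats

time_spent = [1, 2, 3, 4, 5]
revenue = [10, 20, 25, 24, 28]

r, p_r = stats.pearsonr(time_spent, revenue)        # linear association
rho, p_rho = stats.spearmanr(time_spent, revenue)   # rank-based (monotonic) association

print("Pearson:", r, p_r)
print("Spearman:", rho, p_rho)
```

Here revenue dips once (25 then 24), so one pair of ranks is swapped and Spearman's rho is 0.9, not 1. With only five points, both p-values should be read with caution.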

Plot

import matplotlib.pyplot as plt

plt.scatter(time_spent, revenue)

plt.show()
