Practical Statistical Scenarios

SciPy - Statistical Testing

Published Nov 17 2025


Tags: Python, SciPy, Statistics

This section shows how to pick the right statistical test and run a full workflow for common real-world situations:

  • A/B testing (web experiments)
  • Before/after tests (clinical & performance improvements)
  • Group comparisons (treatment groups, product versions)
  • Survey & Likert-scale analysis
  • Categorical comparisons and correlation

Each scenario follows the same test selection workflow: check assumptions → run the test → compute an effect size → interpret.

The focus is practical: what tests to run, how to run them, and how to interpret the results.
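
As a rough map of the scenarios that follow, the selection logic can be sketched as a small lookup function. The function name, the argument names, and the outcome labels are hypothetical and exist only for illustration; the independent t-test branch is not covered by the scenarios and is included only as the standard parametric counterpart of Mann-Whitney U.

```python
def suggest_test(outcome, groups=2, paired=False, normal=True):
    """Return the test used in this guide for a given data situation.

    outcome: 'binary', 'categorical', 'ordinal' or 'continuous' (hypothetical labels)
    groups:  number of groups being compared
    paired:  True for before/after measurements on the same units
    normal:  result of the normality check (only relevant for continuous data)
    """
    if outcome == "binary":
        return "two-proportion z-test"
    if outcome == "categorical":
        return "chi-square (Fisher's exact if expected counts < 5)"

    # Ordinal data always goes to rank-based tests, whatever `normal` says
    parametric = outcome == "continuous" and normal

    if paired:
        return "paired t-test" if parametric else "Wilcoxon signed-rank"
    if groups >= 3:
        return "one-way ANOVA" if parametric else "Kruskal-Wallis"
    # Two independent groups; the parametric branch is not covered in this guide
    return "independent t-test" if parametric else "Mann-Whitney U"


print(suggest_test("continuous", paired=True))             # before/after data
print(suggest_test("ordinal", groups=3))                   # Likert, 3+ groups
print(suggest_test("continuous", groups=3, normal=False))  # skewed revenue data
```

Real decisions also depend on sample size and variance checks, which each scenario handles in its own steps.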






Scenario 1 — A/B Test (Conversion Rates)

Binary outcome → Proportion test

Question:
Is version B’s conversion rate higher than A’s?


Data:

  • Group A: 200 conversions out of 2500 visits
  • Group B: 260 conversions out of 2480 visits

Step 1: Extract counts

import numpy as np
from statsmodels.stats.proportion import proportions_ztest

count = np.array([200, 260])
nobs = np.array([2500, 2480])

Step 2: Run the test

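# Two-sided by default (H1: the rates differ). For the directional question,
# pass alternative='smaller' to test H1: p_A < p_B.
stat, p = proportions_ztest(count, nobs)
print(stat, p)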
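
To make the test less of a black box, here is a minimal sketch of the pooled two-proportion z-test computed by hand with NumPy and SciPy. It assumes the pooled-variance form of the statistic, so it should agree with the library call above; the variable names are illustrative.

```python
import numpy as np
from scipy import stats

count = np.array([200, 260])    # conversions in A, B
nobs = np.array([2500, 2480])   # visits in A, B

p_hat = count / nobs                 # observed conversion rate per group
p_pool = count.sum() / nobs.sum()    # pooled rate under H0: p_A == p_B

# Standard error of the difference under H0
se = np.sqrt(p_pool * (1 - p_pool) * (1 / nobs).sum())

z = (p_hat[0] - p_hat[1]) / se           # A minus B, so negative when B is ahead
p_two_sided = 2 * stats.norm.sf(abs(z))  # H1: rates differ
p_one_sided = stats.norm.cdf(z)          # H1: p_A < p_B

print(z, p_two_sided, p_one_sided)
```

When the observed difference points in the hypothesised direction, the one-sided p-value is half the two-sided one.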

Step 3: Interpret

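  • p < 0.05 → the conversion rates differ (the default test is two-sided)
  • Check the observed proportions to see which version is ahead:
p_A = 200/2500   # 0.080
p_B = 260/2480   # about 0.105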

Step 4: Effect size (Cohen’s h)

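def cohens_h(p1, p2):
    # Difference of arcsine-transformed proportions; the sign follows p1 - p2
    return 2*np.arcsin(np.sqrt(p1)) - 2*np.arcsin(np.sqrt(p2))

# Negative here because p_A < p_B. Cohen's rough benchmarks for |h|: 0.2 small, 0.5 medium, 0.8 large.
print(cohens_h(p_A, p_B))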

Step 5: Confidence intervals

from statsmodels.stats.proportion import proportion_confint
# 95% intervals by default (alpha=0.05, normal approximation), one per group
print(proportion_confint(200, 2500))
print(proportion_confint(260, 2480))
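
The per-group intervals do not directly show how large the lift is. A minimal sketch of a 95% Wald interval for the difference B minus A, assuming the simple unpooled normal approximation (other interval methods exist and may behave better for small samples):

```python
import numpy as np
from scipy import stats

conv_a, n_a = 200, 2500
conv_b, n_b = 260, 2480

p_a, p_b = conv_a / n_a, conv_b / n_b
diff = p_b - p_a                     # absolute lift of B over A

# Unpooled standard error of the difference (Wald interval)
se = np.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
z_crit = stats.norm.ppf(0.975)       # critical value for a two-sided 95% interval

ci_low, ci_high = diff - z_crit * se, diff + z_crit * se
print(diff, (ci_low, ci_high))
```

An interval that excludes zero tells the same story as the significant z-test, and its width shows how precisely the lift is estimated.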





Scenario 2 — Before/After Measurements (Paired Data)

Continuous → Paired t-test or Wilcoxon

Question:
Did a new process improve response time?


Data:

before = [105, 110, 120, 100, 115]
after = [98, 100, 109, 95, 102]

Step 1: Check normality of differences

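from scipy import stats
diff = np.array(after) - np.array(before)
print(stats.shapiro(diff))

If:

  • p ≥ 0.05 → no evidence against normality, use the paired t-test
  • p < 0.05 → use Wilcoxon signed-rank

With only five pairs the Shapiro–Wilk test has little power, so treat a non-significant result as weak reassurance rather than proof of normality.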

Step 2 (Option A): Paired t-test

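# Differences are before - after, so a positive t means response times dropped.
# For the directional question, pass alternative='greater' (SciPy 1.6+).
t, p = stats.ttest_rel(before, after)
print(t, p)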

Step 2 (Option B): Wilcoxon (non-parametric)

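# With n = 5 the smallest possible two-sided exact p-value is 2/2**5 = 0.0625,
# so this test cannot reach p < 0.05 on a sample this small.
w, p = stats.wilcoxon(before, after)
print(w, p)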
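
The check-then-choose rule from Steps 1 and 2 can be wrapped in one helper. This is a sketch only: the function name and the returned labels are hypothetical, and the 0.05 cut-off is the same convention used above.

```python
import numpy as np
from scipy import stats


def paired_test(before, after, alpha=0.05):
    """Pick a paired test from the Shapiro-Wilk result on the differences."""
    before = np.asarray(before, dtype=float)
    after = np.asarray(after, dtype=float)

    # Normality is checked on the paired differences, not on each sample
    p_normal = stats.shapiro(after - before).pvalue

    if p_normal >= alpha:
        stat, p = stats.ttest_rel(before, after)
        return "paired t-test", stat, p

    stat, p = stats.wilcoxon(before, after)
    return "wilcoxon", stat, p


before = [105, 110, 120, 100, 115]
after = [98, 100, 109, 95, 102]
print(paired_test(before, after))
```

Choosing the test from the same data it is applied to is a pragmatic shortcut; with very small samples it is often safer to decide on the test before looking at the data.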

Step 3: Effect size

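def cohens_d_paired(x, y):
    # Mean of the paired differences (y - x) divided by their sample SD
    d = np.array(y) - np.array(x)
    return np.mean(d) / np.std(d, ddof=1)

# Negative here: after - before, i.e. response times went down
print(cohens_d_paired(before, after))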
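
Alongside the standardized effect size, it often helps to report the improvement in the original units. A minimal sketch of a 95% confidence interval for the mean difference, assuming the differences are roughly normal (the same assumption as the paired t-test):

```python
import numpy as np
from scipy import stats

before = np.array([105, 110, 120, 100, 115])
after = np.array([98, 100, 109, 95, 102])

diff = before - after            # positive = time saved
mean_diff = diff.mean()
sem = stats.sem(diff)            # sample SD / sqrt(n)

# 95% interval from the t distribution with n - 1 degrees of freedom
ci_low, ci_high = stats.t.interval(0.95, len(diff) - 1, loc=mean_diff, scale=sem)
print(mean_diff, (ci_low, ci_high))
```

The interval is in the same units as the measurements, which makes it easier to judge whether the improvement matters in practice.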





Scenario 3 — Comparing 3+ Groups (ANOVA or Kruskal–Wallis)

Independent groups → ANOVA or Kruskal-Wallis

Question:
Do three marketing channels differ in average revenue per user?


Data:

group_A = [12, 15, 14, 13]
group_B = [18, 17, 19, 16]
group_C = [10, 11, 12, 9]
groups = [group_A, group_B, group_C]

Step 1: Check normality of each group

for g in groups:
    print(stats.shapiro(g))

Step 2: Check equal variances

print(stats.levene(group_A, group_B, group_C))

Choose:

  • ANOVA if every group looks normal and the variances are roughly equal
  • Kruskal–Wallis otherwise

ANOVA

f, p = stats.f_oneway(group_A, group_B, group_C)
print(f, p)

Kruskal–Wallis

h, p = stats.kruskal(group_A, group_B, group_C)
print(h, p)
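
The two assumption checks and the choice between the tests can be combined into a single helper. This is a sketch under the same 0.05 convention; the function name and the returned labels are hypothetical.

```python
from scipy import stats


def compare_groups(*groups, alpha=0.05):
    """Run one-way ANOVA when both checks pass, otherwise Kruskal-Wallis."""
    # Shapiro-Wilk on each group, Levene across all groups
    normal = all(stats.shapiro(g).pvalue >= alpha for g in groups)
    equal_var = stats.levene(*groups).pvalue >= alpha

    if normal and equal_var:
        stat, p = stats.f_oneway(*groups)
        return "anova", stat, p

    stat, p = stats.kruskal(*groups)
    return "kruskal", stat, p


group_A = [12, 15, 14, 13]
group_B = [18, 17, 19, 16]
group_C = [10, 11, 12, 9]
print(compare_groups(group_A, group_B, group_C))
```

With four observations per group the assumption checks have little power, so the result of the helper is a starting point rather than a verdict.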

Effect size (eta-squared)

def eta_squared_anova(groups):
    # Share of total variance explained by group membership: SS_between / SS_total
    all_data = np.concatenate(groups)
    grand_mean = np.mean(all_data)
    ss_between = sum(len(g)*(np.mean(g)-grand_mean)**2 for g in groups)
    ss_total = sum((x-grand_mean)**2 for x in all_data)
    return ss_between / ss_total

print(eta_squared_anova(groups))

If the overall test is significant → run post-hoc tests to find which pairs of groups differ

Example: Tukey’s HSD (via statsmodels)
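
The text points to statsmodels for Tukey's HSD. As an alternative that avoids the extra dependency, the sketch below uses SciPy's own `stats.tukey_hsd`, which assumes SciPy 1.8 or newer; the group labels are illustrative.

```python
from scipy import stats

group_A = [12, 15, 14, 13]
group_B = [18, 17, 19, 16]
group_C = [10, 11, 12, 9]

res = stats.tukey_hsd(group_A, group_B, group_C)   # requires SciPy 1.8+
print(res)                                         # table of all pairwise comparisons

# res.statistic[i, j] is the mean difference between group i and group j,
# res.pvalue[i, j] the p-value adjusted for multiple comparisons
labels = ["A", "B", "C"]
for i in range(len(labels)):
    for j in range(i + 1, len(labels)):
        print(labels[i], "vs", labels[j], res.statistic[i, j], res.pvalue[i, j])
```

Tukey's HSD assumes the same conditions as ANOVA, so it belongs after a significant ANOVA rather than after Kruskal–Wallis.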






Scenario 4 — Survey / Likert-Scale Analysis

Ordinal data → non-parametric tests

Likert data (1–5 ratings) is ordinal, not interval.


Group example:

satisfaction_A = [4, 5, 4, 3, 5]
satisfaction_B = [3, 4, 3, 2, 4]

Recommended tests

  • 2 groups → Mann–Whitney U
  • 3+ groups → Kruskal–Wallis
  • Paired → Wilcoxon

Example

print(stats.mannwhitneyu(satisfaction_A, satisfaction_B))

Effect size

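# rank-biserial correlation
# SciPy 1.7+ returns U for the first sample, so 2U/(n1*n2) - 1 is positive
# when group A tends to rate higher than group B.
u, p = stats.mannwhitneyu(satisfaction_A, satisfaction_B)
rbc = (2*u)/(len(satisfaction_A)*len(satisfaction_B)) - 1
print(rbc)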
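
The rank-biserial correlation has a direct reading: compare every rating in A with every rating in B and take the share of pairs that A wins minus the share that B wins. A minimal sketch with NumPy only, using the convention that a positive value means group A tends to rate higher:

```python
import numpy as np

satisfaction_A = np.array([4, 5, 4, 3, 5])
satisfaction_B = np.array([3, 4, 3, 2, 4])

# Compare every A rating with every B rating (5 x 5 = 25 pairs)
delta = satisfaction_A[:, None] - satisfaction_B[None, :]
wins = int((delta > 0).sum())     # pairs where A rated higher
losses = int((delta < 0).sum())   # pairs where B rated higher
ties = int((delta == 0).sum())    # pairs with equal ratings

rbc = (wins - losses) / delta.size
print(wins, losses, ties, rbc)    # 17 2 6 0.6
```

Here A wins 17 of the 25 pairs and loses 2, so the effect size is 0.6 in favour of A.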





Scenario 5 — Categorical Comparisons (Chi-Square)

Two categorical variables → Chi-square or Fisher's

Example:
Does product preference differ by age group?


Data:

table = np.array([
    # Young: Prefer A / B
    [30, 10],
    # Older: Prefer A / B
    [20, 40]
])

Chi-square test

# For 2×2 tables SciPy applies Yates' continuity correction by default (correction=True)
chi2, p, dof, expected = stats.chi2_contingency(table)
print(chi2, p)

If any expected count is below 5 → Fisher’s exact test

odds_ratio, p = stats.fisher_exact(table)
print(odds_ratio, p)
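
The expected-count rule can be applied automatically. The sketch below is one way to do it, with a hypothetical function name and a fallback that is deliberately limited to 2×2 tables; the second table is made-up data used only to trigger the Fisher branch.

```python
import numpy as np
from scipy import stats


def categorical_test(table, min_expected=5):
    """Chi-square unless an expected count is small, then Fisher's exact (2x2 only)."""
    table = np.asarray(table)
    chi2, p, dof, expected = stats.chi2_contingency(table)

    if expected.min() >= min_expected:
        return "chi-square", chi2, p

    if table.shape != (2, 2):
        raise ValueError("this sketch only falls back to Fisher's exact for 2x2 tables")

    odds_ratio, p = stats.fisher_exact(table)
    return "fisher", odds_ratio, p


print(categorical_test([[30, 10], [20, 40]]))   # expected counts of 20 and 30
print(categorical_test([[3, 1], [1, 4]]))       # made-up small table, expected counts below 5
```

Note that the second value in the returned tuple is a chi-square statistic in one branch and an odds ratio in the other, so only the p-value is comparable between them.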

Effect size (Cramér’s V)

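def cramers_v(table):
    # Use the uncorrected statistic: Yates' correction (the 2×2 default) would shrink V
    chi2, p, dof, expected = stats.chi2_contingency(table, correction=False)
    n = table.sum()
    k = min(table.shape) - 1
    return np.sqrt(chi2 / (n * k))

print(cramers_v(table))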





Scenario 6 — Correlation Testing

Question:
Is time spent on site related to revenue?

time_spent = [1,2,3,4,5]
revenue = [10,20,25,24,28]

Step 1: Check linearity

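Make a scatter plot (see Plot below) and check whether the trend looks roughly like a straight line.


Step 2: Choose test

  • Pearson → linear relationship
  • Spearman → monotonic relationship (not necessarily linear)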

Test

corr, p = stats.pearsonr(time_spent, revenue)
print(corr, p)
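
If the scatter plot shows a trend that rises but is not a straight line, the Spearman option from Step 2 can be run the same way. A minimal sketch on the same data:

```python
from scipy import stats

time_spent = [1, 2, 3, 4, 5]
revenue = [10, 20, 25, 24, 28]

# Spearman works on ranks, so it only asks whether revenue rises with time spent
rho, p = stats.spearmanr(time_spent, revenue)
print(rho, p)
```

Revenue dips once (25 to 24), so the ranks are not perfectly ordered and rho comes out at 0.9 rather than 1.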

Plot

import matplotlib.pyplot as plt
plt.scatter(time_spent, revenue)
plt.show()

© 2025 SimpleSteps.guide