Statistical Hypothesis Testing (SciPy)
End-to-End Machine Learning: Titanic Survival Prediction
Published Nov 18 2025
EDA provides visual evidence, but statistical tests check whether the observed differences are larger than chance alone would plausibly produce.
Women vs Men survival (chi-square test)
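import pandas as pd
from scipy import stats

# Cross-tabulate sex against survival, then test the two for independence
ct = pd.crosstab(titanic["sex"], titanic["survived"])
chi2, p, _, _ = stats.chi2_contingency(ct)
print("Sex vs Survival:", p)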
Output:
Sex vs Survival: 1.197357062775565e-58
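Result: p ≪ 0.05, so the association between sex and survival is highly statistically significant. The p-value says how unlikely this pattern would be if sex and survival were independent, not how large the effect is.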
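To quantify the size of the association, one common choice for contingency tables is Cramér's V, which ranges from 0 (no association) to 1 (perfect association). The following is a minimal sketch, not part of the original guide: `cramers_v` is a helper name introduced here, and the hard-coded counts are assumed to match the sex/survival crosstab of the 891-row Titanic dataset, so check them against your own `ct` before relying on the number.

```python
import numpy as np
from scipy import stats

def cramers_v(table):
    """Cramér's V for an r x c contingency table of counts."""
    table = np.asarray(table, dtype=float)
    # Use the uncorrected Pearson statistic; V is defined in terms of it
    chi2, _, _, _ = stats.chi2_contingency(table, correction=False)
    n = table.sum()
    k = min(table.shape) - 1  # smaller table dimension minus one
    return np.sqrt(chi2 / (n * k))

# Assumed counts: rows = female, male; columns = died, survived
sex_table = np.array([[81, 233],
                      [468, 109]])

v_sex = cramers_v(sex_table)
print(f"Cramér's V (sex vs survival): {v_sex:.2f}")
```

In practice the same function can be called on the crosstab directly, for example `cramers_v(ct)`, since `np.asarray` accepts a DataFrame of counts.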
Children vs Adults survival
titanic["is_child"] = titanic["age"] < 16
ct = pd.crosstab(titanic["is_child"], titanic["survived"])
chi2, p, _, _ = stats.chi2_contingency(ct)
print("Children vs Adults:", p)
Output:
Children vs Adults: 8.005497211300109e-05
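Result: p < 0.001, so survival differs significantly between children (under 16) and everyone else. Note that passengers with a missing age compare as False in `age < 16`, so they are counted in the non-child group.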
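How a comparison such as `titanic["age"] < 16` treats missing values can be checked on a toy Series. The sketch below uses hypothetical ages, not Titanic data, and contrasts the flag as computed in the guide with one possible alternative that drops unknown ages first.

```python
import numpy as np
import pandas as pd

# Hypothetical ages; two passengers have no recorded age
ages = pd.Series([4.0, 30.0, np.nan, 15.0, np.nan])

# Flag as computed in the guide: NaN < 16 evaluates to False,
# so unknown ages land in the "not child" group
is_child_naive = ages < 16

# Alternative: drop unknown ages first, then flag only known ages
known_ages = ages.dropna()
is_child_known = known_ages < 16

print(is_child_naive.value_counts().to_dict())
print(is_child_known.value_counts().to_dict())
```

Whether to keep or exclude passengers with an unknown age is an analysis decision. The p-value reported in this section corresponds to the first variant, where unknown ages are grouped with adults.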
Passenger class
ct = pd.crosstab(titanic["pclass"], titanic["survived"])
chi2, p, _, _ = stats.chi2_contingency(ct)
print("Class vs Survival:", p)
Output:
Class vs Survival: 4.549251711298793e-23
Result: p ≪ 0.05, so the association between passenger class and survival is highly significant.
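`chi2_contingency` also returns the degrees of freedom and the expected counts under independence, which the snippets in this section discard with `_`. The sketch below shows one way to use them to see which cells drive the result. The hard-coded counts are assumed to match the class/survival crosstab of the 891-row dataset, so verify them against your own `ct`.

```python
import numpy as np
from scipy import stats

# Assumed counts: rows = class 1, 2, 3; columns = died, survived
class_table = np.array([[80, 136],
                        [97, 87],
                        [372, 119]])

chi2, p, dof, expected = stats.chi2_contingency(class_table)

# Pearson residuals: positive means more passengers than independence predicts
residuals = (class_table - expected) / np.sqrt(expected)

print("degrees of freedom:", dof)
print("expected counts:\n", expected.round(1))
print("Pearson residuals:\n", residuals.round(2))
```

Cells with large absolute residuals contribute most to the chi-square statistic. Here the "survived" column is positive for first class and negative for third class, which matches the direction seen in the EDA plots.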
Age differences (Welch's t-test)
age_data = titanic[["survived", "age"]].dropna()
died = age_data[age_data["survived"]==0]["age"]
surv = age_data[age_data["survived"]==1]["age"]
t, p = stats.ttest_ind(died, surv, equal_var=False)
print("Age difference:", p)
Output:
Age difference: 0.04118965162586638
Result: p ≈ 0.041, so the difference in mean age between survivors and non-survivors is significant at the 0.05 level, but only marginally. The evidence for an age effect is far weaker than the evidence for sex or class.
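A p-value close to 0.05 is easier to interpret next to an effect size. One common choice for two group means is Cohen's d, the mean difference divided by the pooled standard deviation. The sketch below is an illustration only: `cohens_d` is a helper name introduced here, and the two samples are synthetic stand-ins generated with NumPy, not the Titanic ages.

```python
import numpy as np
from scipy import stats

def cohens_d(a, b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    pooled_var = ((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1)) / (
        len(a) + len(b) - 2
    )
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

# Synthetic stand-ins for the two age samples (hypothetical parameters)
rng = np.random.default_rng(0)
died_demo = rng.normal(loc=31.0, scale=14.0, size=400)
surv_demo = rng.normal(loc=28.0, scale=15.0, size=300)

# Welch's t-test, as in the guide (equal_var=False)
t, p = stats.ttest_ind(died_demo, surv_demo, equal_var=False)
d = cohens_d(died_demo, surv_demo)
print(f"t = {t:.2f}, p = {p:.4f}, Cohen's d = {d:.2f}")
```

With the real data, the call would be `cohens_d(died, surv)` using the two Series built above. A small d alongside a significant p-value would indicate a real but modest difference in ages.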














