Scikit-Learn Models

End-to-End Machine Learning: Titanic Survival Prediction


Published Nov 18 2025


Tags: Keras, Machine Learning, Matplotlib, NumPy, Pandas, Python, scikit-learn, SciPy, Seaborn, TensorFlow

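We start with two widely used classifiers: logistic regression and a random forest. Each one is wrapped in a Pipeline together with the shared preprocessor, so the same transformations are applied when fitting and when predicting.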
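The code in this section relies on `preprocessor`, `X_train`, `X_test`, `y_train` and `y_test`, none of which are defined here. For readers running this section on its own, the following is a minimal sketch of one way such a setup could look. It is an assumption, not the guide's actual preprocessing: the data is synthetic (not the Titanic dataset) and the column names `age`, `fare` and `sex` are hypothetical stand-ins.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data; column names are hypothetical.
rng = np.random.default_rng(42)
n = 300
df = pd.DataFrame({
    "age": rng.normal(30, 12, n),
    "fare": rng.exponential(30, n),
    "sex": rng.choice(["male", "female"], n),
})
# Simulate missing ages so the imputer has something to do.
df.loc[rng.choice(n, 30, replace=False), "age"] = np.nan

# The label depends loosely on sex and fare, plus noise.
signal = (df["sex"] == "female") * 1.5 + df["fare"] / 60 - 1.0
y = (signal + rng.normal(0, 1, n) > 0).astype(int)

# Assumed preprocessing: impute and scale numeric columns, one-hot encode categoricals.
preprocessor = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), ["age", "fare"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["sex"]),
])

X_train, X_test, y_train, y_test = train_test_split(
    df, y, test_size=0.2, stratify=y, random_state=42
)

# Same pipeline shape as used in this section: preprocessing step, then classifier.
model = Pipeline([
    ("prep", preprocessor),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)

pred = model.predict(X_test)
prob = model.predict_proba(X_test)[:, 1]
accuracy = accuracy_score(y_test, pred)
auc = roc_auc_score(y_test, prob)
print("Accuracy:", accuracy)
print("ROC-AUC:", auc)
```

Because the preprocessor lives inside the pipeline, it is fitted on the training split only, which avoids leaking test-set statistics into the imputer or scaler.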



Logistic Regression

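from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.pipeline import Pipeline

# preprocessor, X_train, X_test, y_train and y_test are assumed to be defined already.
log_reg = Pipeline([
    ("prep", preprocessor),
    ("clf", LogisticRegression(max_iter=1000))
])

log_reg.fit(X_train, y_train)

pred_lr = log_reg.predict(X_test)              # hard 0/1 class labels
prob_lr = log_reg.predict_proba(X_test)[:, 1]  # probability of the positive class

print("LogReg Accuracy:", accuracy_score(y_test, pred_lr))
print("LogReg ROC-AUC:", roc_auc_score(y_test, prob_lr))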

Output:

LogReg Accuracy: 0.8324022346368715
LogReg ROC-AUC: 0.8699604743083004
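ROC-AUC is computed from the predicted probabilities rather than the hard predictions because it measures ranking quality: it equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below spells out that definition in plain Python. The labels and scores are made up for illustration (they are not the Titanic results), and `pairwise_auc` is a hypothetical helper, not a scikit-learn function.

```python
def pairwise_auc(y_true, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. The cost is O(n_pos * n_neg), so this
    is meant for illustration, not for large datasets.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = pairwise_auc(toy_labels, toy_scores)
print(toy_auc)  # 0.75: three of the four positive/negative pairs are ordered correctly
```

A score of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is why the ROC plots in this section draw a diagonal "Chance" line for reference.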

Plot a ROC Curve for Logistic Regression:

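import matplotlib.pyplot as plt
from sklearn.metrics import RocCurveDisplay

RocCurveDisplay.from_predictions(
    y_test,
    prob_lr,
    name="Logistic Regression",
    color="blue"
)

# Diagonal reference line: the expected curve for random guessing.
plt.plot([0, 1], [0, 1], "k--", label="Chance")
plt.title("ROC Curve - Logistic Regression")
plt.legend()
plt.show()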

[Figure: logistic regression ROC curve]

Confusion Matrix Heatmap

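import seaborn as sns
from sklearn.metrics import confusion_matrix

# Rows are the actual classes, columns are the predicted classes.
cm = confusion_matrix(y_test, pred_lr)

sns.heatmap(cm, annot=True, fmt="d", cmap="Blues")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix - Logistic Regression")
plt.show()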

[Figure: logistic regression confusion matrix]
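For binary labels 0 and 1, scikit-learn's `confusion_matrix` returns the layout `[[TN, FP], [FN, TP]]`, with actual classes in rows and predicted classes in columns. The sketch below rebuilds that layout in plain Python and derives accuracy, precision and recall from the four cells. The labels are made up for illustration (they are not the Titanic predictions), and `confusion_counts` is a hypothetical helper.

```python
def confusion_counts(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary 0/1 labels.

    Rows index the actual class, columns index the predicted class.
    """
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix


toy_true = [0, 0, 0, 0, 1, 1, 1, 0]
toy_pred = [0, 0, 1, 0, 1, 0, 1, 0]

(tn, fp), (fn, tp) = confusion_counts(toy_true, toy_pred)

accuracy = (tp + tn) / (tp + tn + fp + fn)  # share of all predictions that are correct
precision = tp / (tp + fp)                  # share of predicted positives that are real
recall = tp / (tp + fn)                     # share of real positives that were found

print("Matrix:", [[tn, fp], [fn, tp]])  # [[4, 1], [1, 2]]
print("Accuracy:", accuracy)            # 0.75
print("Precision:", round(precision, 3))
print("Recall:", round(recall, 3))
```

Reading the heatmap this way shows which kind of error a model makes: the top-right cell counts false alarms, the bottom-left cell counts missed positives.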





Random Forest

from sklearn.ensemble import RandomForestClassifier

rf = Pipeline([
    ("prep", preprocessor),
    ("clf", RandomForestClassifier(
        n_estimators=200,        # number of trees in the forest
        random_state=42,         # fixed seed for reproducible results
        class_weight="balanced"  # weight classes inversely to their frequency
    ))
])

rf.fit(X_train, y_train)

pred_rf = rf.predict(X_test)
prob_rf = rf.predict_proba(X_test)[:, 1]

print("RandomForest Accuracy:", accuracy_score(y_test, pred_rf))
print("RandomForest ROC-AUC:", roc_auc_score(y_test, prob_rf))

Output:

RandomForest Accuracy: 0.8212290502793296
RandomForest ROC-AUC: 0.8373517786561265
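The `class_weight="balanced"` setting weights each class inversely to its frequency in the training labels, using `n_samples / (n_classes * count_of_class)`. The sketch below reproduces that formula in plain Python on made-up labels (not the Titanic data). `balanced_class_weights` is a hypothetical helper written for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight per class: n_samples / (n_classes * number of samples in that class)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Imbalanced toy labels: eight negatives, two positives.
imbalanced_labels = [0] * 8 + [1] * 2
weights = balanced_class_weights(imbalanced_labels)
print(weights)  # {0: 0.625, 1: 2.5}
```

The rarer class gets the larger weight, so errors on it cost more during training. With perfectly balanced classes every weight is 1.0 and the setting has no effect.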

Plot a ROC Curve for Random Forest:

RocCurveDisplay.from_predictions(
    y_test,
    prob_rf,
    name="Random Forest",
    color="blue"
)

plt.plot([0, 1], [0, 1], "k--", label="Chance")
plt.title("ROC Curve - Random Forest")
plt.legend()
plt.show()

[Figure: random forest ROC curve]

Confusion Matrix Heatmap

cm = confusion_matrix(y_test, pred_rf)

sns.heatmap(cm, annot=True, fmt="d", cmap="Blues")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix - Random Forest")
plt.show()

[Figure: random forest confusion matrix]

© 2025 SimpleSteps.guide