Scikit-Learn Models
End-to-End Machine Learning: Titanic Survival Prediction
Published Nov 18 2025
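We start with two widely used classifiers: logistic regression and random forest. Each is wrapped in a pipeline with the preprocessing step and evaluated on the test set with accuracy and ROC-AUC.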
Logistic Regression
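from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.pipeline import Pipeline

# preprocessor, X_train, X_test, y_train and y_test are assumed to be defined in earlier sections.
log_reg = Pipeline([
    ("prep", preprocessor),
    ("clf", LogisticRegression(max_iter=1000)),
])
log_reg.fit(X_train, y_train)

pred_lr = log_reg.predict(X_test)              # hard class labels
prob_lr = log_reg.predict_proba(X_test)[:, 1]  # probability of the positive class (second column)

print("LogReg Accuracy:", accuracy_score(y_test, pred_lr))
print("LogReg ROC-AUC:", roc_auc_score(y_test, prob_lr))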
Output:
LogReg Accuracy: 0.8324022346368715
LogReg ROC-AUC: 0.8699604743083004
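The two metrics above score different outputs: accuracy uses the hard labels from predict, while ROC-AUC uses the ranking given by the second column of predict_proba. The following is a minimal sketch of that relationship on synthetic data from make_classification (an assumption for illustration, not the Titanic data). The names X_demo, y_demo, clf, p_pos and thresholded are hypothetical and do not appear elsewhere in this guide.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (stand-in, NOT the Titanic dataset).
X_demo, y_demo = make_classification(n_samples=500, n_features=6, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=0
)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

proba = clf.predict_proba(X_te)   # shape (n_samples, 2), columns follow clf.classes_
p_pos = proba[:, 1]               # probability of clf.classes_[1], here class 1
hard = clf.predict(X_te)          # hard labels

# For a binary logistic regression, predict is equivalent to thresholding
# the positive-class probability at 0.5.
thresholded = (p_pos > 0.5).astype(int)

print("classes_:", clf.classes_)
print("labels match 0.5 threshold:", np.array_equal(hard, thresholded))
print("accuracy (uses labels):", accuracy_score(y_te, hard))
print("ROC-AUC (uses probabilities):", roc_auc_score(y_te, p_pos))
```

Because ROC-AUC depends only on how the probabilities rank the samples, it does not change if the 0.5 cut-off is moved, whereas accuracy does.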
Plot a ROC Curve for Logistic Regression:
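import matplotlib.pyplot as plt
from sklearn.metrics import RocCurveDisplay

# Draw the ROC curve from the predicted positive-class probabilities.
RocCurveDisplay.from_predictions(
    y_test,
    prob_lr,
    name="Logistic Regression",
    color="blue",
)
# Diagonal reference line: the expected curve of a random classifier.
plt.plot([0, 1], [0, 1], "k--", label="Chance")
plt.title("ROC Curve - Logistic Regression")
plt.legend()
plt.show()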
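To make the plot less of a black box, here is a small sketch of what the ROC curve and its area are computed from. It uses four hand-made scores (illustrative values, not Titanic predictions) and the scikit-learn functions roc_curve and auc. The variable names are hypothetical.

```python
import numpy as np
from sklearn.metrics import auc, roc_auc_score, roc_curve

# Tiny hand-made example (illustrative values only).
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

# Each threshold yields one (false positive rate, true positive rate) point.
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# Area under the piecewise-linear curve (trapezoidal rule).
area = auc(fpr, tpr)

# Rank view of the same number: the fraction of (positive, negative) pairs
# in which the positive sample gets the higher score (no ties in this example).
pos = y_score[y_true == 1]
neg = y_score[y_true == 0]
pairwise = np.mean(pos[:, None] > neg[None, :])

print("FPR:", fpr)
print("TPR:", tpr)
print("auc:", area)
print("roc_auc_score:", roc_auc_score(y_true, y_score))
print("pairwise ranking:", pairwise)
```

In this example three of the four positive/negative pairs are ordered correctly, so all three values come out to 0.75.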

Confusion Matrix Heatmap
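import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix

# Rows are actual classes, columns are predicted classes.
cm = confusion_matrix(y_test, pred_lr)
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix - Logistic Regression")
plt.show()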
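The heatmap cells map directly onto the usual metrics. As a sketch, assuming a binary problem with labels 0 and 1, the four cells can be unpacked with ravel() and turned into accuracy, precision and recall. The labels below are made up for illustration, and cm_demo is a hypothetical name.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Made-up labels for illustration (not Titanic predictions).
y_true = np.array([0, 0, 0, 1, 1, 1, 1, 0])
y_pred = np.array([0, 1, 0, 1, 0, 1, 1, 0])

# scikit-learn layout: rows = actual class, columns = predicted class.
cm_demo = confusion_matrix(y_true, y_pred)

# For binary labels, ravel() returns the cells in the order tn, fp, fn, tp.
tn, fp, fn, tp = cm_demo.ravel()

accuracy = (tp + tn) / cm_demo.sum()
precision = tp / (tp + fp)   # of the predicted positives, how many were right
recall = tp / (tp + fn)      # of the actual positives, how many were found

print(cm_demo)
print("accuracy:", accuracy)
print("precision:", precision, "| sklearn:", precision_score(y_true, y_pred))
print("recall:", recall, "| sklearn:", recall_score(y_true, y_pred))
```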

Random Forest
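from sklearn.ensemble import RandomForestClassifier

# Same preprocessing step as before, with a random forest as the classifier.
rf = Pipeline([
    ("prep", preprocessor),
    ("clf", RandomForestClassifier(
        n_estimators=200,         # number of trees
        random_state=42,          # reproducible results
        class_weight="balanced",  # weight classes inversely to their frequency
    )),
])
rf.fit(X_train, y_train)

pred_rf = rf.predict(X_test)              # hard class labels
prob_rf = rf.predict_proba(X_test)[:, 1]  # probability of the positive class

print("RandomForest Accuracy:", accuracy_score(y_test, pred_rf))
print("RandomForest ROC-AUC:", roc_auc_score(y_test, prob_rf))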
Output:
RandomForest Accuracy: 0.8212290502793296
RandomForest ROC-AUC: 0.8373517786561265
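The class_weight="balanced" setting used above weights each class by n_samples / (n_classes * np.bincount(y)), as described in the scikit-learn documentation. A minimal sketch of that formula on a made-up 6-to-4 label split (not the Titanic class ratio), checked against the compute_class_weight utility:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Made-up imbalanced labels: six samples of class 0, four of class 1.
y_demo = np.array([0] * 6 + [1] * 4)
classes = np.unique(y_demo)

# "balanced" formula: n_samples / (n_classes * count of each class).
manual = len(y_demo) / (len(classes) * np.bincount(y_demo))

# The same weights from scikit-learn's helper.
from_sklearn = compute_class_weight(class_weight="balanced", classes=classes, y=y_demo)

print("manual:      ", manual)
print("from sklearn:", from_sklearn)
```

The minority class receives the larger weight (1.25 versus roughly 0.83 here), so errors on it count more during training.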
Plot a ROC Curve for Random Forest:
RocCurveDisplay.from_predictions(
    y_test,
    prob_rf,
    name="Random Forest",
    color="blue",
)
plt.plot([0, 1], [0, 1], "k--", label="Chance")
plt.title("ROC Curve - Random Forest")
plt.legend()
plt.show()

Confusion Matrix Heatmap
cm = confusion_matrix(y_test, pred_rf)
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix - Random Forest")
plt.show()















