Model Optimisation - Callbacks & Regularisation
Keras Basics
2 min read
Published Nov 17 2025
Building a model is only half the work; training it effectively is the other half.
In this section you’ll learn:
- Early Stopping
- Model Checkpointing
- Learning Rate Scheduling
- ReduceLROnPlateau
- TensorBoard logging
- Dropout
- L2 Regularisation
- Batch Normalisation
- Combining multiple optimisation techniques
These tools make your models:
- More accurate
- More stable
- Less overfitted
- Easier to monitor
- Easier to resume training
EarlyStopping
Stops training automatically once a monitored metric (typically validation loss) stops improving.
Example:
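A minimal sketch, assuming the standard tensorflow.keras import; the patience value is illustrative:

```python
from tensorflow import keras

# Stop once val_loss has not improved for 5 consecutive epochs,
# then roll the weights back to the best epoch seen.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=5,
    restore_best_weights=True,
)
```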
Use in training:
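The callback goes into the callbacks list of fit() (model and the data arrays are assumed to exist):

```python
history = model.fit(
    x_train, y_train,
    validation_data=(x_val, y_val),
    epochs=100,              # an upper bound; EarlyStopping usually halts sooner
    callbacks=[early_stop],
)
```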
Why it matters:
- Prevents overfitting
- Saves time
- Automatically chooses the best epoch
ModelCheckpoint
Automatically saves the best version of your model.
Use:
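A sketch; the filename is illustrative, and save_best_only keeps only the best-scoring weights on disk:

```python
checkpoint = keras.callbacks.ModelCheckpoint(
    "best_model.keras",      # native Keras format; a ".h5" path also works
    monitor="val_loss",
    save_best_only=True,
)

model.fit(
    x_train, y_train,
    validation_data=(x_val, y_val),
    epochs=50,
    callbacks=[checkpoint],
)
```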
Then load it later:
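```python
best_model = keras.models.load_model("best_model.keras")
```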
ReduceLROnPlateau
Automatically lowers the learning rate when validation loss stops improving.
Useful for:
- Fine-tuning pre-trained networks
- Stubborn plateaus
- Avoiding oscillations
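A typical configuration (a sketch; the factor, patience, and min_lr values are illustrative):

```python
reduce_lr = keras.callbacks.ReduceLROnPlateau(
    monitor="val_loss",
    factor=0.5,      # halve the learning rate on each plateau
    patience=3,      # wait 3 stagnant epochs before acting
    min_lr=1e-6,     # never drop below this floor
)
```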
Learning Rate Schedulers
Custom control over learning rates.
Step decay:
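One way to express it with the LearningRateScheduler callback (a sketch; the decay interval and factor are illustrative):

```python
def step_decay(epoch, lr):
    # Halve the learning rate every 10 epochs; otherwise keep it unchanged.
    if epoch > 0 and epoch % 10 == 0:
        return lr * 0.5
    return lr

lr_callback = keras.callbacks.LearningRateScheduler(step_decay)
```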
Cosine decay example:
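Keras also ships a built-in cosine schedule that attaches directly to an optimiser (the initial rate and step count are illustrative):

```python
cosine = keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=1e-3,
    decay_steps=10_000,   # steps over which the rate decays towards zero
)

optimizer = keras.optimizers.Adam(learning_rate=cosine)
```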
TensorBoard (Training Visualisation)
TensorBoard is essential for monitoring training curves.
Enable it:
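A sketch; the log directory is illustrative:

```python
tensorboard_cb = keras.callbacks.TensorBoard(
    log_dir="logs/fit",
    histogram_freq=1,   # log weight histograms every epoch
)
# Then pass tensorboard_cb in the callbacks list of model.fit().
```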
Start TensorBoard:
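From a terminal, pointing at the same directory:

```
tensorboard --logdir logs/fit
```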
View:
- Loss curves
- Accuracy curves
- Learning rate
- Weight histograms
- Graph structure
Dropout (Regularisation)
Randomly disables a fraction of neurones during training.
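In Keras it is a single layer; a minimal sketch (the rate and layer sizes are illustrative):

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),   # zeroes out 50% of the units at each training step
    layers.Dense(10, activation="softmax"),
])
```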
Typical rates:
- 0.1 – 0.3 for CNNs
- 0.3 – 0.6 for fully connected layers
- RNN layers (e.g. LSTM, GRU) take dedicated dropout and recurrent_dropout arguments
Dropout reduces overfitting by encouraging redundancy and robustness.
L2 Weight Regularisation
Adds a penalty for large weights to reduce overfitting.
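In Keras the penalty is attached per layer via kernel_regularizer (a sketch; the coefficient 1e-4 is illustrative):

```python
from tensorflow.keras import layers, regularizers

dense = layers.Dense(
    128,
    activation="relu",
    kernel_regularizer=regularizers.l2(1e-4),  # adds 1e-4 * sum(w^2) to the loss
)
```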
Use when:
- You have many parameters
- Dataset is small
- Model is overfitting heavily
Batch Normalisation
Normalises activations inside layers, improving stability.
Benefits:
- Faster training
- Can use higher learning rates
- Often boosts accuracy
- Reduces internal covariate shift
Typically placed after Conv/Dense layers, before activation:
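A functional-API sketch (x is assumed to be the incoming tensor; use_bias=False because BatchNormalization supplies its own shift):

```python
x = layers.Conv2D(64, 3, padding="same", use_bias=False)(x)
x = layers.BatchNormalization()(x)
x = layers.Activation("relu")(x)
```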
Combining Regularisation Techniques
A robust CNN layer block might look like this:
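For example (a sketch continuing the functional style above; the filter count and dropout rate are illustrative):

```python
x = layers.Conv2D(64, 3, padding="same", use_bias=False)(x)
x = layers.BatchNormalization()(x)
x = layers.Activation("relu")(x)
x = layers.MaxPooling2D()(x)
x = layers.Dropout(0.2)(x)   # light dropout, as is typical for conv blocks
```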
A robust dense block:
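Again a sketch, with illustrative sizes:

```python
x = layers.Dense(256, use_bias=False,
                 kernel_regularizer=regularizers.l2(1e-4))(x)
x = layers.BatchNormalization()(x)
x = layers.Activation("relu")(x)
x = layers.Dropout(0.4)(x)   # heavier dropout for fully connected layers
```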
Putting It All Together
Example model with multiple optimising components:
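The sketch below assumes 28x28 grayscale inputs and 10 classes; every size and rate is illustrative:

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

inputs = keras.Input(shape=(28, 28, 1))

# Conv block: BatchNorm before the activation, light dropout after pooling
x = layers.Conv2D(32, 3, padding="same", use_bias=False)(inputs)
x = layers.BatchNormalization()(x)
x = layers.Activation("relu")(x)
x = layers.MaxPooling2D()(x)
x = layers.Dropout(0.2)(x)

# Dense block: L2 penalty plus heavier dropout
x = layers.Flatten()(x)
x = layers.Dense(128, use_bias=False,
                 kernel_regularizer=regularizers.l2(1e-4))(x)
x = layers.BatchNormalization()(x)
x = layers.Activation("relu")(x)
x = layers.Dropout(0.4)(x)

outputs = layers.Dense(10, activation="softmax")(x)

model = keras.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```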
Callbacks:
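Wiring the callbacks from earlier sections into one run (the data arrays are assumed; the hyperparameters repeat the illustrative values above):

```python
callbacks = [
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                  restore_best_weights=True),
    keras.callbacks.ModelCheckpoint("best_model.keras", save_best_only=True),
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=3),
    keras.callbacks.TensorBoard(log_dir="logs/fit"),
]

history = model.fit(x_train, y_train,
                    validation_data=(x_val, y_val),
                    epochs=100,
                    callbacks=callbacks)
```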
This combination is used in many production-grade pipelines.