Model Optimisation - Callbacks & Regularisation
Keras Basics
2 min read
Published Nov 17 2025
Building a model is only half the work; training it effectively is the other half.
In this section you’ll learn:
- Early Stopping
- Model Checkpointing
- Learning Rate Scheduling
- ReduceLROnPlateau
- TensorBoard logging
- Dropout
- L2 Regularisation
- Batch Normalisation
- Combining multiple optimisation techniques
These tools make your models:
- More accurate
- More stable
- Less overfitted
- Easier to monitor
- Easier to resume training
EarlyStopping
Stops training automatically once a monitored metric (typically validation loss) stops improving.
Example:
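A minimal sketch, assuming the standard tensorflow.keras import; the patience value is illustrative:

```python
from tensorflow import keras

# Stop once val_loss has not improved for 5 consecutive epochs,
# then roll the weights back to the best epoch seen.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=5,
    restore_best_weights=True,
)
```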
Use in training:
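The callback goes into the callbacks list of fit() (model and the data arrays are assumed to exist):

```python
history = model.fit(
    x_train, y_train,
    validation_data=(x_val, y_val),
    epochs=100,              # an upper bound; EarlyStopping usually halts sooner
    callbacks=[early_stop],
)
```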
Why it matters:
- Prevents overfitting
- Saves time
- Automatically chooses the best epoch
ModelCheckpoint
Automatically saves the best version of your model.
Use:
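A sketch; the filename is illustrative, and save_best_only keeps only the best-scoring weights on disk:

```python
checkpoint = keras.callbacks.ModelCheckpoint(
    "best_model.keras",      # native Keras format; a ".h5" path also works
    monitor="val_loss",
    save_best_only=True,
)

model.fit(
    x_train, y_train,
    validation_data=(x_val, y_val),
    epochs=50,
    callbacks=[checkpoint],
)
```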
Then load it later:
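```python
best_model = keras.models.load_model("best_model.keras")
```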
ReduceLROnPlateau
Automatically lowers the learning rate when validation loss stops improving.
Useful for:
- Fine-tuning pre-trained networks
- Stubborn plateaus
- Avoiding oscillations
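A typical configuration (a sketch; the factor, patience, and min_lr values are illustrative):

```python
reduce_lr = keras.callbacks.ReduceLROnPlateau(
    monitor="val_loss",
    factor=0.5,      # halve the learning rate on each plateau
    patience=3,      # wait 3 stagnant epochs before acting
    min_lr=1e-6,     # never drop below this floor
)
```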
Learning Rate Schedulers
Custom control over learning rates.
Step decay:
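One way to express it with the LearningRateScheduler callback (a sketch; the decay interval and factor are illustrative):

```python
def step_decay(epoch, lr):
    # Halve the learning rate every 10 epochs; otherwise keep it unchanged.
    if epoch > 0 and epoch % 10 == 0:
        return lr * 0.5
    return lr

lr_callback = keras.callbacks.LearningRateScheduler(step_decay)
```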
Cosine decay example:
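Keras also ships a built-in cosine schedule that attaches directly to an optimiser (the initial rate and step count are illustrative):

```python
cosine = keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=1e-3,
    decay_steps=10_000,   # steps over which the rate decays towards zero
)

optimizer = keras.optimizers.Adam(learning_rate=cosine)
```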
TensorBoard (Training Visualisation)
TensorBoard is essential for monitoring training curves.
Enable it:
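A sketch; the log directory is illustrative:

```python
tensorboard_cb = keras.callbacks.TensorBoard(
    log_dir="logs/fit",
    histogram_freq=1,   # log weight histograms every epoch
)
# Then pass tensorboard_cb in the callbacks list of model.fit().
```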
Start TensorBoard:
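From a terminal, pointing at the same directory:

```
tensorboard --logdir logs/fit
```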
View:
- Loss curves
- Accuracy curves
- Learning rate
- Weight histograms
- Graph structure
Dropout (Regularisation)
Randomly disables a fraction of neurones during training.
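In Keras it is a single layer; a minimal sketch (the rate and layer sizes are illustrative):

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),   # zeroes out 50% of the units at each training step
    layers.Dense(10, activation="softmax"),
])
```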
Typical rates:
- 0.1 – 0.3 for CNNs
- 0.3 – 0.6 for fully connected layers
- RNN layers (e.g. LSTM, GRU) take dedicated dropout and recurrent_dropout arguments
Dropout reduces overfitting by encouraging redundancy and robustness.
L2 Weight Regularisation
Adds a penalty for large weights to reduce overfitting.
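In Keras the penalty is attached per layer via kernel_regularizer (a sketch; the coefficient 1e-4 is illustrative):

```python
from tensorflow.keras import layers, regularizers

dense = layers.Dense(
    128,
    activation="relu",
    kernel_regularizer=regularizers.l2(1e-4),  # adds 1e-4 * sum(w^2) to the loss
)
```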
Use when:
- You have many parameters
- Dataset is small
- Model is overfitting heavily
Batch Normalisation
Normalises activations inside layers, improving stability.
Benefits:
- Faster training
- Can use higher learning rates
- Often boosts accuracy
- Reduces internal covariate shift
Typically placed after Conv/Dense layers, before activation:
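A functional-API sketch (x is assumed to be the incoming tensor; use_bias=False because BatchNormalization supplies its own shift):

```python
x = layers.Conv2D(64, 3, padding="same", use_bias=False)(x)
x = layers.BatchNormalization()(x)
x = layers.Activation("relu")(x)
```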
Combining Regularisation Techniques
A robust CNN layer block might look like this:
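For example (a sketch continuing the functional style above; the filter count and dropout rate are illustrative):

```python
x = layers.Conv2D(64, 3, padding="same", use_bias=False)(x)
x = layers.BatchNormalization()(x)
x = layers.Activation("relu")(x)
x = layers.MaxPooling2D()(x)
x = layers.Dropout(0.2)(x)   # light dropout, as is typical for conv blocks
```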
A robust dense block:
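Again a sketch, with illustrative sizes:

```python
x = layers.Dense(256, use_bias=False,
                 kernel_regularizer=regularizers.l2(1e-4))(x)
x = layers.BatchNormalization()(x)
x = layers.Activation("relu")(x)
x = layers.Dropout(0.4)(x)   # heavier dropout for fully connected layers
```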
Putting It All Together
Example model with multiple optimising components:
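The sketch below assumes 28x28 grayscale inputs and 10 classes; every size and rate is illustrative:

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

inputs = keras.Input(shape=(28, 28, 1))

# Conv block: BatchNorm before the activation, light dropout after pooling
x = layers.Conv2D(32, 3, padding="same", use_bias=False)(inputs)
x = layers.BatchNormalization()(x)
x = layers.Activation("relu")(x)
x = layers.MaxPooling2D()(x)
x = layers.Dropout(0.2)(x)

# Dense block: L2 penalty plus heavier dropout
x = layers.Flatten()(x)
x = layers.Dense(128, use_bias=False,
                 kernel_regularizer=regularizers.l2(1e-4))(x)
x = layers.BatchNormalization()(x)
x = layers.Activation("relu")(x)
x = layers.Dropout(0.4)(x)

outputs = layers.Dense(10, activation="softmax")(x)

model = keras.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```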
Callbacks:
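Wiring the callbacks from earlier sections into one run (the data arrays are assumed; the hyperparameters repeat the illustrative values above):

```python
callbacks = [
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                  restore_best_weights=True),
    keras.callbacks.ModelCheckpoint("best_model.keras", save_best_only=True),
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=3),
    keras.callbacks.TensorBoard(log_dir="logs/fit"),
]

history = model.fit(x_train, y_train,
                    validation_data=(x_val, y_val),
                    epochs=100,
                    callbacks=callbacks)
```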
This combination is used in many production-grade pipelines.