Recurrent & Sequence Models (RNN, LSTM, GRU)
Keras Basics
Recurrent neural networks (RNNs) are designed for sequential data, where order matters.
Examples:
- Text (sentences, documents)
- Time series (stock prices, weather)
- Event logs
- DNA sequences
- Audio
In this chapter you’ll learn how to use:
- SimpleRNN
- LSTM (Long Short-Term Memory)
- GRU (Gated Recurrent Unit)
We’ll use the IMDB sentiment analysis dataset again, but this time with RNN layers instead of the simple embedding + pooling network used previously.
Load the IMDB Dataset
We use the top 10,000 most frequent words:
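For example (using tf.keras; the variable names below are reused throughout the chapter):

```python
from tensorflow.keras.datasets import imdb

# Keep only the 10,000 most frequent words; rarer words are dropped
num_words = 10000
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=num_words)

print(len(x_train), "training reviews")
print(len(x_test), "test reviews")
```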
Data:
- x_train[i] is a list of integer word indices
- Length varies per review
Pad Sequences
RNNs require fixed-length sequences, so we pad them:
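A typical choice is to cut every review to the same length (200 here is an illustrative value, not a tuned one):

```python
from tensorflow.keras.preprocessing.sequence import pad_sequences

maxlen = 200  # reviews longer than this are truncated, shorter ones are zero-padded

x_train = pad_sequences(x_train, maxlen=maxlen)
x_test = pad_sequences(x_test, maxlen=maxlen)

print(x_train.shape)  # (25000, 200)
```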
Build an LSTM Model
LSTM networks are the most popular and performant RNN type for text.
LSTM Model Architecture:
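A minimal sketch of the architecture (the embedding size and the 64-unit LSTM are illustrative choices):

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

model = Sequential([
    Embedding(input_dim=num_words, output_dim=32),  # word index -> 32-dim dense vector
    LSTM(64),                                       # processes the sequence step by step
    Dense(1, activation="sigmoid"),                 # probability of positive sentiment
])
```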
Explanation:
- Embedding layer → converts word indices into dense vectors
- LSTM (64 units) → processes the sequence in order
- Sigmoid → binary classification output
Compile the LSTM Model
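This is binary classification, so binary cross-entropy is the natural loss:

```python
model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
```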
Train the LSTM Model
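Something like the following (the epoch count and batch size are typical starting values, not tuned):

```python
history = model.fit(
    x_train, y_train,
    epochs=3,
    batch_size=128,
    validation_split=0.2,  # hold out 20% of the training data for validation
)
```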
LSTMs are slower to train than CNNs or MLPs because the sequence must be processed step by step.
Typical accuracy: 88–92%.
Evaluate
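Evaluate on the held-out test set:

```python
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test accuracy: {test_acc:.3f}")
```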
Predicting Sentiment
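The model outputs one probability per review; values above 0.5 are treated as positive. A quick sketch on the first few test reviews:

```python
probs = model.predict(x_test[:5])
for p in probs:
    label = "positive" if p[0] > 0.5 else "negative"
    print(f"{p[0]:.3f} -> {label}")
```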
Using GRU Instead of LSTM (Faster, Similar Accuracy)
GRUs are a simplified LSTM variant:
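Swap the LSTM layer for a GRU; everything else stays the same (gru_model is just an illustrative name):

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Embedding, GRU, Dense

gru_model = Sequential([
    Embedding(input_dim=num_words, output_dim=32),
    GRU(64),                         # gated like an LSTM, but with fewer parameters
    Dense(1, activation="sigmoid"),
])

gru_model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```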
Train it exactly the same way as the LSTM model:
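```python
gru_model.fit(x_train, y_train, epochs=3, batch_size=128, validation_split=0.2)
```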
GRUs typically:
- Train faster
- Use fewer parameters
- Achieve similar accuracy
Using SimpleRNN (Not Recommended for Long Sequences)
SimpleRNN is the most basic recurrent layer. It is shown here for completeness, but it is not practical for long text.
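For reference only (same pattern as above, with an illustrative 32-unit layer):

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Embedding, SimpleRNN, Dense

rnn_model = Sequential([
    Embedding(input_dim=num_words, output_dim=32),
    SimpleRNN(32),                   # no gating, so gradients vanish on long sequences
    Dense(1, activation="sigmoid"),
])
```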
Training is identical, but accuracy will be much lower, especially on long sequences.
Bidirectional LSTM (Higher Accuracy)
The model reads the sequence forward + backward.
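Wrap the LSTM in a Bidirectional layer:

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Bidirectional, Dense

bi_model = Sequential([
    Embedding(input_dim=num_words, output_dim=32),
    Bidirectional(LSTM(64)),         # one LSTM reads left-to-right, another right-to-left
    Dense(1, activation="sigmoid"),
])
```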
This often reaches 92–94% accuracy.
LSTM/GRU Dropout
RNN layers support two types of dropout: dropout (applied to the layer inputs) and recurrent_dropout (applied to the recurrent connections between timesteps).
This helps prevent overfitting.
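For example (0.2 is an illustrative rate):

```python
from tensorflow.keras.layers import LSTM

# dropout: randomly drops input features at each timestep
# recurrent_dropout: randomly drops recurrent (hidden-state) connections
layer = LSTM(64, dropout=0.2, recurrent_dropout=0.2)
```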
Masking and Variable-Length Sequences
Keras can automatically skip padded values:
Example with masking:
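Setting mask_zero=True on the Embedding layer tells the downstream RNN to ignore timesteps whose input index is 0, which is the value pad_sequences uses for padding:

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

masked_model = Sequential([
    Embedding(input_dim=num_words, output_dim=32, mask_zero=True),  # index 0 = padding, skipped
    LSTM(64),
    Dense(1, activation="sigmoid"),
])
```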
Time Series with RNNs
RNNs also work for forecasting. Given a series split into sliding windows of past values, the model learns to predict the next value:
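A minimal sketch using a synthetic sine wave (the window length, layer size, and epoch count are illustrative):

```python
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Synthetic series: a noisy sine wave with 1,000 points
series = np.sin(np.arange(0, 100, 0.1)) + np.random.normal(scale=0.1, size=1000)

# Sliding windows: each sample is `window` past values, the target is the next value
window = 20
X = np.array([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]
X = X[..., np.newaxis]  # shape (samples, timesteps, features) = (980, 20, 1)

model = Sequential([
    LSTM(32, input_shape=(window, 1)),
    Dense(1),                         # regression output: the next value in the series
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2)
```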