Probability Distribution Functions - PMF, PDF & CDF
Maths: Statistics for machine learning
Published Oct 22 2025, updated Oct 23 2025
A probability distribution describes how probabilities are assigned to possible values of a random variable.
In simple terms:
It tells you how likely different outcomes are.
There are two main types of random variables:
- Discrete → countable outcomes (e.g., rolling a die)
- Continuous → uncountably many outcomes within a range (e.g., height, time)
Different mathematical functions describe the probability behaviour for each type:
Type | Function | Description |
Discrete | PMF — Probability Mass Function | Probability for each specific outcome |
Continuous | PDF — Probability Density Function | Probability density over a range of values |
Both | CDF — Cumulative Distribution Function | Probability up to a certain value |

Probability Mass Function (PMF)
The PMF gives the probability of each discrete value of a random variable.
It applies to discrete data — outcomes you can count.

Example
Rolling a fair die:
X (Value) | 1 | 2 | 3 | 4 | 5 | 6 |
P(X=x) | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 |
Each possible value has an equal probability, and all probabilities sum to 1.
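As a minimal sketch, the die table above can be written as a lookup from outcome to probability. The names `die_pmf` and `pmf` are illustrative choices, and `Fraction` is used only so the sum comes out exactly 1 rather than approximately.

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face maps to probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def pmf(x):
    """Return P(X = x); values outside the support have probability 0."""
    return die_pmf.get(x, Fraction(0))

total = sum(die_pmf.values())        # exact arithmetic, so this is exactly 1
p_even = pmf(2) + pmf(4) + pmf(6)    # P(X is even), adding individual masses

print(total, p_even)                 # prints: 1 1/2
```

Adding masses like this works only for discrete variables, where each outcome carries its own probability.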
Probability Density Function (PDF)
The PDF describes the probability density for continuous random variables.
You can’t assign a nonzero probability to a single point: for a continuous variable, P(X = x) = 0 for every exact value x. What you can measure is the probability of a range of values.

The area under the curve between a and b gives that probability: P(a ≤ X ≤ b) = ∫ f(x) dx, integrated from a to b.
Example
Heights of people in cm → continuous variable
If f(x) is the PDF, then:
- P(160 ≤ X ≤ 170) = area under curve between 160 and 170
- Total area under the curve = 1
- The height of the curve at each point is a density, not a probability, so it can exceed 1
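The height example can be sketched numerically. The guide does not give a distribution for heights, so the normal model below (mean 170 cm, standard deviation 10 cm) is purely an assumption for illustration, and `area_under_pdf` is a hypothetical helper that approximates the area with the midpoint rule.

```python
from statistics import NormalDist

# Assumed height model for illustration only: mean 170 cm, std dev 10 cm.
heights = NormalDist(mu=170, sigma=10)

def area_under_pdf(dist, a, b, steps=10_000):
    """Approximate the integral of dist.pdf over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(dist.pdf(a + (i + 0.5) * width) for i in range(steps)) * width

p_range = area_under_pdf(heights, 160, 170)     # P(160 <= X <= 170) as an area
p_exact = heights.cdf(170) - heights.cdf(160)   # same probability via the CDF
density_at_165 = heights.pdf(165)               # a density value, not a probability

print(round(p_range, 4), round(p_exact, 4))
```

Under these assumed parameters both routes give roughly 0.34, which is the probability mass between one standard deviation below the mean and the mean itself.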
Common PDFs:
- Normal (Gaussian) - Natural data like height, weight, errors
- Exponential - Time until an event occurs (e.g., waiting time)
- Uniform - Equal likelihood across a range
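The three densities listed above can be written directly from their standard formulas. This is a sketch with illustrative parameter defaults (`mu`, `sigma`, `rate`, `low`, `high` are assumed names), and `integrate` is a simple midpoint-rule helper used only to check that each curve encloses an area of about 1.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal (Gaussian) density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def exponential_pdf(x, rate=1.0):
    """Exponential density for a waiting time with the given event rate."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0

def uniform_pdf(x, low=0.0, high=1.0):
    """Uniform density: constant on [low, high], zero elsewhere."""
    return 1.0 / (high - low) if low <= x <= high else 0.0

def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

normal_area = integrate(normal_pdf, -10, 10)        # tails beyond +/-10 are negligible
exponential_area = integrate(exponential_pdf, 0, 50)
uniform_area = integrate(lambda x: uniform_pdf(x, 0.0, 0.5), 0.0, 0.5)

print(uniform_pdf(0.25, 0.0, 0.5))  # density is 2.0 here, yet the total area is still 1
```

The uniform case on [0, 0.5] is a handy reminder that a density value above 1 is perfectly valid as long as the total area stays at 1.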
Cumulative Distribution Function (CDF)
The CDF gives the probability that a random variable is less than or equal to a value x: F(x) = P(X ≤ x).

It’s the cumulative sum (for discrete data) or integral (for continuous data) of probabilities up to x.
Key Properties:
- Never decreases, rising from 0 to 1 as x grows
- Continuous (no jumps) for continuous distributions
- Step-shaped for discrete distributions
Example
For rolling a fair die:
x | 1 | 2 | 3 | 4 | 5 | 6 |
P(X ≤ x) | 1/6 | 2/6 | 3/6 | 4/6 | 5/6 | 1 |
The CDF tells us the chance the outcome is ≤ a given value. For the die, it is 0 below x = 1, jumps by 1/6 at each face, and reaches exactly 1 at x = 6, showing the accumulated probability up to each point.
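The die CDF can be sketched as a running total of the PMF. The names `cdf_values`, `die_cdf`, and `cdf` are illustrative; `Fraction` reduces 2/6 to 1/3 and so on, so the printed values are the table's entries in lowest terms.

```python
import math
from fractions import Fraction
from itertools import accumulate

faces = range(1, 7)
pmf_values = [Fraction(1, 6)] * 6

# Running total of the PMF gives P(X <= x) at each face.
cdf_values = list(accumulate(pmf_values))
die_cdf = dict(zip(faces, cdf_values))

def cdf(x):
    """P(X <= x) for any real x: a step function that jumps at each face."""
    if x < 1:
        return Fraction(0)
    if x >= 6:
        return Fraction(1)
    return die_cdf[math.floor(x)]

print([str(v) for v in cdf_values])  # ['1/6', '1/3', '1/2', '2/3', '5/6', '1']
```

Evaluating `cdf(3.5)` returns the same value as `cdf(3)`, which is the flat part of a step between two faces.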
How They Relate
Function | Works With | Represents | Key Feature |
PMF | Discrete data | Probability of exact value | Sum = 1 |
PDF | Continuous data | Probability density at a point | Area = 1 |
CDF | Both types | Probability ≤ x | Non-decreasing from 0 to 1 |
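One way to see how the three functions connect is to go backwards from the CDF: differences of a discrete CDF recover the PMF, and the slope of a continuous CDF recovers the PDF. The sketch below assumes the fair die for the discrete case and a standard normal for the continuous case; `cdf_slope` is a hypothetical helper using a central difference.

```python
from fractions import Fraction
from statistics import NormalDist

# Discrete: P(X = x) = F(x) - F(x - 1) for the fair die.
die_cdf = {x: Fraction(x, 6) for x in range(0, 7)}   # F(0) = 0 included for convenience
die_pmf = {x: die_cdf[x] - die_cdf[x - 1] for x in range(1, 7)}

# Continuous: the PDF is the slope of the CDF.
std_normal = NormalDist()   # standard normal, chosen only as an example

def cdf_slope(dist, x, h=1e-5):
    """Central-difference estimate of the derivative of dist.cdf at x."""
    return (dist.cdf(x + h) - dist.cdf(x - h)) / (2 * h)

slope = cdf_slope(std_normal, 0.5)
density = std_normal.pdf(0.5)

print(die_pmf[3], round(slope, 6), round(density, 6))
```

The estimated slope and the density agree to several decimal places, which is the numerical face of "the PDF is the derivative of the CDF".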
In Machine Learning
- PMF - Discrete models (e.g., categorical distributions, classification probabilities)
- PDF - Continuous probability models (e.g., Gaussian Naive Bayes, anomaly detection)
- CDF - Computing probabilities, quantiles, or thresholds (e.g., the sigmoid activation is the CDF of the standard logistic distribution)
- Distributions - Used for sampling, likelihood estimation, and probabilistic models (e.g., Bayesian networks, GANs, probabilistic regression)
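Two of the uses listed above can be sketched with the standard library alone. The first part shows the sigmoid behaving as a CDF (it is the CDF of the standard logistic distribution, and `logit` is its inverse). The second part picks a threshold from a quantile; the standard-normal anomaly-score model and the 1% cutoff are assumptions made up for this example.

```python
import math
from statistics import NormalDist

def sigmoid(z):
    """Logistic sigmoid: maps any real z into (0, 1) and never decreases."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Inverse of the sigmoid: the quantile function of the standard logistic distribution."""
    return math.log(p / (1.0 - p))

# CDF-like behaviour: values rise steadily from near 0 to near 1.
values = [sigmoid(z) for z in (-4, -2, 0, 2, 4)]
is_monotone = all(a < b for a, b in zip(values, values[1:]))

# Thresholding via a quantile, assuming scores follow a standard normal (illustrative).
scores = NormalDist(mu=0.0, sigma=1.0)
threshold = scores.inv_cdf(0.99)   # scores above this fall in the top 1%

print(is_monotone, round(threshold, 2))
```

With real data the score distribution would be estimated rather than assumed, but the step of turning a target tail probability into a cutoff through the inverse CDF stays the same.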














