YData Profiling

Published Sep 29 2025, updated Oct 24 2025

YData Profiling (formerly Pandas Profiling) is a tool that automatically generates detailed exploratory data analysis (EDA) reports from a pandas DataFrame.

It provides summaries such as:

Data types and missing values
Descriptive statistics
Correlations between features
Duplicates, distributions, and warnings
Interactive visualisations

It’s a useful first step after loading a dataset into pandas, and it can replace much of the exploration you would otherwise do by hand.
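For a sense of what the report automates, here is a rough sketch of a few of the same checks done by hand with plain pandas. The toy DataFrame and the `manual_summary` helper are made up for illustration and are not part of YData Profiling.

```python
import pandas as pd


def manual_summary(df: pd.DataFrame) -> dict:
    """Collect by hand a few of the checks a profile report covers."""
    return {
        # Data type of each column
        "dtypes": df.dtypes.astype(str).to_dict(),
        # Count of missing values per column
        "missing": df.isna().sum().to_dict(),
        # Number of rows that repeat an earlier row exactly
        "duplicates": int(df.duplicated().sum()),
        # Descriptive statistics for the numeric columns
        "describe": df.describe(),
        # Pairwise Pearson correlations between numeric columns
        "correlations": df.corr(numeric_only=True),
    }


# Hypothetical toy data: one missing age and one duplicated row.
df = pd.DataFrame(
    {
        "age": [25, 32, None, 32],
        "income": [40000, 52000, 61000, 52000],
        "city": ["Leeds", "York", "Leeds", "York"],
    }
)

summary = manual_summary(df)
print(summary["missing"])
print(summary["duplicates"])
```

Each of these calls covers one item from the list above. The report runs comparable checks for every column and adds the distributions, warnings and visualisations on top.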

Installation

You can install it via pip:

pip install ydata-profiling

If you’re using Jupyter notebooks, restart the kernel after installing so the new package can be imported.

Basic Usage with Pandas

import pandas as pd
from ydata_profiling import ProfileReport

# Example dataset
df = pd.read_csv("data.csv")

# Create a basic profile report
profile = ProfileReport(df)

# Display the report inline in Jupyter
profile.to_notebook_iframe()

Alternatively, export to an HTML file:

profile.to_file("data_profile.html")
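Outside a notebook, the same steps can be wrapped in a small script. The sketch below is one possible arrangement, not an official recipe: the helper names `report_path_for` and `profile_csv`, the `_profile.html` suffix, and the example path are all assumptions made for illustration.

```python
from pathlib import Path


def report_path_for(csv_path, out_dir=None):
    """Derive an HTML report path from a CSV path (hypothetical naming scheme)."""
    csv_path = Path(csv_path)
    # Default to writing the report next to the CSV file
    target_dir = Path(out_dir) if out_dir is not None else csv_path.parent
    return target_dir / f"{csv_path.stem}_profile.html"


def profile_csv(csv_path, out_dir=None):
    """Load a CSV, profile it, and write the HTML report to disk."""
    # Imported here so the path helper works even where the
    # profiling dependencies are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    output = report_path_for(csv_path, out_dir)
    ProfileReport(df, title=Path(csv_path).stem).to_file(output)
    return output


# Only the path helper runs here; call profile_csv("data.csv") on a real file.
print(report_path_for("data/customers.csv").name)
```

Keeping the path logic separate from the profiling call makes the naming easy to check without generating a full report.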

Key Options in `ProfileReport`

minimal - Creates a lighter, faster report by switching off the most expensive computations (such as correlations), e.g. `ProfileReport(df, minimal=True)`
explorative - Turns on a deeper, more detailed analysis than the default settings, e.g. `ProfileReport(df, explorative=True)`
title - Sets a custom report title, e.g. `ProfileReport(df, title="Customer Data Profiling")`
samples - Number of rows shown from the head and tail of the dataset in the report's sample section, e.g. `ProfileReport(df, samples={"head": 5, "tail": 5})`
correlations - Controls which correlation methods are calculated, e.g. `ProfileReport(df, correlations={"pearson": {"calculate": True}})`
config_file - Loads settings from a YAML config file for advanced setups, e.g. `ProfileReport(df, config_file="profile_config.yaml")`
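The options above can be combined in a single call. As a minimal sketch, the helper below picks a set of keyword arguments based on the number of rows. The `profile_kwargs` and `build_report` names and the 100,000-row threshold are arbitrary assumptions for illustration, not recommendations from the library.

```python
def profile_kwargs(n_rows, title, large_threshold=100_000):
    """Choose ProfileReport keyword arguments from the dataset size.

    The threshold is an arbitrary illustrative value, not a library default.
    """
    kwargs = {
        "title": title,
        "samples": {"head": 5, "tail": 5},
    }
    if n_rows > large_threshold:
        # Large dataset: use the lighter report
        kwargs["minimal"] = True
    else:
        # Smaller dataset: explicitly request Pearson correlations
        kwargs["correlations"] = {"pearson": {"calculate": True}}
    return kwargs


def build_report(df, title):
    """Create a ProfileReport using the size-based settings above."""
    # Imported here so profile_kwargs can be used on its own
    from ydata_profiling import ProfileReport

    return ProfileReport(df, **profile_kwargs(len(df), title))


print(profile_kwargs(500, "Customer Data Profiling"))
print(profile_kwargs(2_000_000, "Customer Data Profiling"))
```

Building the arguments as a plain dictionary keeps the choice of options visible and easy to adjust before any report is generated.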