Handle Numerical Variable Transformation

Feature-engine, a Python library for feature engineering

Published Oct 3 2025

Log Transformer

The LogTransformer applies the natural logarithm (base e) or the base 10 logarithm to numerical variables. Because the logarithm is only defined for positive numbers, the variables must not contain zero or negative values.

Use it to reduce skewness (normalise distributions). Many real-world variables, such as income, sales, or house prices, are positively skewed (long tail to the right). Applying the log compresses large values and stretches small ones, making the distribution more symmetric.
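To make the effect concrete, the sketch below measures skewness before and after taking the natural log of a small, made-up right-skewed sample. The `skewness` helper and the sample values are illustrative assumptions and are not part of Feature-engine.

```python
import math
import statistics


def skewness(values):
    """Population skewness: third central moment divided by std ** 3."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * std ** 3)


# Made-up, right-skewed sample (think of house prices in thousands).
prices = [1, 2, 3, 4, 5, 10, 20, 50, 100, 1000]

# The natural log compresses the large values far more than the small ones.
log_prices = [math.log(p) for p in prices]

skew_before = skewness(prices)
skew_after = skewness(log_prices)

print(f"skewness before log: {skew_before:.2f}")
print(f"skewness after log:  {skew_after:.2f}")
```

The skewness stays positive but drops noticeably, which is the "more symmetric" effect described above.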

Parameters:

base: 'e' for the natural logarithm or '10' for base 10. Defaults to 'e' if the parameter is omitted.
variables: the columns to transform. If omitted, the transformer is applied to all numerical variables.

from feature_engine.transformation import LogTransformer
logt = LogTransformer(variables=['LotArea', 'GrLivArea'])

Reciprocal Transformer

This technique applies the reciprocal transformation 1 / x to numerical variables.

Consider using it when your data is right-skewed (most values are small, with a few very large values). The reciprocal transformation pulls in large values and stretches out small ones, which can make the distribution more symmetric. It is also useful for ratios, that is, values resulting from the division of two variables. Because 1 / x is undefined at zero, the variable must not contain zeros.

from feature_engine.transformation import ReciprocalTransformer
tf = ReciprocalTransformer(variables="sqrfootpercar")
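The arithmetic behind the transformation can be sketched in plain Python. The values below are made up, and the interpretation of the ratio as "square feet per car" is an assumption for illustration; `gap_share` is a hypothetical helper that measures how much of the total range is taken up by the distance between the outlier and its nearest neighbour.

```python
def gap_share(values):
    """Share of the total range taken by the gap between the last two values."""
    return abs(values[-1] - values[-2]) / (max(values) - min(values))


# Made-up ratio values: area divided by number of cars, with one large outlier.
sqft_per_car = [150.0, 200.0, 250.0, 300.0, 2400.0]

# Reciprocal transformation: 1 / x turns "square feet per car"
# into "cars per square foot".
cars_per_sqft = [1.0 / v for v in sqft_per_car]

share_before = gap_share(sqft_per_car)
share_after = gap_share(cars_per_sqft)

print(f"outlier gap as share of range before: {share_before:.2f}")
print(f"outlier gap as share of range after:  {share_after:.2f}")
```

Note that the ordering flips: the largest original value becomes the smallest transformed value, which matters when interpreting the transformed feature.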

Power Transformer

The PowerTransformer applies power (exponential) transformations to numerical variables, raising each value to the exponent exp. As general guidance, if the data is right-skewed (i.e. more observations around lower values), use exp < 1. If the data is left-skewed (i.e. more observations around higher values), use exp > 1.

Parameters:

exp: the power (or exponent). The default is 0.5, which is the square root.

from feature_engine.transformation import PowerTransformer
pt = PowerTransformer(variables=['Col4'])
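The guidance on choosing the exponent can be checked with a small experiment. The sketch below is plain Python rather than Feature-engine code: `skewness` and `power_transform` are hypothetical helpers, and both samples are made up.

```python
import statistics


def skewness(values):
    """Population skewness: third central moment divided by std ** 3."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * std ** 3)


def power_transform(values, exp=0.5):
    """Raise every value to the power `exp` (0.5 is the square root)."""
    return [v ** exp for v in values]


# Made-up samples: a long right tail and a long left tail.
right_skewed = [1, 2, 3, 4, 5, 10, 20, 50, 100, 1000]
left_skewed = [1, 6, 8, 9, 9, 10, 10, 10, 10, 10]

# exp < 1 pulls in the right tail, exp > 1 stretches the upper end.
right_after = power_transform(right_skewed, exp=0.5)
left_after = power_transform(left_skewed, exp=2)

for name, before, after in [
    ("right-skewed, exp=0.5", right_skewed, right_after),
    ("left-skewed, exp=2", left_skewed, left_after),
]:
    print(f"{name}: {skewness(before):.2f} -> {skewness(after):.2f}")
```

In both cases the skewness moves towards zero without reaching it, so a single fixed exponent is a starting point rather than a guarantee of symmetry.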

Box-Cox Transformer

This transformer applies the Box-Cox transformation to numerical variables. Note that the data must be positive for this transformer.
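In its standard form, for a positive value $x$ and a transformation parameter $\lambda$, the Box-Cox transformation is defined as:

```latex
y(\lambda) =
\begin{cases}
\dfrac{x^{\lambda} - 1}{\lambda}, & \lambda \neq 0 \\[6pt]
\ln(x), & \lambda = 0
\end{cases}
```

With $\lambda = 1$ the values are only shifted by one, with $\lambda = 0.5$ the result behaves like a square root, and with $\lambda = 0$ it is the natural log.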

The Box-Cox transformation is used to reduce or eliminate variable skewness and obtain features that better approximate a normal distribution.

from feature_engine.transformation import BoxCoxTransformer
bct = BoxCoxTransformer(variables=['Col6'])
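To see what the transformation does for different values of lambda, here is a minimal plain-Python sketch of the standard Box-Cox definition: (x ** lambda - 1) / lambda when lambda is not zero, and the natural log of x when it is. The `box_cox` function and the sample values are illustrative assumptions. In practice lambda is estimated from the data rather than picked by hand; Feature-engine's documentation describes its transformer as a wrapper around `scipy.stats.boxcox`, which performs that estimation.

```python
import math


def box_cox(x, lmbda):
    """Box-Cox transform of a single positive value for a fixed lambda."""
    if x <= 0:
        raise ValueError("Box-Cox is only defined for strictly positive values")
    if lmbda == 0:
        return math.log(x)
    return (x ** lmbda - 1) / lmbda


# Made-up positive values with one large observation.
values = [1.0, 2.0, 5.0, 10.0, 100.0]

# lambda is chosen by hand here purely to show its effect.
for lmbda in (1, 0.5, 0, -1):
    transformed = [round(box_cox(v, lmbda), 3) for v in values]
    print(f"lambda={lmbda}: {transformed}")
```

Smaller values of lambda compress the large observation more strongly, which is why the estimated lambda tends to be low for heavily right-skewed variables.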

Yeo-Johnson Transformer

The Yeo-Johnson transformation is an extension of the Box-Cox transformation that is no longer constrained to positive values. In other words, the Yeo-Johnson transformation can be used on variables with zero and negative values as well as positive values. Its formula is:
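For a value $x$ and a transformation parameter $\lambda$, the standard piecewise definition of the Yeo-Johnson transformation is:

```latex
y(\lambda) =
\begin{cases}
\dfrac{(x + 1)^{\lambda} - 1}{\lambda}, & x \geq 0,\ \lambda \neq 0 \\[6pt]
\ln(x + 1), & x \geq 0,\ \lambda = 0 \\[6pt]
-\dfrac{(-x + 1)^{2 - \lambda} - 1}{2 - \lambda}, & x < 0,\ \lambda \neq 2 \\[6pt]
-\ln(-x + 1), & x < 0,\ \lambda = 2
\end{cases}
```

For non-negative values this is the Box-Cox transformation applied to $x + 1$, which is what allows zeros to be handled; the two remaining branches mirror the same idea for negative values.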