lowess

LOWESS (Locally Weighted Scatterplot Smoothing) for robust nonparametric regression.

smooth.lowess(x, y=None, f=0.6666666666666666, iter=3, delta=None)

LOWESS smoother that exactly matches R’s stats::lowess function.

Performs locally weighted polynomial regression using Cleveland’s LOWESS algorithm.

Parameters:

x (array-like) – The x values for the data. Can also be a 2D array or list with two elements, in which case the first column/element is used as x and second as y.
y (array-like, optional) – The y values for the data. If None, x must contain both x and y values as a 2D array or a list/tuple of two arrays.
f (float, default=2/3) – The smoother span. This gives the proportion of points in the plot which influence the smooth at each value. Larger values give more smoothness.
iter (int, default=3) – The number of robustifying iterations which should be performed. Using smaller values of iter will make lowess run faster.
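delta (float, optional) – Distance threshold used to speed up computation. The local fit is skipped for points within delta of the last fitted point, and their smoothed values are filled in by linear interpolation. If None (default), uses 0.01 * (max(x) - min(x)), matching R’s default.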

Returns:

A dictionary with two keys, matching R’s lowess output:

‘x’: The sorted x values.
‘y’: The smoothed y values corresponding to the sorted x values.

Return type:

dict

Notes

This function is a direct port of R’s stats::lowess, which implements Cleveland’s (1979) LOWESS algorithm. The algorithm uses locally weighted polynomial regression with a tricube weight function and iterative reweighting for robustness to outliers.

The function returns results in the same format as R’s lowess: a list/dict with ‘x’ and ‘y’ components, where x is sorted and y contains the corresponding smoothed values.

References

Cleveland, W.S. (1979) “Robust Locally Weighted Regression and Smoothing Scatterplots”. Journal of the American Statistical Association 74(368): 829-836.

Overview

LOWESS is a nonparametric regression method that combines polynomial regression with local weighting. It is particularly useful for:

Smoothing noisy data while preserving local patterns
Robust estimation that is resistant to outliers
Exploratory data analysis to reveal underlying trends

The implementation exactly matches R’s stats::lowess function, ensuring reproducibility across R and Python workflows.

Example Usage

Basic smoothing:

from smooth import lowess
import numpy as np

# Generate noisy data (seeded so the example is reproducible)
rng = np.random.default_rng(0)
x = np.linspace(0, 2*np.pi, 50)
y = np.sin(x) + rng.normal(scale=0.3, size=50)

# Apply LOWESS smoothing
result = lowess(x, y)

# Access smoothed values
x_smooth = result['x']  # Sorted x values
y_smooth = result['y']  # Smoothed y values

Adjusting smoothness:

# More smoothing (larger span)
result_smooth = lowess(x, y, f=0.8)

# Less smoothing (smaller span)
result_rough = lowess(x, y, f=0.2)

Handling outliers:

# Add outliers
y_outliers = y.copy()
y_outliers[10] = 5  # Outlier

# LOWESS is robust to outliers due to iterative reweighting
result = lowess(x, y_outliers, iter=3)  # Default iterations

# More iterations for heavily contaminated data
result_robust = lowess(x, y_outliers, iter=5)

Using 2D input (R-style):

# Combine x and y into 2D array
xy = np.column_stack([x, y])

# Call with single argument
result = lowess(xy)

Parameters

x (array-like): X values. Can be a 1D array, or a 2D array with x in the first column and y in the second. (required)
y (array-like, optional): Y values. May be omitted if x is a 2D array containing both x and y. Default: None
f (float, optional): Smoother span (fraction of points used for each local fit). Larger values give a smoother curve. Default: 2/3
iter (int, optional): Number of robustifying iterations. More iterations give greater resistance to outliers but take longer. Default: 3
delta (float, optional): Distance threshold for interpolation. Points within delta of the last fitted point are interpolated rather than fitted. Default: 0.01 * (max(x) - min(x))
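
To make the f and delta defaults concrete, here is a minimal sketch that computes the neighbourhood size and the default delta for a given x. The helper names neighbourhood_size and default_delta are hypothetical and not part of smooth. The rounding rule (roughly floor(f * n), with at least two points) is an assumption based on how R describes its lowess, not a statement about this package's internals.

```python
def neighbourhood_size(n, f=2/3):
    """Number of nearest points used for each local fit (assumed rule)."""
    # Roughly floor(f * n), clamped to the range [2, n]
    return max(2, min(n, int(f * n + 1e-7)))


def default_delta(x):
    """Default delta: 1% of the range of x."""
    return 0.01 * (max(x) - min(x))


x = [0.0, 2.5, 5.0, 7.5, 10.0]
print(neighbourhood_size(len(x)))  # 3 of the 5 points are used when f = 2/3
print(default_delta(x))            # 1% of the range, which is 10 here
```

With only a handful of points, a small f still yields a neighbourhood of two points under this rule, since a line cannot be fitted through fewer.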

Returns

The function returns a dictionary with two keys:

x (ndarray): Sorted x values.
y (ndarray): Smoothed y values corresponding to sorted x.

Algorithm

LOWESS uses Cleveland’s (1979) algorithm:

Local Fitting: At each point, fit a weighted linear regression using nearby points. Weights decrease with distance using a tricube function.
Robustness Iterations: Recompute weights based on residuals to downweight outliers. Repeat iter times.
Interpolation: For efficiency, only compute fits at a subset of points and interpolate between them (controlled by delta).
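
The interpolation step can be sketched in isolation. The function below is a simplified illustration, not the code used by smooth: fit_with_delta and the fit_at callback are hypothetical names, and the skipping rule (jump to the last point within delta of the previous anchor, always advancing by at least one point) is an assumption modelled on R's description of delta.

```python
def fit_with_delta(xs, fit_at, delta):
    """Run the full fit only at selected anchor points; interpolate between them.

    xs must be sorted; fit_at(i) returns the local fit at xs[i].
    """
    n = len(xs)
    fitted = [None] * n
    fitted[0] = fit_at(0)
    last = 0
    while last < n - 1:
        # Find the last point within delta of the previous anchor
        j = last + 1
        while j < n and xs[j] <= xs[last] + delta:
            j += 1
        nxt = max(last + 1, j - 1)  # always advance by at least one point
        fitted[nxt] = fit_at(nxt)
        # Linearly interpolate the skipped points between the two anchors
        width = xs[nxt] - xs[last]
        for m in range(last + 1, nxt):
            t = (xs[m] - xs[last]) / width if width > 0 else 0.0
            fitted[m] = fitted[last] + t * (fitted[nxt] - fitted[last])
        last = nxt
    return fitted


xs = [0.0, 0.1, 0.2, 1.0, 1.05, 2.0]
calls = []

def fit_at(i):
    calls.append(i)      # record which points get a full local fit
    return 2.0 * xs[i]   # stand-in for the expensive weighted regression

smoothed = fit_with_delta(xs, fit_at, delta=0.25)
print(calls)  # index 1 is skipped and filled in by interpolation
```

Setting delta to 0 makes every point an anchor, so the full fit runs at each point.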

The tricube weight function is:

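\[w(u) = \begin{cases} (1 - |u|^3)^3 & \text{for } |u| < 1 \\ 0 & \text{otherwise} \end{cases}\]

where u is the distance from the point being fitted, divided by the distance to the farthest point in its local neighbourhood.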
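
To show how the tricube weights feed into the local fitting and robustness steps, here is a simplified pure-Python sketch. It is not the implementation behind smooth.lowess and will not reproduce R's output exactly: lowess_sketch and its helpers are hypothetical names, the delta interpolation is omitted, and the robustness weights (a bisquare function with a cutoff of six times the median absolute residual) are an assumption taken from R's description of the algorithm. The iterations argument plays the role of iter.

```python
from statistics import median


def tricube(u):
    u = abs(u)
    return (1.0 - u**3) ** 3 if u < 1.0 else 0.0


def bisquare(u):
    u = abs(u)
    return (1.0 - u**2) ** 2 if u < 1.0 else 0.0


def local_fit(xs, ys, x0, k, robust_w):
    """Weighted linear fit at x0 using the k nearest points."""
    h = sorted(abs(xi - x0) for xi in xs)[k - 1]  # neighbourhood bandwidth
    w = []
    for xi, rw in zip(xs, robust_w):
        d = abs(xi - x0)
        near = tricube(d / h) if h > 0 else float(d == 0)
        w.append(near * rw)
    total = sum(w)
    if total <= 0:
        return None  # no usable neighbours; the caller keeps the old value
    w = [wi / total for wi in w]
    xbar = sum(wi * xi for wi, xi in zip(w, xs))
    ybar = sum(wi * yi for wi, yi in zip(w, ys))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, xs))
    if sxx ** 0.5 <= 0.001 * (xs[-1] - xs[0]):
        return ybar  # x values too concentrated: fall back to a local mean
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, xs, ys))
    return ybar + (sxy / sxx) * (x0 - xbar)


def lowess_sketch(x, y, f=2/3, iterations=3):
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [float(x[i]) for i in order]
    ys = [float(y[i]) for i in order]
    n = len(xs)
    k = max(2, min(n, int(f * n + 1e-7)))  # assumed neighbourhood size
    robust_w = [1.0] * n
    fitted = ys[:]
    for it in range(iterations + 1):
        for i in range(n):
            value = local_fit(xs, ys, xs[i], k, robust_w)
            if value is not None:
                fitted[i] = value
        if it == iterations:
            break
        # Downweight points with large residuals before the next pass
        resid = [yi - fi for yi, fi in zip(ys, fitted)]
        s = median(abs(r) for r in resid)
        if s == 0:
            break  # the fit is already exact for most points
        robust_w = [bisquare(r / (6.0 * s)) for r in resid]
    return {"x": xs, "y": fitted}


# Noisy line with one gross outlier in the middle
x = list(range(41))
truth = [2.0 * xi + 1.0 for xi in x]
y = [t + (0.5 if i % 2 == 0 else -0.5) for i, t in enumerate(truth)]
y[20] += 30.0

plain = lowess_sketch(x, y, f=0.25, iterations=0)
robust = lowess_sketch(x, y, f=0.25, iterations=3)
print(plain["y"][20] - truth[20], robust["y"][20] - truth[20])
```

Without robustness iterations the fit at the outlier is pulled several units away from the underlying line. With them, the outlier receives zero weight after the first pass and the fit returns to within the noise level.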
