smooth.msdecompose

smooth.msdecompose(y, lags=[12], type='additive', smoother='lowess')

Multiple seasonal decomposition of time series with multiple frequencies.

This function performs classical seasonal decomposition for time series with multiple seasonal patterns (e.g., hourly data with daily and weekly seasonality, or daily data with weekly and yearly patterns). It extends the classical approach, which handles a single seasonal period, to extract several seasonal components simultaneously.

The decomposition separates the time series into:

  • Trend: Long-term movement (captured via smoothing)

  • Seasonal components: One for each seasonal period in lags

  • Remainder (not explicitly returned but implied): y - trend - seasonals

Decomposition Method:

For additive decomposition:

\[y_t = \text{Trend}_t + \sum_i \text{Seasonal}_i(t) + \epsilon_t\]

For multiplicative decomposition:

\[y_t = \text{Trend}_t \times \prod_i \text{Seasonal}_i(t) \times \epsilon_t\]
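Taking logs turns the multiplicative form into the additive one, which is why the function can reuse the additive machinery internally (see Algorithm Steps below). A minimal numpy check with made-up components (illustrative only, not the function's internals):

>>> import numpy as np
>>> t = np.arange(48)
>>> trend = 100 + 0.5 * t                            # made-up trend
>>> seasonal = 1 + 0.1 * np.sin(2 * np.pi * t / 12)  # made-up pattern around 1
>>> y = trend * seasonal                             # noise-free multiplicative series
>>> np.allclose(np.log(y), np.log(trend) + np.log(seasonal))
True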

Algorithm Steps:

  1. Log Transform (if multiplicative): Apply log to convert to additive form.

  2. Missing Value Imputation: Fill NaN using polynomial + Fourier regression.

  3. Iterative Smoothing: For each lag period (sorted ascending), apply smoother with window = lag period, extract seasonal pattern, remove seasonal mean.

  4. Trend Extraction: Final smoothed series is the trend.

  5. Initial States: Compute level and slope from trend for model initialization.
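The loop in steps 3–4 can be sketched roughly as follows. This is an illustrative simplification only: it uses plain moving averages in place of the configurable smoother and skips the log transform, imputation, and state extraction of steps 1–2 and 5; msdecompose_sketch is a hypothetical helper, not the library function.

import numpy as np

def msdecompose_sketch(y, lags):
    """Illustrative-only version of steps 3-4: peel off each seasonal pattern in turn."""
    y = np.asarray(y, dtype=float)
    deseasonalised = y.copy()
    seasonals = []
    for lag in sorted(lags):
        # smooth with window = lag so that this period's seasonal wiggles average out
        smooth_part = np.convolve(deseasonalised, np.ones(lag) / lag, mode="same")
        detrended = deseasonalised - smooth_part
        # average over each position of the cycle, then centre the pattern at zero
        pattern = np.array([detrended[i::lag].mean() for i in range(lag)])
        pattern -= pattern.mean()
        seasonals.append(pattern)
        deseasonalised -= np.tile(pattern, len(y) // lag + 1)[:len(y)]
    # whatever remains after removing all seasonal patterns is smoothed into the trend
    trend = np.convolve(deseasonalised, np.ones(max(lags)) / max(lags), mode="same")
    return trend, seasonals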

Smoother Types:

  • “ma”: Moving average with window = lag period. Fast but less flexible.

  • “lowess” (default): LOWESS smoothing. Robust to outliers.

  • “supsmu”: Friedman’s super smoother (uses LOWESS in Python).

  • “global”: Global linear regression with intercept and deterministic trend.
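For a concrete sense of these options, the sketch below applies the “ma”, “lowess”, and “global” ideas to the same toy series. It uses statsmodels' lowess and is an illustration under assumptions, not the package's internal code; the span rule actually used is described under Notes.

import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(42)
y = 100 + 0.5 * np.arange(120) + rng.normal(0, 5, 120)   # noisy linear trend
lag = 12

# "ma": centred moving average with window = lag
ma_trend = np.convolve(y, np.ones(lag) / lag, mode="same")

# "lowess": locally weighted regression with a span of roughly 1/lag
lo_trend = lowess(y, np.arange(len(y)), frac=max(1 / lag, 3 / len(y)),
                  return_sorted=False)

# "global": ordinary least squares on an intercept and a deterministic trend
X = np.column_stack([np.ones(len(y)), np.arange(len(y))])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
global_trend = X @ beta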

Parameters:
  • y (array-like) – Time series data to decompose. Can contain NaN values (will be imputed). Shape: (T,) where T is the number of observations.

  • lags (list or array, default=[12]) –

    Seasonal periods to extract. Examples:

    • [12]: Monthly data with yearly seasonality

    • [24]: Hourly data with daily seasonality

    • [7, 365]: Daily data with weekly and (approximate) yearly seasonality

    • [24, 168]: Hourly data with daily (24h) and weekly (7×24=168h) patterns

    Must contain positive integers. Lags are sorted automatically.

  • type (str, default="additive") –

    Decomposition type:

    • “additive”: Components are summed (for stable seasonality)

    • “multiplicative”: Components are multiplied (for proportional seasonality, requires y > 0)

  • smoother (str, default="lowess") –

    Smoothing method for trend and seasonal extraction:

    • “lowess”: LOWESS with adaptive span (recommended, default)

    • “supsmu”: Super smoother (uses LOWESS in Python)

    • “ma”: Simple moving average (faster but less robust)

    • “global”: Global linear regression (straight-line fit)

Returns:

Dictionary containing decomposition results with keys:

  • 'states': ndarray of shape (T, n_states) with level, trend, and seasonal states

  • 'initial': dict with 'nonseasonal' and 'seasonal' initial values

  • 'trend': ndarray of shape (T,) with the trend component

  • 'seasonal': list of ndarrays, one per lag, each centered at 0

  • 'component': list of component descriptions

  • 'lags': ndarray of sorted unique lag periods

  • 'type': str, 'additive' or 'multiplicative'

Return type:

dict

Raises:
  • ValueError – If type is not 'additive' or 'multiplicative', or if smoother is not one of 'ma', 'lowess', 'supsmu', 'global'

  • ImportError – If smoother='lowess' or 'supsmu' but statsmodels is not installed

Notes

Missing Values:

NaN values are automatically imputed using a regression model:

\[\hat{y}_t = \sum_{k=0}^d \beta_k t^k + \sum_{j=1}^m \alpha_j \sin(\pi t j / m)\]

where d is polynomial degree (up to 5) and m is the maximum lag. This preserves trend and seasonal structure during imputation.
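A minimal sketch of this kind of imputation, mirroring the formula above with ordinary least squares (impute_sketch is a hypothetical helper; the actual degree selection and design matrix may differ):

import numpy as np

def impute_sketch(y, m, degree=3):
    """Fill NaNs with a polynomial-plus-sine regression (illustrative only)."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    # columns: t^0 .. t^degree, then sin(pi * t * j / m) for j = 1..m
    X = np.column_stack([t**k for k in range(degree + 1)]
                        + [np.sin(np.pi * t * j / m) for j in range(1, m + 1)])
    mask = ~np.isnan(y)
    beta, *_ = np.linalg.lstsq(X[mask], y[mask], rcond=None)
    y_filled = y.copy()
    y_filled[~mask] = X[~mask] @ beta
    return y_filled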

Multiplicative Decomposition:

Requires strictly positive data. If y ≤ 0, those values are treated as missing. Internally works on log(y), then exponentiates results.

Smoother Span Selection:

For LOWESS, span (bandwidth) is automatically selected based on lag period:

  • For lag = 1: span = 2/3 (R’s default)

  • For lag = T: span = 2/3

  • Otherwise: span = 1 / lag

  • Minimum span: 3 / T (ensures smoothness)
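The rules above amount to a small helper along these lines (a sketch of the described behaviour, not necessarily the exact source):

def lowess_span(lag, T):
    """Span (frac) for LOWESS, following the rules described above."""
    if lag == 1 or lag == T:
        span = 2 / 3            # R's default lowess span
    else:
        span = 1 / lag
    return max(span, 3 / T)     # never drop below roughly three observations' worth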

Seasonal Centering:

Each seasonal pattern is centered to have mean zero. This ensures identifiability: trend captures the level, seasonals capture deviations.

Performance:

  • Moving average: Very fast (~1ms for T=1000)

  • LOWESS: Moderate (~10-50ms depending on T)

  • Multiple lags: Time scales linearly with number of lags

Use in ADAM:

The decomposition is used for initial state estimation when initial="backcasting" or when the model includes seasonal components. The extracted states provide reasonable starting values for the level, trend, and seasonal components.

Comparison to STL:

Unlike STL (Seasonal-Trend decomposition using Loess), which handles only one seasonal period, msdecompose handles multiple seasonal periods by iteratively removing each seasonal component.

See also

creator

Uses msdecompose results for initial state estimation

initialiser

May use decomposition results for parameter initialization

Examples

Decompose monthly data with yearly seasonality:

>>> import numpy as np
>>> y = np.array([112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118,
...               115, 126, 141, 135, 125, 149, 170, 170, 158, 133, 114, 140])
>>> result = msdecompose(y, lags=[12], type='additive', smoother='lowess')
>>> print(result['trend'])  # Trend component
>>> print(result['seasonal'][0])  # Yearly seasonal pattern
>>> print(result['initial']['nonseasonal']['level'])  # Initial level
>>> print(result['initial']['nonseasonal']['trend'])  # Initial trend
>>> print(result['initial']['seasonal'][0])  # First 12 seasonal values

Decompose hourly data with daily and weekly seasonality:

>>> hourly_data = np.random.randn(24 * 7 * 4)  # 4 weeks of hourly data
>>> result = msdecompose(hourly_data, lags=[24, 168],  # 24h and 7*24h
...                      type='additive', smoother='lowess')
>>> daily_pattern = result['seasonal'][0]  # 24-hour pattern
>>> weekly_pattern = result['seasonal'][1]  # Weekly pattern

Multiplicative decomposition for positive data:

>>> sales = np.array([100, 120, 150, 140, 130, 160, 200, 210, 180, 140, 110,
...                    130])
>>> result = msdecompose(sales, lags=[12], type='multiplicative')
>>> # Seasonality proportional to level

Use decomposition for ADAM initialization:

>>> result = msdecompose(y, lags=[12], type='additive')
>>> initial_level = result['initial']['nonseasonal']['level']
>>> initial_trend = result['initial']['nonseasonal']['trend']
>>> initial_seasonal = result['initial']['seasonal'][0]  # First 12 values
>>> # Pass to ADAM's initials parameter