Description
When using TimesFm.forecast_with_covariates() to forecast a time series with only dynamic numerical covariates, the forecast result for a given input changes depending on whether it is passed alone or as part of a batch with another input.
This violates forecast isolation and breaks backtesting workflows, as the same input yields inconsistent outputs depending on which other inputs are batched with it. This may also be the cause of #262.
Environment
- Model: google/timesfm-2.0-500m-pytorch
- Interface: forecast_with_covariates
- Mode: xreg + timesfm
- Covariates: dynamic numerical covariates alone are enough to reproduce
Minimal Working Example
import numpy as np
import timesfm

model = timesfm.TimesFm(
    hparams=timesfm.TimesFmHparams(
        backend='cpu',
        per_core_batch_size=32,
        horizon_len=128,
        num_layers=50,
        use_positional_embedding=False,
        context_len=512,
    ),
    checkpoint=timesfm.TimesFmCheckpoint(
        huggingface_repo_id="google/timesfm-2.0-500m-pytorch"
    ),
)

context_len = 120
horizon_len = 24
total_len = context_len + horizon_len

# Two independent synthetic series, each with a noisy copy as its covariate
np.random.seed(42)
ts1 = np.linspace(10, 20, total_len) + np.random.normal(0, 0.1, total_len)
ts2 = np.linspace(0, 10, total_len) + np.random.normal(0, 0.1, total_len)
cov1 = ts1 + np.random.normal(0, 0.2, total_len)
cov2 = ts2 + np.random.normal(0, 0.2, total_len)

# Forecast ts1 on its own
out_single, _ = model.forecast_with_covariates(
    inputs=[ts1[:context_len].tolist()],
    dynamic_numerical_covariates={"gen_forecast": [cov1[:total_len]]},
    dynamic_categorical_covariates={},
    static_numerical_covariates={},
    static_categorical_covariates={},
    freq=[0],
    xreg_mode="xreg + timesfm",
)

# Forecast ts1 batched with ts2
out_batch, _ = model.forecast_with_covariates(
    inputs=[ts1[:context_len].tolist(), ts2[:context_len].tolist()],
    dynamic_numerical_covariates={
        "gen_forecast": [cov1[:total_len], cov2[:total_len]]
    },
    dynamic_categorical_covariates={},
    static_numerical_covariates={},
    static_categorical_covariates={},
    freq=[0, 0],
    xreg_mode="xreg + timesfm",
)

# The ts1 forecast should be identical in both calls
diff = np.abs(np.array(out_single[0]) - np.array(out_batch[0]))
max_diff = np.max(diff)
print("Max difference between single and batched forecast for ts1:", max_diff)
assert np.allclose(out_single[0], out_batch[0], atol=1e-6), "Forecast changed when batched!"
Hypothesis
The issue likely originates in xreg_lib.BatchedInContextXRegLinear.fit() or its covariate preprocessing logic, where:
- Features may be transformed or scaled using batch-wide stats
- Linear model coefficients are affected by the presence of additional examples, even when inputs are fully independent
- The model is expected to fit independently per input in the batch, but does not in practice
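To illustrate the second point of the hypothesis, here is a toy NumPy sketch. It is not the xreg_lib code: it assumes the regression is a plain least-squares fit with an intercept, and the helper name fit_linear and the synthetic data are made up for illustration. It compares fitting one series alone against pooling two independent series into a single fit.

```python
import numpy as np


def fit_linear(x, y):
    """Least-squares fit of y ~ a * x + b; returns the array [a, b]."""
    design = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


rng = np.random.default_rng(0)
n = 120

# Two unrelated series with different covariate-to-target relationships
x1 = np.linspace(10, 20, n)
y1 = 2.0 * x1 + 1.0 + rng.normal(0, 0.1, n)
x2 = np.linspace(0, 10, n)
y2 = -0.5 * x2 + 3.0 + rng.normal(0, 0.1, n)

# Series 1 fitted on its own
coef_alone = fit_linear(x1, y1)

# Both series stacked into one fit (hypothetical "pooled" behaviour)
coef_pooled = fit_linear(np.concatenate([x1, x2]), np.concatenate([y1, y2]))

# One fit per series (the behaviour the report expects)
coef_per_series = [fit_linear(x, y) for x, y in [(x1, y1), (x2, y2)]]

print("series 1 alone:     ", coef_alone)
print("pooled with series 2:", coef_pooled)
print("series 1 per-series: ", coef_per_series[0])
```

In this sketch the pooled coefficients differ from the ones fitted on series 1 alone, so any forecast built from them would change with the batch contents, while the per-series fit is unaffected by the second series.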
Suggested Fix
Ensure per-input independence during the entire covariate preprocessing and linear model fitting pipeline, e.g., by fitting each xreg model independently, or scoping transforms (normalisation, encoding) strictly per sample.
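Until a fix lands, one possible workaround is to call the forecast function once per input, so that no other series can influence the fit. The sketch below is a hypothetical helper, not part of timesfm: the name forecast_one_at_a_time is made up, and it only assumes the keyword arguments inputs, dynamic_numerical_covariates and freq used in the example above, plus a return value whose first element is the list of point forecasts.

```python
def forecast_one_at_a_time(forecast_fn, inputs, dynamic_numerical_covariates,
                           freq, **kwargs):
    """Call forecast_fn separately for each input and collect the forecasts.

    forecast_fn is expected to accept the keyword arguments used below and
    to return a tuple whose first element is a list of point forecasts.
    Any extra keyword arguments are forwarded unchanged to every call.
    """
    outputs = []
    for i, series in enumerate(inputs):
        # Keep only the i-th covariate sequence for each covariate name
        covariates_i = {
            name: [values[i]]
            for name, values in dynamic_numerical_covariates.items()
        }
        point_forecasts, _ = forecast_fn(
            inputs=[series],
            dynamic_numerical_covariates=covariates_i,
            freq=[freq[i]],
            **kwargs,
        )
        outputs.append(point_forecasts[0])
    return outputs
```

With the model from the example above, this could be called as forecast_one_at_a_time(model.forecast_with_covariates, inputs, covariates, freq, xreg_mode="xreg + timesfm"), with the remaining covariate dictionaries passed as extra keyword arguments. It gives up batching throughput in exchange for isolation, so it is a stopgap rather than a fix.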
Happy to help further debug or test potential fixes. Thanks for your work on this amazing model!
Christoph