Mean reversion – the idea that prices eventually return to an average value – is a classic trading concept. Traders often use tools like moving averages and Bollinger Bands to identify potential turning points. But what if the “mean” itself isn’t static or best represented by a simple average? What if we could use a more adaptive, data-driven curve as our baseline and trade deviations from it?
This is where Kernel Regression enters the picture, offering a sophisticated, non-parametric way to smooth price data and derive a dynamic “fair value” line. When combined with adaptive bandwidth selection, like Silverman’s Rule of Thumb, it forms the basis of an intriguing strategy: the Dynamic Kernel Bandwidth Reversion. Let’s explore how this works and what a preliminary backtest might look like.
At its heart, kernel regression doesn’t assume a specific shape (like a straight line or a fixed-degree polynomial) for the underlying trend in prices. Instead, it builds a smooth curve by looking at local “neighborhoods” of data points. The influence of each data point is weighted by a kernel function (often Gaussian, like a bell curve), and a critical parameter called bandwidth (h) controls how wide this neighborhood is, and thus, how smooth the resulting curve will be.
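The estimator at work here is the classic Nadaraya-Watson form: each fitted value is a kernel-weighted average of nearby prices. A minimal NumPy sketch (function and variable names are illustrative, not taken from the strategy's script):

```python
import numpy as np

def nadaraya_watson(x_eval, x_data, y_data, h):
    """Gaussian-kernel (Nadaraya-Watson) regression evaluated at x_eval.

    Each observation's weight falls off with its distance from the
    evaluation point; the bandwidth h controls how fast it falls off,
    and therefore how smooth the fitted curve is.
    """
    x_eval = np.asarray(x_eval, dtype=float)[:, None]      # shape (m, 1)
    x_data = np.asarray(x_data, dtype=float)[None, :]      # shape (1, n)
    weights = np.exp(-0.5 * ((x_eval - x_data) / h) ** 2)  # Gaussian kernel
    return (weights * np.asarray(y_data)).sum(axis=1) / weights.sum(axis=1)

# A wide bandwidth gives a smoother curve; a narrow one hugs the data.
t = np.arange(50, dtype=float)
prices = 100 + np.cumsum(np.random.default_rng(0).normal(0, 0.5, 50))
smooth = nadaraya_watson(t, t, prices, h=5.0)
wiggly = nadaraya_watson(t, t, prices, h=1.0)
```

Note how the only modeling choice is the kernel and its bandwidth: no functional form for the trend is ever assumed.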
A key innovation in this strategy is making the bandwidth dynamic. Instead of picking a fixed h, we can use methods like Silverman's Rule of Thumb. This rule estimates an optimal bandwidth from the standard deviation and the number of data points within a rolling window, allowing the smoothness of our kernel curve to adapt as market volatility changes.
Here’s a glimpse of how the kernel curve might be calculated in a rolling fashion, with Silverman’s rule determining the bandwidth for each window:
```python
# Excerpt from the calculate_kernel_curve_and_bands function
# (inside a loop iterating through historical data)
# window_data = df_calc['Close'].iloc[i - KERNEL_REG_WINDOW + 1 : i + 1]
# exog_time   = np.arange(len(window_data))
# endog_price = window_data.values

# Silverman's Rule of Thumb for the bandwidth h
n = len(endog_price)
sigma = np.std(endog_price)
if sigma == 0 or n < 2:
    # Edge cases: flat prices or too few points -- fall back rather than fit.
    # (The full script's fallback logic is more involved.)
    kernel_curve_values.append(endog_price[-1] if sigma == 0 else np.nan)
    continue

silverman_bw_val = 1.06 * sigma * (n ** (-0.2))  # 1.06 * sigma * n^(-1/5)

# Keep the bandwidth above a practical floor
min_practical_bw = 0.1  # heuristic minimum
bw_to_use = [max(silverman_bw_val, min_practical_bw)]

try:
    kern_reg = KernelReg(endog=endog_price, exog=exog_time,
                         var_type='c', reg_type='lc', bw=bw_to_use)
    pred_results = kern_reg.fit(exog_time)
    kernel_curve_values.append(pred_results[0][-1])  # last point of the fitted curve
except Exception:
    kernel_curve_values.append(np.nan)
```
This dynamic kernel curve becomes our adaptive baseline, representing a data-driven “fair value.”
Once the kernel regression curve is established, the strategy builds trading bands around it:
- **Middle line:** the `Kernel_Curve` itself.
- **Upper and lower bands:** `Kernel_Curve +/- (Multiplier * Rolling_StdDev_of_Residuals)`. The residuals are simply `Price - Kernel_Curve`, so these bands measure how far the price typically deviates from its smoothed version.

The trading rules follow from the bands:

- **Long entry:** price closes below the `Lower_Band`. This suggests the price is oversold relative to its dynamic kernel mean and is expected to revert upwards.
- **Short entry:** price closes above the `Upper_Band`. This suggests an overbought condition, with an expectation of a downward reversion.
- **Exit:** price crosses back to the `Kernel_Curve`, signaling the mean reversion has occurred.

The entry logic can be seen in this snippet from the backtesting function:
```python
# Excerpt from the run_backtest function
# (inside a loop, after fetching the previous day's close and band values)
# close_prev      = df['Close'].iloc[i - 1]
# lower_band_prev = df['Lower_Band'].iloc[i - 1]
# upper_band_prev = df['Upper_Band'].iloc[i - 1]

if pd.notna(lower_band_prev) and close_prev < lower_band_prev:
    df.loc[df.index[i], 'Signal'] = 1   # long reversion entry today
elif pd.notna(upper_band_prev) and close_prev > upper_band_prev:
    df.loc[df.index[i], 'Signal'] = -1  # short reversion entry today
```
Trades are typically entered at the open of the day following the signal.
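Putting the band construction and entry logic together, here is a compact, vectorized sketch. It assumes a DataFrame that already carries a precomputed `Kernel_Curve` column; the window and multiplier defaults are illustrative, not the strategy's tuned values:

```python
import numpy as np
import pandas as pd

def add_bands_and_signals(df, resid_window=20, band_multiplier=2.0):
    """Build deviation bands around a precomputed Kernel_Curve and
    flag mean-reversion entries against the previous day's close."""
    out = df.copy()
    resid = out['Close'] - out['Kernel_Curve']          # deviation from "fair value"
    resid_std = resid.rolling(resid_window).std()
    out['Upper_Band'] = out['Kernel_Curve'] + band_multiplier * resid_std
    out['Lower_Band'] = out['Kernel_Curve'] - band_multiplier * resid_std

    # The signal compares yesterday's close with yesterday's bands,
    # so the trade can be entered at today's open.
    close_prev = out['Close'].shift(1)
    lower_prev = out['Lower_Band'].shift(1)
    upper_prev = out['Upper_Band'].shift(1)
    out['Signal'] = 0
    out.loc[close_prev < lower_prev, 'Signal'] = 1    # long reversion
    out.loc[close_prev > upper_prev, 'Signal'] = -1   # short reversion
    return out
```

Comparisons against NaN bands (during the initial warm-up window) evaluate to False, so no signals fire before the rolling statistics are available.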
A backtest of this strategy on `EURUSD=X` from January 2020 to December 2024 (with a `KERNEL_REG_WINDOW` of 40 days and a `BAND_MULTIPLIER` of 2.0) produced:

- Strategy return: 104.18%
- Buy-and-hold return: 102.30%
- Win rate: 62.50%
- Average return per trade: +0.19%
These specific results suggest a slight outperformance over buy-and-hold for this particular asset and period, with a respectable win rate characteristic of many mean-reversion systems that aim for smaller, more frequent gains. It’s crucial to remember this is just one test; performance can vary significantly with different assets, timeframes, and parameter settings.
The appeal of a Dynamic Kernel Bandwidth Reversion strategy lies in its adaptiveness: both the baseline and the bands adjust as market conditions change, rather than relying on a static average. This sophistication, however, comes with its own set of complexities:
- **Parameter sensitivity:** `KERNEL_REG_WINDOW`, `BAND_RESIDUAL_STDDEV_WINDOW`, and `BAND_MULTIPLIER` are critical tuning knobs.
- **Bandwidth selection:** Silverman's Rule, while data-driven, is also a heuristic, and its underlying assumptions (like near-normality) may not always hold for financial data. More rigorous methods (e.g., cross-validation such as `cv_ls`, often available in libraries like `statsmodels`) could yield different results but add further computation. The original prompt even hinted at "ML-tuned bandwidth," a highly advanced technique involving optimizing the bandwidth with machine learning against a performance metric.
- **Band width:** choosing the `BAND_MULTIPLIER` for the deviation bands is key. Too narrow, and you get too many false signals; too wide, and you miss opportunities.

The Dynamic Kernel Bandwidth Reversion strategy is a rich area for experimentation:
- **Alternative bandwidth selection:** try cross-validation (`cv_ls` in `statsmodels`) or explore research on ML-based bandwidth tuning.
- **Adaptive band width:** adjust the `BAND_MULTIPLIER` dynamically based on a broader market volatility measure like the VIX.

The Dynamic Kernel Bandwidth Reversion strategy offers an intriguing, adaptive approach to mean-reversion trading. By moving beyond static averages and embracing non-parametric smoothing with dynamic bandwidths, it seeks to better capture the ebb and flow of market prices. While computationally more demanding and requiring careful parameterization, it represents a sophisticated step in the continuous exploration of data-driven trading techniques. The provided code and initial results serve as a launchpad for traders and quants keen on delving into the nuanced world of adaptive "fair value" and statistical arbitrage.