VX3-VX1 Term Structure Inversion — McMillan / @TailThatWagsDog
Source: @TailThatWagsDog (Lawrence McMillan framing) ·
Signal: VX3 < VX1 (inversion / backwardation) ·
Backtested: 2026-05-22 · Proxy: ^VIX > ^VIX3M (formerly VXV) ·
Script: finance/vx_term_structure_backtest.py
The claim: When the 3rd-month VIX future is below the 1st-month (term structure in backwardation /
inverted), it signals elevated systemic stress and bearish forward returns for equities. This is the
"tail wagging the dog" — futures market pricing fear ahead of spot.
Verdict: NOT wired as an active signal.
Backtest (2016–2025, 2,514 trading days) shows the claim does not reproduce.
The inverted regime is followed by higher, not lower, mean SPY forward returns at every horizon tested
(mean-reversion / bounce effect), so the edge has the opposite sign to the bearish claim. The edge threshold
of ≥5% difference was not met at any horizon. The signal remains a useful regime context indicator but is
not actionable as a bearish trigger.
The Signal
- Condition: VX3 (3rd-month VIX future) < VX1 (front-month VIX future)
- Interpretation: Near-term fear exceeds 3-month expected vol — market pricing acute short-term stress
- Claimed direction: Bearish for SPY over 1–21 day horizon
- Proxy used in backtest: ^VIX > ^VIX3M (CBOE 3-Month Volatility Index, formerly VXV; 93-day expected vol)
"When the spot or near-term implied vol exceeds longer-dated vol, the options market is pricing more
fear for the near term than the long term — the definition of acute stress."
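The proxy condition reduces to a per-day comparison of two aligned closing series. A minimal sketch in plain Python, assuming both series are already aligned by date; the function name `classify_regime` and the sample values are hypothetical and are not taken from finance/vx_term_structure_backtest.py:

```python
from typing import List


def classify_regime(vix: List[float], vix3m: List[float]) -> List[bool]:
    """Return True for each day on which the proxy condition VIX > VIX3M holds."""
    if len(vix) != len(vix3m):
        raise ValueError("series must be aligned to the same dates")
    return [spot > three_month for spot, three_month in zip(vix, vix3m)]


# Hypothetical closes, for illustration only (not real market data).
vix = [14.2, 18.9, 31.5, 27.0, 19.4]
vix3m = [16.8, 19.5, 28.1, 26.2, 21.0]
print(classify_regime(vix, vix3m))  # [False, False, True, True, False]
```

Days where the flag is False are treated as contango in this sketch, which matches the two-regime split used in the tables.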
Backtest Results (2016–2025)
Regime Frequency
| Metric | Value |
| Total days tested | 2,514 |
| Days inverted (VIX > VIX3M) | 191 (7.6%) |
| Days in contango (normal) | 2,323 (92.4%) |
| Distinct inversion episodes | 57 |
| Avg episode duration | 4.6 calendar days |
| Max episode duration | 60 calendar days (COVID crash 2020) |
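Episode counts and durations like those in the table can be derived by grouping consecutive inverted trading days. The sketch below assumes that an episode's duration is counted inclusively in calendar days, from its first to its last inverted trading day. The note does not state the script's exact convention, and the function names are hypothetical.

```python
from datetime import date
from typing import List, Tuple


def inversion_episodes(dates: List[date], inverted: List[bool]) -> List[Tuple[date, date]]:
    """Group runs of consecutive inverted trading days into (start, end) pairs."""
    episodes = []
    start = None
    prev = None
    for day, flag in zip(dates, inverted):
        if flag and start is None:
            start = day                      # a new episode opens
        elif not flag and start is not None:
            episodes.append((start, prev))   # episode closed on the prior trading day
            start = None
        prev = day
    if start is not None:                    # episode still open at the end of the data
        episodes.append((start, prev))
    return episodes


def calendar_days(episode: Tuple[date, date]) -> int:
    """Inclusive calendar-day span of an episode (assumed convention)."""
    start, end = episode
    return (end - start).days + 1


# Hypothetical trading calendar: 2024-01-05 is a Friday, 2024-01-08 a Monday.
dates = [date(2024, 1, d) for d in (2, 3, 4, 5, 8, 9, 10)]
flags = [False, True, False, True, True, False, False]
episodes = inversion_episodes(dates, flags)
durations = [calendar_days(e) for e in episodes]
print(len(episodes), durations)
```

Under this convention an episode that spans a weekend counts the weekend days, which is one way an average calendar-day duration can exceed the average number of inverted trading days per episode.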
Forward SPY Returns by Regime
| Horizon | Regime | N | Mean % | Median % | Win % | Edge (Inverted − Contango) | Signal? |
| 1d | Inverted | 191 | +0.12 | +0.26 | 57.1 | +0.07 | NO |
| 1d | Contango | 2,322 | +0.06 | +0.07 | 55.4 | — | — |
| 5d | Inverted | 191 | +0.92 | +1.50 | 65.4 | +0.66 | NO |
| 5d | Contango | 2,318 | +0.26 | +0.45 | 62.1 | — | — |
| 10d | Inverted | 191 | +1.50 | +2.79 | 69.6 | +0.96 | NO |
| 10d | Contango | 2,313 | +0.55 | +0.85 | 66.5 | — | — |
| 21d | Inverted | 191 | +3.37 | +4.06 | 73.3 | +2.22 | borderline |
| 21d | Contango | 2,302 | +1.15 | +1.83 | 70.3 | — | — |
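Edge threshold: ≥5% mean difference to qualify as an actionable signal. No horizon reaches this bar; the closest is 21d at +2.22, flagged borderline, and its sign is bullish rather than the claimed bearish.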
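The table's columns can be reproduced from a close series and a regime flag per day. The following is a sketch under stated assumptions, not the script's actual code: forward return is measured close-to-close from day t to day t+h, days without a full forward window are dropped, and a "win" is a strictly positive forward return. The row counts in the table are consistent with that truncation (for example, 191 + 2,302 = 2,514 − 21 at the 21d horizon). All function names are hypothetical.

```python
from statistics import mean, median


def forward_returns(closes, horizon):
    """Percent return from day t to day t + horizon; the last `horizon` days are dropped."""
    return [(closes[t + horizon] / closes[t] - 1.0) * 100.0
            for t in range(len(closes) - horizon)]


def summarize(returns):
    """N, mean, median and win rate for one regime; None if the regime is empty."""
    if not returns:
        return None
    wins = sum(1 for r in returns if r > 0)
    return {
        "n": len(returns),
        "mean": mean(returns),
        "median": median(returns),
        "win_pct": 100.0 * wins / len(returns),
    }


def regime_edge(closes, inverted, horizon):
    """Split forward returns by the regime flag on day t and compute the mean difference."""
    fwd = forward_returns(closes, horizon)
    # zip stops at the shorter list, so flags without a forward window are ignored.
    inv = [r for r, flag in zip(fwd, inverted) if flag]
    con = [r for r, flag in zip(fwd, inverted) if not flag]
    stats = {"inverted": summarize(inv), "contango": summarize(con)}
    if stats["inverted"] and stats["contango"]:
        stats["edge"] = stats["inverted"]["mean"] - stats["contango"]["mean"]
    else:
        stats["edge"] = None
    return stats


# Hypothetical prices and flags, for illustration only.
closes = [100.0, 110.0, 99.0, 108.9]
inverted = [True, False, False, False]
print(regime_edge(closes, inverted, horizon=1))
```

A positive edge in this sketch means the inverted regime outperformed contango, which is the opposite of what a bearish trigger would need.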
Max Drawdown by Regime
| Regime | Max Drawdown (SPY) | Interpretation |
| Inverted | -73.12% | COVID crash / 2018 Q4 — co-occurrence with stress events |
| Contango | -15.05% | Normal correction territory |
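The note does not say how the per-regime drawdown is computed. One plausible reading, assumed here, is that daily SPY returns are filtered to the days in a regime, compounded into a synthetic equity curve, and the largest peak-to-trough decline of that curve is reported. A minimal sketch of that reading, with hypothetical function names:

```python
def max_drawdown(returns_pct):
    """Largest peak-to-trough decline (in %) of the curve built by compounding the returns."""
    equity = 1.0
    peak = 1.0
    worst = 0.0
    for r in returns_pct:
        equity *= 1.0 + r / 100.0
        peak = max(peak, equity)
        worst = min(worst, (equity / peak - 1.0) * 100.0)
    return worst


def regime_drawdown(daily_returns_pct, inverted, want_inverted=True):
    """Max drawdown of the curve that chains together only the days in one regime."""
    kept = [r for r, flag in zip(daily_returns_pct, inverted) if flag == want_inverted]
    return max_drawdown(kept)


# Hypothetical daily returns: losses fall on inverted days, gains on contango days.
daily = [-10.0, 5.0, -10.0, 5.0]
flags = [True, False, True, False]
print(regime_drawdown(daily, flags, want_inverted=True))
print(regime_drawdown(daily, flags, want_inverted=False))
```

Because the filtered curve skips the recovery days that fall outside the regime, a drawdown computed this way can be deeper than any continuous decline in SPY itself.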
Why The Claim Doesn't Hold (Mechanically)
- Inversions are spike-driven and brief (avg 4.6 days) — they tend to coincide with the
sharpest vol spikes, which often precede hard bounces. Mean-reversion dominates the forward-return average.
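- The bearish risk is in the tails: the -73% max drawdown captures the worst 60-day inversion
(COVID). SPY's own peak-to-trough decline in early 2020 was roughly 34%, so the -73% figure presumably
compounds returns over inverted days only rather than measuring one continuous decline. The co-occurrence
with crashes is real, but it doesn't translate into a mean-return edge.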
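- Sample size is thin: 191 inversion days across 57 episodes over 10 years. Days within an episode are
not independent, so the effective sample is closer to 57 than to 191, and the per-regime estimates are noisy.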
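One way to gauge how thin the sample is: average forward returns within each episode first, then treat the episode means as the observations. This is a sketch of that idea only; it assumes returns are already grouped by episode, it is not something the backtest script is known to do, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def episode_level_mean_and_se(returns_by_episode):
    """Mean and standard error computed across episode means, not across individual days."""
    episode_means = [mean(rets) for rets in returns_by_episode if rets]
    n = len(episode_means)
    if n < 2:
        raise ValueError("need at least two non-empty episodes")
    return mean(episode_means), stdev(episode_means) / sqrt(n)


# Hypothetical forward returns (%), grouped by inversion episode.
grouped = [[1.0, 3.0], [4.0], [0.0, 0.0, 6.0]]
avg, se = episode_level_mean_and_se(grouped)
print(round(avg, 4), round(se, 4))
```

With 57 episodes instead of 191 days as the sample, the standard error is typically wider than a naive per-day calculation suggests, which is the point of the bullet above.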
Data proxy caveat: ^VIX3M (the CBOE 3-Month Volatility Index, formerly VXV) is a 93-day constant-maturity
expected vol index, not the actual VX3 futures contract, and ^VIX is likewise the spot index rather than the VX1
contract. Indices and futures can diverge due to futures roll premium. Actual CBOE VX1/VX3 futures data
(available via CBOE Data Vault or Quandl) would give a cleaner test of the original claim. This backtest is a
conservative, accessible proxy, not the definitive test.
What Is Useful (Context, Not Trigger)
- Term structure inversion is a regime marker for elevated stress — useful as dashboard context
alongside Thrasher/Vixologist, not as a standalone trading signal.
- The drawdown asymmetry is real — during inversions, tail risk is significantly elevated even if
the mean return is higher. Inversions are worth noting when they persist (≥5 days).
- A multi-factor approach (inversion + Thrasher compression + Vixologist squeeze all firing
simultaneously) would be more defensible than any single signal alone.
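The persistence and co-confirmation ideas in the bullets above can be sketched as two small filters over aligned boolean series. The sketch makes several assumptions. "≥5 days" is read as five consecutive trading-day observations. The Thrasher and Vixologist inputs are placeholder boolean lists, since their definitions are not part of this note. The function names are hypothetical.

```python
def persistent(flags, min_days=5):
    """True on days where the flag has been on for at least min_days consecutive observations."""
    out = []
    run = 0
    for flag in flags:
        run = run + 1 if flag else 0
        out.append(run >= min_days)
    return out


def co_confirmed(*signals):
    """True only on days where every aligned boolean series fires."""
    return [all(day) for day in zip(*signals)]


# Placeholder series, for illustration only.
inversion = [True, True, True, False, True, True]
thrasher = [True, True, True, True, False, True]
vixologist = [False, True, True, True, True, True]

print(persistent(inversion, min_days=3))
print(co_confirmed(persistent(inversion, min_days=3), thrasher, vixologist))
```

Requiring all inputs to fire on the same day will shrink an already small sample further, so any multi-factor variant would need its own backtest before being wired.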
Status & Next Steps
| Action | Status |
| Backtest on 2016–2025 proxy data | Done (2026-05-22) |
| Wire as active sentinel signal | Skipped — signal does not hold |
| Surface on dashboard alongside Thrasher/Vixologist | Skipped — no edge |
| @TailThatWagsDog wired to X_SOURCES watchlist | Done (low-freq) |
| Retest with actual VX1/VX3 CBOE futures data | P3 — if CBOE data access added |
| Explore as co-confirming factor with Thrasher + Vixologist | P3 — multi-factor only |
Backtest script: finance/vx_term_structure_backtest.py ·
Memory file: .claude/agent-memory/jarvis/reference_vx3_vx1_mcmillan.md ·
Per lesson_thrasher_backtest_edge_decay: paper claims often don't reproduce.