/home/agent-jay/claudeCode/jarvis/plans/berg-atomic-indicators.md

Berg Atomic Indicators — Plan

title: Milton Berg atomic indicators + combo engine

status: PHASE 3 SHIPPED — 4 live combos (C2/C4/C6/C10) in production, signals table + daily cron + dashboard panel. Post-2010 calibration only (G2=50%).

source: reference_milton_berg_indicator_catalog.md (Shapiro/Berg interview 2026-05-18 + X/Twitter expansion 2026-05-25)

owner: Jarvis (finance lab)

created: 2026-05-25

updated: 2026-05-25

Phase 3 (2026-05-25) — production pipeline for 4 live combos

SHIPPED. Commit: see feat(finance) below.

What shipped

finance/scripts/berg_create_schema.py — idempotent berg_signals table creation (PRIMARY KEY: signal_date, combo_id; forward-return columns fwd_5d/10d/20d/60d/120d/252d; max_dd_60d; next_neg10pct_days).
finance/scripts/berg_backfill_signals.py — historical backfill: 58 rows total (C2:5, C4:6, C6:17, C10:30).
finance/scripts/berg_signals_daily.py — daily runner (weekdays 5 PM ET via cron); queues Telegram alert if >=3 combos fire.
finance/tests/test_berg_signals.py — 6 tests, all passing.
Dashboard — /finance "Berg Signals" sub-tab at http://localhost:3850/finance#berg-signals. API: /api/finance/berg-signals.
Cron — berg-signals-daily at 0 22 * * 1-5 (22:00 UTC = 5 PM ET during standard time, 6 PM ET during daylight saving time). Registered in sentinel/src/config.ts + dashboard/server.mjs STATIC_SCHEDULES. Handler in jarvis-bot/src/index.ts.
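
A fixed UTC cron hour lands on different Eastern wall-clock hours across the year. The sketch below checks that mapping with the standard-library zoneinfo module. It assumes the schedule is evaluated in UTC, as the entry above states; the helper name and the two sample dates are illustrative only.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

NEW_YORK = ZoneInfo("America/New_York")

def et_hour_for_utc_cron(year: int, month: int, day: int, utc_hour: int = 22) -> int:
    """Return the New York wall-clock hour at which a UTC-hour cron fires on a date."""
    fire_utc = datetime(year, month, day, utc_hour, 0, tzinfo=timezone.utc)
    return fire_utc.astimezone(NEW_YORK).hour

winter_hour = et_hour_for_utc_cron(2026, 1, 15)  # standard time (UTC-5)
summer_hour = et_hour_for_utc_cron(2026, 5, 22)  # daylight time (UTC-4)
print(winter_hour, summer_hour)
```

If the runner must fire at the same Eastern time all year, one option is to schedule in the America/New_York zone rather than UTC, provided the scheduler supports it.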

Caveat (prominent, repeated)

Post-2010 calibration only. Gate G2: 8/16 = 50% on the full 1957-2026 precedent set. Atoms were tuned to the 2010-2026 regime; precedent dates from 2003 and earlier largely fail because market microstructure differed. Use live signals with care.

Deferred

Atom calibration (Rob deferred)
Blocked combos (C1/C3/C5/C7/C8/C9/C11) — require external feeds
Phase 4 retail rule backtest
B5 (cap-weighted new lows) — defer to Phase 1.5

Phase 0 (2026-05-26) — backfill + re-run G2

Backfill landed. Via python -m finance.scripts.fetch_ndx_universe --start 1957-01-01 --tickers '^SPX,^NDX,^IXIC,^RUT,^VIX,^SOX,IWM,SOXX,SMH,SPY,QQQ':

Ticker	History	Notes
SPX	1957-01-02 → 2026-05-22	17,465 rows; full Berg precedent range
NDX	1985-10-01 → 2026-05-22	10,240 rows
IXIC	1971-02-05 → 2026-05-22	13,941 rows; older NDX substitute
RUT	1987-09-10 → 2026-05-22	9,749 rows
VIX	1990-01-02 → 2026-05-25	9,166 rows; extended from 2010
SOX	1994-05-04 → 2026-05-22	8,068 rows; ^SOX index (SOXX ETF only from 2001)
IWM/SOXX/SMH/SPY/QQQ	extended	benchmarks/ETFs back to inception

G2 PROXY_MAP switched to indices (SPX/NDX/RUT/SOX/VIX) — atoms scale-invariant for rolling/relative measures.

Result: 8/16 = 50% (FAIL @ 80% threshold). Absolute matches went UP (7 → 8), but 9 newly-testable precedents yielded only 1 new MATCH (C6 1998-10-21). Diagnosis:

Calibration problem (premortem #3 confirmed). Atoms are tuned to the 2010-2026 regime; the pre-2010 misses are C4 1979/1984/2003×2, C10 1985/1986, and C2 1998-08-31. Market microstructure (volatility, volume scale, breadth thresholds) differs enough across eras that fixed-percentile atoms don't reproduce Berg's named dates.
Proxy-swap regression. C2 2025-04-04 lost (QQQ V1 fired; NDX V1 fired 3 days later 2025-04-07, outside ±2d window). NDX composite volume differs from QQQ ETF volume — the index is the more correct measure per Berg's methodology, so this is a "now we know" not a regression.
Tolerance is not the lever. ±7d only recovers C2 2025-04-04 → 9/16 = 56%. Still below threshold.

Per hard exit condition: G2 has failed 1 of 2 allowed calibration passes. Next: calibration pass OR re-plan with Rob.

Goal

Reproduce Milton Berg's histogram-tail extreme-detection framework in finance/indicators/. End state: a daily scanner that fires when 3-4 atomic extremes align (combo C1-C11 + X-search extensions), with backtested forward-return distributions and Berg-style stops, surfaced on the dashboard.

Non-goal: clone all 30,000 of Berg's indicators or all 2,000 combos. We build the ~24 atoms and 11+ combos Berg explicitly named, then evaluate ROI before extending.

CURRENT STATE (2026-05-25)

Phase 1 status: core atoms shipped to finance/indicators/berg_atoms.py (24 functions, full AtomSpec registry, evaluator). Tests at finance/tests/test_berg_atoms.py — 33/33 green.

Confidence-weighted atom table (shipped registry)

Ordered HIGH → LOW. Rationale: build highest-confidence atoms first; each entry maps to one AtomSpec in BERG_ATOM_REGISTRY.

ID	Conf	Level	Category	Source	Notes
T1	0.99	HIGH	trend	Shapiro video	`close < SMA(250)` — pure regime filter
P8	0.98	HIGH	momentum	Shapiro video	Single-day gain ≥ +9%
S1	0.98	HIGH	streak	Shapiro video	≥10 consecutive up closes
S2	0.98	HIGH	streak	Shapiro video	≥10 consecutive down closes
V1	0.95	HIGH	volume	Shapiro video	5d volume = 200d high
V5	0.95	HIGH	volume	Shapiro video	Up day + volume +20%
P1	0.95	HIGH	momentum	Shapiro video	3d ROC ≤ −7%
P2	0.95	HIGH	momentum	Shapiro video	1d gain = 250d max
P3	0.95	HIGH	momentum	Shapiro video	10d gain = 180d max
P5	0.95	HIGH	momentum	Shapiro video	10d ROC ≥ +20% (SOX)
S5	0.95	HIGH	streak	Shapiro video	Up 8 of 9 (Russell)
S6	0.95	HIGH	streak	Shapiro video	Up 16 of 19 (Nasdaq)
B6	0.95	HIGH	breadth	Shapiro video	New 60d low
B7	0.95	HIGH	breadth	Shapiro video	New 30d closing high
P_ROC74	0.90	HIGH	momentum	Forward Guidance Feb 2024	5d ROC ≥ +7.4% — every bull since 1928
V2	0.90	HIGH	volume	Shapiro video	5d-avg vol = 375d high within 20d
P6	0.90	HIGH	momentum	Shapiro video	Drawdown from 252d high (param)
B_DA_SP600_50_1	0.85	HIGH	breadth	Berg X 2026	D/A 50:1 — 18 cases since 1995, 100% up 120d
T_RIN_HI	0.85	HIGH	breadth	Berg X / FG	NYSE TRIN ≥12.5 panic / ≥15.5 rare
B_AD_5D_087	0.75	MEDIUM	breadth	Berg X	5d A/D ≤0.87 at ATH — top precondition
VX_VIX_5_30	0.70	MEDIUM	volatility	Berg X	VIX 5d/30d mean ratio — compression/expansion
VX1	0.70	MEDIUM	volatility	Shapiro video	VIX intraday range 45-60
P7	0.70	MEDIUM	momentum	Shapiro video	Recovery thrust into new high
T2	0.70	MEDIUM	trend	Shapiro video	Days holding swing low
P4	0.65	MEDIUM	momentum	Shapiro video	10d ROC at 180d max (P3 companion, alt instrument)
VX2	0.60	MEDIUM	volatility	Shapiro video	VIX 5d/20d return-stdev ratio <0.87

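Rows whose Source is not "Shapiro video" (P_ROC74, B_DA_SP600_50_1, T_RIN_HI, B_AD_5D_087, VX_VIX_5_30) are X-search additions (2026-05-25); they are not in the original Shapiro video transcript.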

Atoms NOT YET implemented (deferred to Phase 1.5)

ID	Reason deferred	Required data
V3	needs NASDAQ upvol/dnvol feed	Polygon `I:UPVOL.NQ` / `I:DNVOL.NQ`
V4	needs NYSE up-volume composite	Polygon `I:UPVOL.NY`
V6	duplicate of V5 with looser threshold	—
P9	needs NYSE up/down volume ratio	Polygon
S3, S4	duplicate of S1/S2 logic on S&P	trivial, add post-G2
B1, B2	need A/D ratio feeds	Polygon NYSE/NASDAQ breadth
B3	reverse breadth thrust — requires V4	depends on V4
B4	5d new highs − new lows on Nasdaq	Polygon breadth
B5	Berg's marquee innovation	needs daily shares-out for NYSE universe — Phase 0
VX3	trivial (`vix.close > 45`)	inline at combo level
SE1-SE6	sentiment data sources missing	manual scrape, Phase 0

Success criteria (gates)

Gate	Test	Pass
G1 — atoms compute	Each atom returns NaN-free series on SPY 2010→present	All 15 atoms produce series
G2 — Berg precedent match	Re-fire combos C1-C11 on the historical dates Berg cited	≥80% of Berg's dates flagged by our combo logic
G3 — forward-return profile	1-year median forward return on historical signals	≥+10%, drawdown ≤8% on aggregate
G4 — retail rule backtest	Long-or-flat S&P, buy 1st signal, trail −8%	CAGR ≥12% over 2010-2026 (vs Berg's claimed 18.5% over 1957-2026)

G2 is the critical gate. If we can't reproduce Berg's dates, our atoms are misdefined and the rest is theatre.

G2 result — 2026-05-25 (Phase 2 ship)

OVERALL: 7/7 = 100% on testable precedents — PASS (threshold 80%).

Live combos (4 of 11):

Combo	Matches	Notes
C2 (NDX panic + VIX 45-60)	2/2	2011-08-08 ✓, 2025-04-04 ✓
C4 (April 9 thrust day)	1/1	2025-04-09 ✓
C6 (held-low-9 + 4-way thrust)	2/2	2011-10-14 ✓, 2026-04-13 ✓
C10 (late entry, 16-of-19)	2/2	2023-11-22 ✓, 2026-04-27 ✓

Blocked combos (7 of 11) — pending Phase 0 data backfill:

Combo	Blocking atom(s)	Source needed
C1	V4 + B3	NYSE up/dn vol + A/D thrust
C3	SE4 + B4	NDR osc + NASDAQ 5d new-lows
C5	B1	NDR Multi-Cap A/D
C7	RUT history	yfinance ^RUT to 1987
C8	V3 + B2	Nasdaq up/dn vol + 10d A/D
C9	SOX history	yfinance SOXX/SMH
C11	SE1	IPO calendar

Bug found and fixed during G2

P6_HIST atom added (p6_drawdown_reached_within). Original P6 required drawdown to STILL be ≥10% on the bar — but Berg's verbal usage ("the market DECLINED 10%, held its low for 19 days") is a historical regime condition, not same-bar. C10 was 0/2 with current-bar P6; switched to P6_HIST (DD reached -10% anytime in trailing 60d) → 2/2.
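
The difference between the two P6 variants is easiest to see side by side. This is a minimal sketch, not the shipped code: the function name p6_drawdown_reached_within comes from the note above, but its signature, the helper drawdown_from_high, and the min_periods=1 choice are assumptions made for illustration.

```python
import pandas as pd

def drawdown_from_high(close: pd.Series, lookback: int = 252) -> pd.Series:
    """Fractional drawdown of each close from its trailing-`lookback` high (<= 0)."""
    rolling_high = close.rolling(lookback, min_periods=1).max()
    return close / rolling_high - 1.0

def p6_current_bar(close: pd.Series, threshold: float = -0.10) -> pd.Series:
    """Original P6: the drawdown is still at or beyond the threshold on this bar."""
    return drawdown_from_high(close) <= threshold

def p6_drawdown_reached_within(close: pd.Series, threshold: float = -0.10,
                               window: int = 60) -> pd.Series:
    """P6_HIST: the drawdown touched the threshold at any bar in the trailing window."""
    worst_recent = drawdown_from_high(close).rolling(window, min_periods=1).min()
    return worst_recent <= threshold

# Toy series: a 12% decline followed by a partial recovery.
close = pd.Series([100.0, 95.0, 88.0, 92.0, 96.0])
current = p6_current_bar(close)
hist = p6_drawdown_reached_within(close)
print(current.tolist())  # fires only on the bar that is still 12% down
print(hist.tolist())     # stays on through the recovery
```

On the toy series the current-bar version turns off as soon as price recovers above the -10% line, while the historical version remains true, which is the behavior C10 needs for a "declined 10%, then held its low" condition.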

Catalog date correction

C6/C7/C10 cite 2026-04-13 and 2026-04-27 as recent firings (not 2025 — interview was 2026-05-18 discussing the just-prior month's signals). My initial spec wrote 2025 dates; fixed to 2026 per catalog verbatim. Both dates now fire correctly.

Sample-size caveat

Only 7 of 33 named precedents are testable: research.db starts 2010 (SPY/VIX) / 2003 (QQQ) but Berg's named dates go back to 1957. Phase 0 backfill is needed to test the pre-2010 precedents (1957-style turn, 1979 thrust, 1987 SOX, 1998 panic, 2008 capitulation).

G2 verdict authorizes proceeding to Phase 3 (dashboard + cron) for the 4 live combos, with Phase 0 (data backfill) in parallel to unlock blocked combos.

Data inventory (current state of `finance/research.db`)

Have ✓

Symbol	History	Source	Notes
SPY	2010-01 → today	yfinance	OHLCV daily
QQQ	2003-01 → today	yfinance	OHLCV daily
VIX	2010-01 → today	yfinance	OHLC daily — gives us VX1 (intraday H/L range)
NDX-100 components	2003 → today	yfinance	for cap-weighted breadth
MAG7 names	2003-2010 → today	yfinance
svix_compression.spx_close	2025-05 → today	svix pipeline	SPX close, no OHLC

Recent additions (backfill needed)

Symbol	Current rows	Action
SPX	5	Run `python -m finance.scripts.fetch_ndx_universe` full backfill — yfinance ^SPX from 1957 if available, 1990 minimum
IWM	5	Same — yfinance to inception (2000)
RUT	5	Same — yfinance to 1987
^NDX	5	Same — yfinance to 1985

Gaps (Phase 0 to fill)

Need	Why	Source candidates
SOXX or SMH	Combo C9 (SOX 20% / 10d)	yfinance SOXX since 2001, SMH since 2000
NYSE composite up-volume / down-volume	V4, V3 (volume thrust, panic)	Polygon `I:UPVOL.NY` / `I:DNVOL.NY` if subscribed; else Stooq `^NYUPV`/`^NYDNV`
NYSE composite new highs / new lows	B4, B5	Polygon or Stooq; manual scrape from WSJ
Daily shares outstanding for NYSE universe	B5 (mkt-cap-weighted new lows)	yfinance `Ticker(x).info["sharesOutstanding"]` weekly; SEC EDGAR quarterly
AAII / Investors Intelligence bull-bear	Substitute for Market Vane (SE2)	Stooq, manual CSV weekly

Substitutes (Berg → ours)

Berg	Our substitute	Rationale
NDR Multi-Cap A/D	Russell 3000 A/D from Stooq	Same broad-market spirit, free
NDR overbought/oversold oscillator	Williams %R or Stochastic on SPX	Public formula
Market Vane	AAII bullish %	Both retail sentiment, free weekly
Berg's market-cap-weighted new lows	NYSE universe via `prices_daily` join with shares-out	We rebuild from scratch

Phase 0 — data prep (½ day)

Run python -m finance.scripts.fetch_ndx_universe in full-backfill mode for SPX, IWM, RUT, NDX. Verify ≥15 years of history.
Add SOXX, SMH to the ingest universe in fetch_ndx_universe.py. Re-run.
New script finance/scripts/fetch_nyse_breadth.py — pull NYSE A/D, up-vol/down-vol, new highs/lows from Stooq. Store in new table nyse_breadth_daily(date, up_issues, down_issues, up_vol, down_vol, new_highs_52w, new_lows_52w).
New script finance/scripts/fetch_shares_outstanding.py — weekly yfinance info.sharesOutstanding for all NYSE+Nasdaq members in prices_daily. Store in new table shares_outstanding(ticker, date, shares). Forward-fill within week.
Verify gate G1 prerequisite: every atom has ≥10 years of input data available.

Exit criteria: all 15 atoms can compute on at least SPY 2010→present with no NaN gaps.
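
The new breadth table can be created idempotently so the fetch script is safe to re-run. A minimal sketch, assuming research.db is a SQLite database; the column names come from the step above, while the column types, the primary key, and the helper name ensure_breadth_table are assumptions. The inserted row uses made-up values, not real breadth data.

```python
import sqlite3

NYSE_BREADTH_DDL = """
CREATE TABLE IF NOT EXISTS nyse_breadth_daily (
    date          TEXT PRIMARY KEY,  -- ISO-8601 date; key and types are assumptions
    up_issues     INTEGER,
    down_issues   INTEGER,
    up_vol        REAL,
    down_vol      REAL,
    new_highs_52w INTEGER,
    new_lows_52w  INTEGER
)
"""

def ensure_breadth_table(conn: sqlite3.Connection) -> None:
    """Create nyse_breadth_daily if it is missing; safe to call on every run."""
    conn.execute(NYSE_BREADTH_DDL)
    conn.commit()

conn = sqlite3.connect(":memory:")  # stand-in for finance/research.db
ensure_breadth_table(conn)
ensure_breadth_table(conn)  # second call is a no-op
conn.execute(
    "INSERT OR REPLACE INTO nyse_breadth_daily VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("2026-05-22", 1800, 1100, 2.1e9, 1.4e9, 95, 40),  # illustrative values only
)
columns = [row[1] for row in conn.execute("PRAGMA table_info(nyse_breadth_daily)")]
print(columns)
```

Keying on date with INSERT OR REPLACE means a re-fetch of the same day overwrites rather than duplicates.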

Phase 1 — atomic indicators (1 day)

New file: finance/indicators/berg_atoms.py

Pure pandas functions, one per atom, signature: def atom_v1(prices: pd.DataFrame) -> pd.Series: — input is OHLCV DataFrame indexed by date, output is boolean series (True = extreme condition met).

Volume atoms

import pandas as pd


def v1_ndx_5d_vol_200d_high(qqq: pd.DataFrame) -> pd.Series:
    """V1: 5-day rolling sum of volume is at its 200-day high."""
    v5 = qqq["volume"].rolling(5).sum()
    return v5 == v5.rolling(200).max()

def v2_nyse_5d_vol_375d_high_in_20d(spy: pd.DataFrame) -> pd.Series:
    """V2: SPY 5-day avg volume hit 375-day high anytime in past 20 days."""
    v5avg = spy["volume"].rolling(5).mean()
    is_high = v5avg == v5avg.rolling(375).max()
    # min_periods=1 keeps the first 19 rows from becoming NaN, which astype(bool) would cast to True.
    return is_high.astype(float).rolling(20, min_periods=1).max().astype(bool)

def v3_ndx_10d_volume_thrust(qqq_up_vol, qqq_dn_vol) -> pd.Series:
    """V3: NASDAQ 10-day upside/downside volume ratio ≥ 1.89."""
    return (qqq_up_vol.rolling(10).sum() / qqq_dn_vol.rolling(10).sum()) >= 1.89

def v4_nyse_upvol_pct_below_45(up_vol, total_vol) -> pd.Series:
    """V4: NYSE 5-day upside vol as % of total 5-day vol < 45%."""
    return (up_vol.rolling(5).sum() / total_vol.rolling(5).sum()) < 0.45

def v5_signal_day_volume_up_20pct(prices: pd.DataFrame) -> pd.Series:
    """V5: Today's volume ≥ +20% vs yesterday AND close > open."""
    up_day = prices["close"] > prices["open"]
    vol_up = prices["volume"] >= 1.20 * prices["volume"].shift(1)
    return up_day & vol_up

Price-momentum atoms

def p1_nasdaq_3d_decline_7pct(prices) -> pd.Series:
    """P1: 3-day rate-of-change ≤ −7%."""
    return prices["close"].pct_change(3) <= -0.07

def p2_spx_1d_gain_250d_high(prices) -> pd.Series:
    """P2: today's 1-day pct change is the highest in 250 trading days."""
    chg = prices["close"].pct_change()
    return chg == chg.rolling(250).max()

def p3_nasdaq_10d_gain_180d_high(prices) -> pd.Series:
    """P3: 10-day pct change ranks #1 in trailing 180 days."""
    chg10 = prices["close"].pct_change(10)
    return chg10 == chg10.rolling(180).max()

# P4 identical to P3 but on S&P
# P5 = SOXX 10d return ≥ +20%
# P6 = drawdown_from_252d_high <= -0.10 (or -0.13), matching p6_drawdown_10pct / p6_drawdown_13pct in the combo registry
# P7 = recovery thrust: (close - low_since_prior_-10pct) / low ≥ 6% AND close == max-since-low
# P8 = single-day gain ≥ +9% (S&P or Nasdaq)
# P9 = NYSE up-vol / total-vol on the day ≥ 100/101

Streak atoms

def s_run_length(series_bool: pd.Series) -> pd.Series:
    """Length of consecutive True-runs ending at each row."""
    grp = (series_bool != series_bool.shift()).cumsum()
    return series_bool.groupby(grp).cumsum()

def s1_vix_up10(vix) -> pd.Series:
    return s_run_length(vix["close"] > vix["close"].shift()) >= 10

def s5_russell_8of9(rut) -> pd.Series:
    up = (rut["close"] > rut["close"].shift()).rolling(9).sum()
    return up >= 8

def s6_nasdaq_16of19(qqq) -> pd.Series:
    up = (qqq["close"] > qqq["close"].shift()).rolling(19).sum()
    return up >= 16

Breadth atoms

def b1_ndr_multicap_ad_7d_ratio(ad_data) -> pd.Series:
    """B1: 7-day A/D ratio ≥ 2.10. Substitute = Russell 3000 advances/declines."""
    return (ad_data["adv"].rolling(7).sum() / ad_data["dec"].rolling(7).sum()) >= 2.10

def b4_nasdaq_5d_nh_nl_minus_16(nh, nl) -> pd.Series:
    """B4: 5-day cumulative (new highs − new lows) ≤ −16."""
    return (nh.rolling(5).sum() - nl.rolling(5).sum()) <= -16

def b5_marketcap_weighted_new_lows(prices_panel: pd.DataFrame, shares_panel: pd.DataFrame) -> pd.Series:
    """B5: NYSE 52w-low mkt-cap as % of total NYSE mkt-cap.

    THE KEY BERG INNOVATION.

    For each date:
      1. Identify which NYSE-listed tickers are at 52-week low close.
      2. Sum their market caps (close * shares).
      3. Divide by sum of market caps of NYSE universe.
      4. Return the percentage (one value per date).

    Berg's table:
      1979 = 4.38%, 1980 = 4.15%, 1983 = 3.90% — none crashed.
      2026-05 = 3.12% — not extreme.

    Implementation note: NYSE universe must be filtered (not Nasdaq).
    Source: SEC EDGAR exchange membership.

    Assumes long-form inputs: prices_panel has columns (date, ticker, close),
    shares_panel has columns (date, ticker, shares).
    """
    df = prices_panel.merge(shares_panel, on=["date", "ticker"], how="inner")
    df = df.sort_values(["ticker", "date"])
    # 252-day rolling min of close per ticker
    lo52w = df.groupby("ticker")["close"].transform(lambda x: x.rolling(252).min())
    df["mcap"] = df["close"] * df["shares"]
    # Market cap counts toward the numerator only on days the ticker closes at its 52w low.
    df["mcap_at_low"] = df["mcap"].where(df["close"] <= lo52w, 0.0)
    daily = df.groupby("date")[["mcap", "mcap_at_low"]].sum()
    return (daily["mcap_at_low"] / daily["mcap"]) * 100  # percent

VIX atoms

def vx1_vix_range_45_60(vix) -> pd.Series:
    """VX1: VIX intraday range overlaps the 45-60 band (high ≥ 45 AND low ≤ 60)."""
    return (vix["high"] >= 45) & (vix["low"] <= 60)

def vx2_vix_compression(vix) -> pd.Series:
    """VX2: VIX 5-day stdev / 20-day stdev < 0.87 (compression)."""
    s5 = vix["close"].pct_change().rolling(5).std()
    s20 = vix["close"].pct_change().rolling(20).std()
    return (s5 / s20) < 0.87

Sentiment atoms

def se1_ipo_4y_high(ipo_dollars) -> pd.Series:
    return ipo_dollars == ipo_dollars.rolling(252 * 4).max()

def se4_obos_below_10(obos) -> pd.Series:
    """SE4: NDR-style overbought/oversold oscillator < 10. Substitute = Williams %R(14)."""
    return obos < 10

Trend filter

def t1_below_250dma(prices) -> pd.Series:
    return prices["close"] < prices["close"].rolling(250).mean()

def t2_days_holding_low(prices) -> pd.Series:
    """Return Series of int = days the swing low has held without violation."""
    # Identify swing low: most recent rolling-min close
    # Count consecutive days the low has not been broken
    rolling_min = prices["close"].rolling(60).min()
    is_low_day = prices["close"] == rolling_min
    days_since = (~is_low_day).groupby(is_low_day.cumsum()).cumsum()
    return days_since

Tests (in same commit)

finance/tests/test_berg_atoms.py:

Each atom: feed a synthetic series with the extreme baked in, assert atom fires.
Each atom: feed a flat/random series, assert atom does NOT fire.
B5: synthetic 100-ticker panel with 5 tickers at 52w low totaling 4% of market cap → assert result ≈ 4.0.

Phase 2 — combo engine (½ day)

New file: finance/indicators/berg_combos.py

import pandas as pd

from . import berg_atoms as a

COMBOS = {
    "C1": [a.v2_nyse_5d_vol_375d_high_in_20d, a.b6_spx_60d_new_low, a.p6_drawdown_10pct,
           a.v4_nyse_upvol_pct_below_45, a.b3_reverse_breadth_thrust],
    "C2": [a.v1_ndx_5d_vol_200d_high, a.p1_nasdaq_3d_decline_7pct, a.vx1_vix_range_45_60],
    "C3": [a.v2_nyse_5d_vol_375d_high_in_20d, a.se4_obos_below_10, a.b4_nasdaq_5d_nh_nl_minus_16],
    "C4": [a.t1_below_250dma, a.p7_recovery_thrust_6pct, a.p2_spx_1d_gain_250d_high, a.v5_signal_day_volume_up_20pct],
    "C5": [a.t2_held_low_7d, a.p7_recovery_9pct, a.b1_ndr_multicap_ad_7d_ratio],
    "C6": [a.t2_held_low_9d, a.b7_30d_new_high, a.p3_nasdaq_10d_gain_180d_high, a.p4_spx_10d_roc_180d_high],
    "C7": [a.t2_held_low_9d, a.b7_30d_new_high, a.p3_nasdaq_10d_gain_180d_high, a.p4_spx_10d_roc_180d_high, a.s5_russell_8of9],
    "C8": [a.t2_held_low_9d, a.p6_drawdown_13pct, a.v3_ndx_10d_volume_thrust, a.b2_nasdaq_10d_ad_130],
    "C9": [a.p5_sox_10d_20pct, a.b7_spx_2y_closing_high],
    "C10": [a.p6_drawdown_10pct, a.t2_held_low_19d, a.s6_nasdaq_16of19],
    "C11": [a.se1_ipo_4y_high, a.vx2_vix_compression],
}

def evaluate_combos(date_range, prices_dict) -> pd.DataFrame:
    """For each date, evaluate all combos. Return long-form DataFrame:
       date, combo_id, fired, atom_results (dict-of-bool).
    """
    ...
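
One way the stub above could be filled in is sketched here. It is not the shipped engine: the plan does not say how atoms are mapped to instruments, so this version assumes each combo is a list of hypothetical (atom_fn, symbol) pairs, and the two toy atoms and the name evaluate_combos_sketch exist only to make the example runnable.

```python
import pandas as pd

def evaluate_combos_sketch(combos: dict, prices_dict: dict) -> pd.DataFrame:
    """Evaluate every combo on every date; a combo fires when all its atoms are True.

    `combos` maps combo_id -> list of (atom_fn, symbol) pairs (hypothetical layout).
    """
    rows = []
    for combo_id, atoms in combos.items():
        # One boolean column per atom; eq(True) maps missing values to False.
        results = pd.DataFrame(
            {fn.__name__: fn(prices_dict[symbol]) for fn, symbol in atoms}
        ).eq(True)
        fired = results.all(axis=1)
        for date, row in results.iterrows():
            rows.append({"date": date, "combo_id": combo_id,
                         "fired": bool(fired.loc[date]),
                         "atom_results": {k: bool(v) for k, v in row.items()}})
    return pd.DataFrame(rows, columns=["date", "combo_id", "fired", "atom_results"])

def up_day(prices: pd.DataFrame) -> pd.Series:
    return prices["close"] > prices["close"].shift()

def big_volume(prices: pd.DataFrame) -> pd.Series:
    return prices["volume"] >= 1.2 * prices["volume"].shift()

idx = pd.date_range("2026-01-05", periods=4, freq="B")
toy = pd.DataFrame({"close": [10, 11, 12, 11], "volume": [100, 130, 100, 150]}, index=idx)
out = evaluate_combos_sketch({"TOY": [(up_day, "SPY"), (big_volume, "SPY")]}, {"SPY": toy})
print(out[["date", "combo_id", "fired"]])
```

Keeping the per-atom results next to the fired flag makes near-misses visible, which helps when debugging G2 mismatches.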

Backtest each combo

For every historical date a combo fires:

Record signal_date, combo_id, atoms-met dict, S&P close.
Record forward n-day-return for n in [5, 10, 15, 20, 60, 120, 252].
Record max_drawdown_60d after signal.
Record time_to_next_-10%_pullback.

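Store in berg_signals(signal_date, combo_id, atoms_met_json, fwd_5d, fwd_10d, ..., fwd_252d, max_dd_60d, next_neg10pct_days). The last column is named next_neg10pct_days because a hyphen is not valid in an unquoted SQL identifier.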
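
The per-signal statistics could be computed along these lines. A minimal sketch under stated assumptions: returns are close-to-close from the signal bar, max_dd_60d is taken as the worst close over the next 60 bars relative to the signal close (the plan does not pin down the definition), and the helper name forward_stats is hypothetical.

```python
import pandas as pd

HORIZONS = [5, 10, 15, 20, 60, 120, 252]  # horizons listed in the plan
DD_WINDOW = 60

def forward_stats(close: pd.Series, signal_pos: int) -> dict:
    """Forward returns and worst closing drawdown after the bar at `signal_pos`."""
    entry = close.iloc[signal_pos]
    stats = {}
    for n in HORIZONS:
        j = signal_pos + n
        # None when the horizon has not elapsed yet (recent signals).
        stats[f"fwd_{n}d"] = float(close.iloc[j] / entry - 1.0) if j < len(close) else None
    window = close.iloc[signal_pos + 1 : signal_pos + 1 + DD_WINDOW]
    stats["max_dd_60d"] = float(min(window.min() / entry - 1.0, 0.0)) if len(window) else None
    return stats

# Toy price path: dips 2% after the signal, then rallies.
close = pd.Series([100.0, 98.0, 105.0, 110.0, 108.0, 112.0, 120.0])
stats = forward_stats(close, signal_pos=0)
print(stats)
```

Returning None for horizons that have not elapsed lets the daily runner store NULLs now and fill them in on later runs.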

Validation against Berg's cited dates (gate G2)

Berg explicitly named these precedent dates per combo:

C1: 1957-10-21, 1962-06-26, 1976-09-22, 2018-12-24, 2025-04-08
C2: 1998, 2011, 2025-04-04
C3: 1998, 2018, 2020, 2025-04-07
C4: 1979, 1982, 1984, 2003×2, 2025-04-09
C5: 1974, 1998, 2002-10, 2009-03, 2011, 2025-04-30
C6: 1982-08-25, 1984-08-06, 1998-10-21, 2011-10-14, 2026-04-13
C7: 1982-08-23, 1998-10-21, 2026-04-13
C8: 2011-10-14, 2026-04-13
C9: 1987-10-04, 1997-05-05, 2026-04-17
C10: 1985-11-04, 1986-10-30, 2023-11-22, 2026-04-27
C11: 2014-11-21, 2020-05-29, 2020-06-05, 2026-05

Gate G2: For each combo, check that our atoms fire on at least 80% of Berg's named dates ±2 trading days. Where they don't, debug atom thresholds — Berg's exact thresholds may differ from his rounded verbal descriptions.
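
The tolerance check can be sketched as below. It counts the window in trading days, as the gate definition states, by working on positions in the trading-date index; a calendar-day window would give different results around weekends. The function name g2_hit_rate and the toy dates are illustrative assumptions.

```python
import pandas as pd

def g2_hit_rate(fired: pd.Series, precedents: list, tol: int = 2) -> float:
    """Share of precedent dates with a firing within +/- `tol` trading days.

    `fired` is a boolean Series indexed by trading date, sorted ascending.
    """
    index = fired.index
    hits = 0
    for date in pd.to_datetime(precedents):
        # Position of the first trading day on or after the precedent date.
        pos = index.searchsorted(date)
        lo, hi = max(pos - tol, 0), min(pos + tol, len(index) - 1)
        if fired.iloc[lo : hi + 1].any():
            hits += 1
    return hits / len(precedents)

idx = pd.bdate_range("2020-01-06", periods=10)  # two toy trading weeks
fired = pd.Series(False, index=idx)
fired.iloc[5] = True  # single firing on 2020-01-13
rate = g2_hit_rate(fired, ["2020-01-09", "2020-01-17"])
print(rate)
```

Here the first precedent is matched (the firing is two trading days later) and the second is not, so the toy hit rate is one of two.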

Phase 3 — daily scanner + dashboard (½ day)

Sentinel cron berg-signals-daily at 5:30 PM ET weekdays. Runs after fetch-prices-daily.
Output: row in berg_signals table per combo that fired, plus Telegram alert if ≥3 combos fire same day.
Dashboard panel /finance/berg-signals:

Today's fires (list of combo IDs, conditions met)
Historical precedent table per combo (Berg's named dates + ours)
Median forward-return curve overlay
Suggested stop (= signal close × (1 − max_historical_drawdown − 0.005))
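
The suggested-stop formula in the last item translates directly to code. A small sketch, assuming max_historical_drawdown is expressed as a positive fraction; the function name and the sample inputs are illustrative.

```python
def suggested_stop(signal_close: float, max_historical_drawdown: float,
                   buffer: float = 0.005) -> float:
    """Stop = signal close x (1 - max historical drawdown - buffer).

    `max_historical_drawdown` is a positive fraction (0.06 means a 6% drawdown).
    """
    if not 0.0 <= max_historical_drawdown < 1.0:
        raise ValueError("max_historical_drawdown must be a fraction in [0, 1)")
    return signal_close * (1.0 - max_historical_drawdown - buffer)

# Signal close of 5000 with a 6% worst historical drawdown: stop near 4675.
stop = suggested_stop(5000.0, 0.06)
print(round(stop, 2))
```

The extra 0.5% buffer places the stop just beyond the worst drawdown seen after past signals, so a repeat of the historical worst case does not trigger it.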

Phase 4 — retail rule backtest (½ day)

finance/scripts/backtest_berg_retail.py:

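def backtest_berg_retail(close, fired, trail=0.08):
    """Long-or-flat SPY. close/fired: Series indexed by date (fired = any combo fired)."""
    state, peak, entry = "FLAT", float("-inf"), None
    trades = []  # (entry_date, entry_price, exit_date, exit_price)
    for d in close.index:
        px = close.loc[d]
        if state == "FLAT":
            if fired.loc[d]:
                # buy SPY at close(d)
                state, entry, peak = "LONG", (d, px), px
        else:  # LONG
            peak = max(peak, px)
            if px <= peak * (1 - trail):  # 8% trailing stop: close <= peak * 0.92
                # sell SPY at close(d)
                trades.append((*entry, d, px))
                state = "FLAT"
    return trades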

Compare:

CAGR vs SPY buy-and-hold
Sharpe
Max drawdown
% time in market
# round trips per year

Gate G4: if CAGR ≥ 12% (vs SPY's ~10% over the same window) and max DD < SPY's, ship to dashboard.
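
The gate check reduces to two metrics on daily equity curves. A minimal sketch: the function names, the 252-day annualization, and the toy curves are assumptions, and Sharpe, time in market, and round trips are left out for brevity.

```python
import pandas as pd

def cagr(equity: pd.Series, periods_per_year: int = 252) -> float:
    """Compound annual growth rate of an equity curve with one point per period."""
    years = (len(equity) - 1) / periods_per_year
    return float((equity.iloc[-1] / equity.iloc[0]) ** (1.0 / years) - 1.0)

def max_drawdown(equity: pd.Series) -> float:
    """Largest peak-to-trough decline of the curve, as a negative fraction."""
    return float((equity / equity.cummax() - 1.0).min())

def passes_g4(strategy: pd.Series, benchmark: pd.Series,
              min_cagr: float = 0.12, periods_per_year: int = 252) -> bool:
    """Gate G4 as stated: CAGR >= 12% and a shallower max drawdown than the benchmark."""
    return (cagr(strategy, periods_per_year) >= min_cagr
            and max_drawdown(strategy) > max_drawdown(benchmark))

# Toy curves sampled three times per "year" to keep the arithmetic readable.
strategy = pd.Series([100.0, 120.0, 90.0, 130.0])
benchmark = pd.Series([100.0, 150.0, 75.0, 110.0])
ok = passes_g4(strategy, benchmark, periods_per_year=3)
print(cagr(strategy, 3), max_drawdown(strategy), ok)
```

Drawdowns are negative numbers here, so "shallower" means the strategy value is greater than the benchmark value.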

Premortem — what kills this project?

Imagine it's 2026-08. The Berg replicator is built. What went wrong?

Atoms don't fire on Berg's dates (Gate G2 fails). Berg's verbal descriptions are rounded; his real thresholds are tighter. Result: combo signals look nothing like his. Mitigation: relax thresholds ±10% in a grid search, then re-tighten to the values that catch ≥80% of his dates. If still failing, subscribe to Berg's retail product ($10/mo) and reverse-engineer thresholds from his published reports.

B5 (cap-weighted new lows) data is broken. Daily shares-outstanding from yfinance is unreliable for delisted historicals. Mitigation: CRSP would be authoritative but expensive. Quarterly EDGAR + forward-fill is OK. Validate B5 against Berg's published 1979/1980/1983 figures.

Combos fire too often. Histograms over our 2010-2026 sample have different tail thicknesses than Berg's 1957-2026 universe. Atoms calibrated to "rare" may be "common" in our window. Mitigation: require absolute thresholds, not percentile ranks. Validate on the 1990s if data permits.

Combos fire too rarely. Same data window problem the other way. Most combos signal 0-2 times. Mitigation: OK for backtest if we have Berg's named precedents. Live monitoring still useful.

Forward-return cherry-picking. Berg's claim of "120 days later, 100% up" across 5 historical instances is small-sample. With our combos we may see 5-10 instances → median forward-return is noise. Mitigation: report bootstrap CIs on forward returns. Don't claim signal until CI lower bound > 0.

Retail rule underperforms SPY (Gate G4 fails). Berg's 18.5% CAGR may rely on combos we can't reproduce. Mitigation: still ship the scanner; drop the retail-rule claim. The scanner has standalone value.

Data ingest fragility. NYSE breadth from Stooq is unofficial and may break. Mitigation: fall back to Polygon (paid) if Stooq breaks. Keep last 30 days cached in nyse_breadth_daily.

Drift over time. Berg's combos worked because he selected them post-hoc on the full history. Adding combos as we find them is fine; forward-validating against ≥1 new signal occurrence before trusting it is essential.

Most likely failure: premortem #1 (threshold mismatch). Plan for it upfront: keep atoms parameterized, calibrate to Berg's named dates as ground truth.
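
The bootstrap confidence interval proposed in the forward-return cherry-picking item could be computed as follows. This is a sketch using a plain percentile bootstrap, which is one of several possible methods; the function name, the replicate count, and the five toy returns are illustrative, not measured results.

```python
import numpy as np

def bootstrap_median_ci(returns, n_boot: int = 2000, alpha: float = 0.05, seed: int = 0):
    """Percentile-bootstrap confidence interval for the median of `returns`."""
    rng = np.random.default_rng(seed)
    data = np.asarray(returns, dtype=float)
    # Resample with replacement, one row per bootstrap replicate.
    samples = rng.choice(data, size=(n_boot, data.size), replace=True)
    medians = np.median(samples, axis=1)
    lower, upper = np.quantile(medians, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)

# Hypothetical forward returns for a five-signal combo (illustrative numbers only).
toy_returns = [0.04, 0.11, -0.02, 0.08, 0.15]
lower, upper = bootstrap_median_ci(toy_returns)
credible = lower > 0  # the plan's rule: no claim until the lower bound clears zero
print(lower, upper, credible)
```

With only five to ten signals the interval will be wide, which is the point: it makes the small-sample uncertainty explicit instead of reporting a bare median.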

Open questions for Rob

Scope. Build all 11 combos, or start with the 5 highest-conviction (C1, C2, C4, C6, C10)? I'd vote start with 5 — faster G2 validation.
B5 priority. Ship B5 in Phase 1 (slower, needs shares-out data) or defer to Phase 1.5? I'd vote ship in Phase 1 — it's the marquee insight.
Data subscription. Pay for Polygon breadth feed (~$30/mo) for V3/V4/B4/B5, or commit to Stooq + scraping? I'd vote Polygon — we already use it.
Retail-rule backtest as gate. If our retail rule clears Gate G4 (≥12% CAGR), is it eligible for live paper-trading in the strategy lab? Default yes unless you object.
AAII vs Market Vane substitute. Acceptable? AAII is free, weekly, ~similar.
Time budget. ~3 days of focused work end-to-end (Phase 0–4, per the effort table). OK to commit, or want to scope down to Phase 0–1 first and reassess?

Estimated effort

Phase	Work	Wall time	Risk
0	Data backfill + new feeds	½ day	low
1	`berg_atoms.py` + tests	1 day	low
2	`berg_combos.py` + backtest	½ day	high (G2 gate)
3	Cron + dashboard panel	½ day	low
4	Retail-rule backtest	½ day	medium
Total		~3 days

Hard exit conditions

Stop and re-plan with Rob if:

Gate G2 fails after 2 calibration passes (atoms don't reproduce Berg's dates).
B5 data pipeline can't produce trustworthy shares-out within Phase 0.
Phase 1 takes >2 days — means the atom list is creeping; cut scope.