Quant · ML · Automation · US Equities

An end-to-end algorithmic trading system.

From raw market data to live broker orders — a self-built platform that finds breakout setups across the US market, filters them with machine learning, sizes each trade by expected value, and executes through Interactive Brokers, fully automated and risk-first.

10-step daily pipeline
5 EV-based risk tiers
62.9% (long) / 65.7% (short) out-of-sample model accuracy
6 integrated layers
The architecture

Six layers, one source of truth

Every layer reads from and writes to a single Postgres / Supabase database — no component hands trade data to another in memory, so any piece can fail, be skipped, or be re-run on its own.

01
Market data
Daily OHLCV + indices pulled from Polygon into the database.
02
Features
SQL jobs build momentum, volatility, volume & regime features.
03
Scanner
Detects rally → consolidation → breakout geometry.
04
ML scoring
Side-aware model adds confidence, expected value & a risk tier.
05
Execution
Auto-trader places IBKR brackets; breakeven & day-2 trailing.
06
Observe
React journal, Streamlit dashboards, email & an options overlay.
The edge

Find the breakout. Let ML decide the bet.

The strategy hunts one repeatable shape, then turns a chart pattern into a probabilistic, risk-sized decision.

Rally → Consolidation → Breakout

A momentum-scored rally, a tight base that rests at least 1.5× as long, then a confirmed breakout sets the entry, stop and target straight from the candle.

Expected value → risk tier
EV = (confidence × R:R) − ((1 − confidence) × 1R)
Tier: share of account risked per trade
ULTRA PREMIUM: 2.5%
EXCELLENT: 2.0%
VERY GOOD: 1.5%
GOOD: 1.0%
DECENT: 0.5%

Higher expected value earns a bigger slice of account risk. Negative-EV setups are skipped entirely — size is computed, never guessed.
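The EV formula and tier sizing above can be sketched in Python. The tier percentages come from the list above; the EV cut-offs that select each tier are not stated on this page, so the values in TIER_THRESHOLDS are placeholders, and the function names are illustrative rather than the system's own.

```python
from typing import Optional

# Account-risk percentage per tier, as listed above.
TIER_RISK_PCT = {
    "ULTRA PREMIUM": 2.5,
    "EXCELLENT": 2.0,
    "VERY GOOD": 1.5,
    "GOOD": 1.0,
    "DECENT": 0.5,
}

# ASSUMPTION: hypothetical EV cut-offs (in R), highest first.
# The real thresholds are not given in the text.
TIER_THRESHOLDS = [
    (1.00, "ULTRA PREMIUM"),
    (0.75, "EXCELLENT"),
    (0.50, "VERY GOOD"),
    (0.25, "GOOD"),
    (0.00, "DECENT"),
]


def expected_value(confidence: float, reward_risk: float) -> float:
    """EV in R: P(win) * R:R minus P(loss) * 1R."""
    return confidence * reward_risk - (1.0 - confidence) * 1.0


def risk_tier(ev: float) -> Optional[str]:
    """Return the tier name, or None for a negative-EV setup (skipped)."""
    if ev < 0:
        return None
    for threshold, name in TIER_THRESHOLDS:
        if ev >= threshold:
            return name
    return None


def shares_for(account: float, tier: str, entry: float, stop: float) -> int:
    """Whole shares such that a stop-out loses the tier's share of the account."""
    dollars_at_risk = account * TIER_RISK_PCT[tier] / 100.0
    return int(dollars_at_risk // abs(entry - stop))
```

For example, a signal with confidence 0.6 and R:R 1.5 has an EV of about 0.5R; which tier that lands in depends entirely on the real cut-offs.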

Anatomy of a trade

Where the stop and targets come from

Nothing is hand-placed. The breakout sets the entry; the stop and every target are derived from the pattern's own geometry — the size of the move and the stock's recent volatility — so each signal arrives with a known risk and reward before a cent is committed. The short side is the mirror image: a selloff, a base, then a breakdown.

Long — breakout
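[Diagram, long breakout. Price levels from top to bottom: Target +1.0×MH, +0.75×MH, Target +0.5×MH, Entry (breakout), Stop (−2×ATR). Labelled regions: rally, consolidation, move height (MH), risk, reward.]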
Short — breakdown
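[Diagram, short breakdown. Price levels from top to bottom: Stop (+2×ATR), Entry (breakdown), Target −0.5×MH, −0.75×MH, Target −1.0×MH. Labelled regions: selloff, consolidation, move depth (MH), risk, reward.]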
The stop — volatility-aware
Set at entry − 2 × ATR, where ATR is measured over the whole rally + consolidation window (capped at 50 days), not a fixed 14. A quiet base risks less; a wild one risks more. Falls back to the consolidation low; shorts mirror it above.
The targets — projected from the move
move size = rally high − rally low (and its mirror for shorts). Targets sit at 50% / 75% / 100% of that move — above entry for longs, below for shorts. The default booked target is 50%, stretching to 200% when the breakout candle closes strong (past the 0.618 fib). All three partials are tracked for scaling out and the day-2 trail.
Risk : reward — fixed at entry
R:R = (target − entry) / (entry − stop). Because both ends come from the chart itself, every signal carries a known R:R up front — and that number feeds straight into the expected-value and position-size math above.
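As a rough illustration of the geometry described above, the sketch below derives the stop, the three targets and the R:R from an entry price, an ATR value and the prior move. It assumes the ATR has already been measured over the rally + consolidation window; the function and field names are hypothetical, and the consolidation-low fallback is left out.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Bracket:
    entry: float
    stop: float
    targets: Dict[float, float]  # fraction of the move -> target price
    reward_risk: float           # measured to the booked target


def build_bracket(entry: float, atr: float, move_high: float, move_low: float,
                  side: str = "long", booked: float = 0.5) -> Bracket:
    """Derive stop, targets and R:R from the pattern's own geometry."""
    direction = 1.0 if side == "long" else -1.0
    move = move_high - move_low              # size of the rally (or selloff)
    stop = entry - direction * 2.0 * atr     # 2 x ATR against the trade
    targets = {f: entry + direction * f * move for f in (0.5, 0.75, 1.0)}
    risk = abs(entry - stop)
    reward = abs(targets[booked] - entry)
    return Bracket(entry, stop, targets, reward / risk)


long_setup = build_bracket(entry=100.0, atr=2.0, move_high=98.0, move_low=78.0)
short_setup = build_bracket(entry=50.0, atr=1.0, move_high=70.0, move_low=60.0,
                            side="short")
```

Here the long setup risks 4 points (stop at 96) for a 10-point reward to the 50% target, an R:R of 2.5; the short setup mirrors it with the stop above entry.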
The opposing book · ML2

When the model bets against the breakout

A rejected signal isn't always a non-event. When the model is confident a breakout will fail, the system takes the other side — a mean-reversion trade. It waits for the failing breakout to run most of the way to its target, then fades it back.

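[Diagram, opposing short against a failing long breakout, time running left to right. Levels: primary target (breakout goal), primary entry, opposing stop at the 50% line, opposing short on the reversal alert, opposing target toward the original stop. Annotations: failing breakout, 2-bar reversal alert, fade.]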

Take a long breakout the model expects to fail. Rather than buy it, the opposing book waits for price to climb 75% of the way to the primary's target — where the breakout looks “successful” — and shorts it there, betting on the snap-back.

  • Gated by a 2-bar reversal alert — no alert, no opposing trade
  • Stop sits at the 50% (target-50) line; the take-profit targets the original signal's stop — a mean-reversion bracket
  • Dual entry: a reference 75%-to-target limit plus the alert level; one entry is live at a time
  • Each leg sized to 1R off its own fill
Long primary → opposing short: 63.8% opposing win rate · +0.23R per trade · 340 trades
Short primary → opposing long: 73.9% opposing win rate · +0.39R per trade · 226 trades

Real results from the system's own backtest_opposing_entry.py, replayed on 1-hour bars over the held-out test set, on the flipped population (primaries the model scored below 0.50). The backtest models the original-entry + alert legs (its current product); live execution layers the 75% leg on top. Expectancy in R, before costs.

How the models are built

Training methodology

Every closed trade becomes a labelled example. Two models, two objectives, one discipline: nothing is ever scored on data it trained on.

01
Label
Each closed historical trade is tagged win or loss based on whether price hit the target before the stop. Labels come straight from the scanner's own bracket geometry — same rules as live trading.
02
Split
A deterministic train/test split keeps held-out rows completely untouched during training. Long: 2,098 train / 700 test. Short: 1,491 train / 498 test.
03
Tune → train → evaluate
Optuna searches hyperparameters on the train set only. The best trial is retrained on the full train set. Metrics and calibration curves are computed once, on the test set.
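The page does not say how the deterministic split is produced. One minimal way to get a reproducible hold-out is a seeded shuffle over sorted row IDs; the 25% test fraction and the seed below are assumptions, chosen only because a 25% hold-out rounded up reproduces the row counts quoted above.

```python
import math
import random
from typing import Iterable, List, Tuple


def deterministic_split(row_ids: Iterable[int], test_fraction: float = 0.25,
                        seed: int = 42) -> Tuple[List[int], List[int]]:
    """Return (train_ids, test_ids); the same inputs always give the same split."""
    ids = sorted(row_ids)                 # stable order regardless of input order
    random.Random(seed).shuffle(ids)      # seeded, so the shuffle is reproducible
    n_test = math.ceil(test_fraction * len(ids))
    return ids[n_test:], ids[:n_test]


long_train, long_test = deterministic_split(range(2798))
short_train, short_test = deterministic_split(range(1989))
```

With 2,798 long rows this yields 2,098 train / 700 test, and with 1,989 short rows 1,491 / 498. A split keyed on trade date or a hash of the trade ID would be just as deterministic; which one the system uses is not stated.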
Production model config — from the pickles
LONG side model · exp 155 · promoted May 2026
2,098 train rows · 700 test rows · 44 features
Optuna objective: precision_weighted (CV ≈ 0.601)
Tuned hyperparameters
n_estimators: 702
max_depth: 9
learning_rate: 0.103
subsample: 0.966
colsample_bytree: 0.879
reg_α: 4.22
reg_λ: 1.27
min_child_weight: 2
γ: 4.09
scale_pos_weight: 1.07
SHORT side model · exp 147 · promoted Apr 2026
1,491 train rows · 498 test rows · 20 features
Optuna objective: neg_brier_score (CV Brier ≈ −0.222)
Tuned hyperparameters
n_estimators: 405
max_depth: 6
learning_rate: 0.124
subsample: 0.870
colsample_bytree: 0.957
reg_α: 2.21
reg_λ: 2.61
min_child_weight: 2
γ: 3.01
scale_pos_weight: 1.47
Key decisions
Gradient-boosted trees
Both models are XGBoost — ensembles of decision trees, each correcting the last. The long model settled on 702 trees at depth 9; the short on 405 at depth 6.
Optuna hyperparameter search
A Bayesian (TPE) sampler searches tree count, depth, learning rate and regularisation, scored by stratified K-fold cross-validation. Seeds are fixed, so every run reproduces.
A different goal per side
The long model is tuned for precision-weighted score — be right when it fires (CV ≈ 0.60). The short is tuned for Brier score — probability calibration (CV Brier ≈ 0.22). Different risks, different objectives.
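To make the two objectives concrete, here is a dependency-free sketch of what each one measures. The objective names match scikit-learn's scorer strings (precision_weighted, neg_brier_score); the functions below are plain-Python re-implementations for illustration, not the training code itself.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared gap between predicted win probability and the 0/1 outcome.

    Lower is better; a neg_brier_score objective maximises the negative of this.
    """
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def precision_weighted(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Per-class precision, averaged with weights equal to each class's support."""
    total = len(y_true)
    score = 0.0
    for cls in set(y_true):
        predicted = sum(1 for p in y_pred if p == cls)
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == p == cls)
        support = sum(1 for t in y_true if t == cls)
        precision = correct / predicted if predicted else 0.0
        score += precision * support / total
    return score


brier = brier_score([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4])
weighted = precision_weighted([1, 1, 0, 0], [1, 0, 0, 0])
```

The Brier score punishes a confident wrong probability far more than a cautious one, which is why it suits a side where calibration matters; weighted precision only looks at the hard win/loss call.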
Class-imbalance reweighting
Wins are rarer than losses, so scale_pos_weight = losses ÷ wins lifts the minority class — the model learns to find winners instead of just predicting the common outcome.
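The reweighting ratio is simple enough to show directly. This is a minimal sketch assuming labels are encoded as 1 for a win and 0 for a loss; the helper name is hypothetical.

```python
from typing import Sequence


def scale_pos_weight(labels: Sequence[int]) -> float:
    """Ratio of losses (label 0) to wins (label 1), used to up-weight the wins."""
    wins = sum(1 for y in labels if y == 1)
    losses = len(labels) - wins
    if wins == 0:
        raise ValueError("no winning trades in the labels")
    return losses / wins


# 40 wins and 60 losses: each win counts 1.5x as much as a loss.
example_weight = scale_pos_weight([1] * 40 + [0] * 60)
```

The result would typically be passed to XGBoost as its scale_pos_weight parameter, whose documented typical value is the same negatives-to-positives ratio.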
Recursive feature elimination
Training starts with every candidate, drops the weakest by gain, re-tunes, and repeats — landing on 44 features for long and a leaner 20 for short.
Untouched held-out test
A deterministic split keeps test rows out of training entirely: 2,098 train / 700 test (long), 1,491 / 498 (short). Every metric and calibration curve here is measured only on those unseen rows.
Experiment tracking & promotion

Every training run is logged as an experiment (long model reached exp 155 across ~27 tracked runs; short reached exp 147 across ~20). A model is only promoted to production after it clears the held-out test set — the same deterministic split every time. Once live, the active model IDs are written to ACTIVE_MODELS.json so the pipeline and the auto-trader always serve the current version without a code change.

Live in production

Two side-aware models, measured out of sample

Long and short setups are scored by separate models — promoted only after clearing a held-out test set the model never saw in training. These are the production evaluation metrics of the two models trading today: not curve-fit training scores, but honest out-of-sample performance on unseen trades.

LONG side model · promoted May 2026
LONG_SIDE_MODEL · exp 155
Accuracy: 62.9%
ROC-AUC: 0.65
Precision: 61.6%
Recall: 60.5%
F1: 0.61
SHORT side model · promoted Apr 2026
SHORT_SIDE_MODEL · exp 147
Accuracy: 65.7%
ROC-AUC: 0.71
Precision: 56.2%
Recall: 68.2%
F1: 0.62

The short model leans toward recall (0.682): it casts a wider net to catch more breakdowns, while the long model is balanced near 0.61 across precision, recall and F1. ROC-AUC of 0.65 (long) and 0.71 (short), both above the 0.5 chance level, indicates that each model tends to rank winning setups ahead of losing ones; that ranking is exactly what the EV and risk-tier layer above turns into position size.

What each model sees

Different inputs for long and short

The two models don't share a feature set. The long model reads 44 SQL-built features, adding VIX moves, weekly sector-relative returns, market-cap ratios and finer volume detail. The short model trades on a leaner 20: the 19 features common to both, plus its own pattern_score input.

Legend: shared (19) · long only (25)
Pattern geometry (10): basic_tightness, consol_duration, rally_pct, rally_duration, rally_intensity, consol_end_close_pct, consol_rally_duration_ratio, consol_bearish_ratio, candles_close_above_entry, momentum_second_half
Relative strength & market (13): stock_vs_sector_rally_strength, stock_vs_qqq_rally_strength, sector_vs_market_strength, qqq_consol_vs_rally_pct, sector_vs_spy_weekly_ret_w1, sector_vs_spy_weekly_ret_w2, sector_vs_spy_weekly_ret_w3, sector_vs_spy_weekly_ret_w4, sector_etf_weekly_oc_atr_w2, sector_etf_weekly_oc_atr_w3, sector_etf_weekly_oc_atr_w4, market_cap_to_sp500, market_cap_to_industry_index
Volatility & VIX (6): atr_pct_14, volatility_20d, breakout_day_atr_multiple, dist_from_sma_20_atr_norm, vix_change_1d, vix_change_5d
Volume (5): rally_avg_volume_to_breakout_ratio, consol_volume_pct, rally_volume_pct, consol_avg_volume_to_breakout_ratio, breakout_range_to_consol_avg_range_ratio
Momentum & trend (6): trend_strength_20, dist_from_sma_5, fib_retracement_zone, rsi_14, stoch_14, dist_from_sma_20
Level interaction & seasonality (4): old_entry_touches_ratio, stop_touches_ratio, entry_month, old_entry_touches
Feature importance

How much each feature contributes

Gain-based importance shows how much each feature improved the model's splits. The two models could hardly be more different: the long model spreads its conviction across dozens of weak signals (nothing tops ~4%), while the short model leans hard on a few, with the size of the prior rally alone driving 11.7%.

Short model: all 20 features, concentrated in a few drivers.
1. rally_pct: 11.7%
2. stop_touches_ratio: 7.8%
3. basic_tightness: 6.1%
4. entry_month: 6%
5. rally_duration: 5.9%
6. stock_vs_qqq_rally_strength: 5.3%
7. sector_vs_market_strength: 5.1%
8. fib_retracement_zone: 4.9%
9. qqq_consol_vs_rally_pct: 4.9%
10. trend_strength_20: 4.6%
11. old_entry_touches_ratio: 4.5%
12. atr_pct_14: 4.5%
13. stock_vs_sector_rally_strength: 4.4%
14. consol_end_close_pct: 4.4%
15. rally_avg_volume_to_breakout_ratio: 4.2%
16. dist_from_sma_5: 4.1%
17. rally_intensity: 4%
18. volatility_20d: 3.9%
19. pattern_score: 3.8%
20. consol_duration: 0%

Gain-based importance from the production model (20 features, bars on a shared 0–12% scale). consol_duration lands at 0% — basic_tightness already encodes the base length, so the tree never needs it.

Calibration

Does a confidence of 0.6 really win 60%?

A score is only useful if it means what it says. These reliability curves plot predicted confidence against the win rate actually realised, across the full confidence range on each model's held-out test set — every point on the dashed line is perfectly calibrated. Both models track the diagonal closely: a 0.6 really does mean roughly 60%.

[Reliability curve, long model (exp 155), 700 held-out trades. x-axis: predicted confidence (0% to 100%); y-axis: actual win rate (0% to 100%). Labelled points: 36.3%, 37.3%, 57.6%, 67.9%.]
[Reliability curve, short model (exp 147), 498 held-out trades. Same axes.]

Win rate is measured on the production models' own held-out test sets (700 unseen trades for the long model, exp 155; 498 for the short, exp 147), bucketed by predicted confidence. Points are sized by sample count, so the sparse tails (very low or very high confidence, where few setups land) carry visibly less weight than the dense middle. Live trading only acts on the calibrated ≥ 0.50 range.
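A reliability curve of this kind can be computed with a few lines of bucketing. The sketch below assumes equal-width confidence bins; the number of bins used for the charts is not stated, so n_bins is a free parameter here, and the function name is illustrative.

```python
from typing import List, Sequence, Tuple


def reliability_curve(y_true: Sequence[int], y_prob: Sequence[float],
                      n_bins: int = 10) -> List[Tuple[float, float, int]]:
    """Return (mean predicted confidence, actual win rate, count) per non-empty bin."""
    bins: List[List[Tuple[int, float]]] = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)   # p == 1.0 falls in the last bin
        bins[idx].append((y, p))
    curve = []
    for members in bins:
        if not members:
            continue                              # skip empty buckets
        count = len(members)
        mean_conf = sum(p for _, p in members) / count
        win_rate = sum(y for y, _ in members) / count
        curve.append((mean_conf, win_rate, count))
    return curve


curve = reliability_curve([1, 0, 1, 1], [0.62, 0.65, 0.68, 0.15])
```

A well-calibrated model produces points where the first two values are close; the count is what the charts encode as point size.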

The bottom line

Is it actually profitable?

Results are in R: multiples of the risk taken on each trade (R = |entry − stop|). A stop-out is −1R; a winner pays its reward-to-risk. On its own the breakout scanner has a negative expectancy; the machine-learning filter is what turns it positive. Every figure is from each model's held-out test set (trades it never saw), equal-weighted, before costs.

ML confidence filter · production gate
Long side · exp 155
Scanner only: win rate 48.1% · expectancy −0.04R · net (sum of R) −27.4R · 700 trades
Scanner + ML ≥0.50: win rate 61.6% · expectancy +0.12R · net (sum of R) +39.5R · 331 trades
Change: win rate +13.5 pts · expectancy negative → positive
Short side · exp 147
Scanner only: win rate 40.4% · expectancy −0.12R · net (sum of R) −60.6R · 498 trades
Scanner + ML ≥0.50: win rate 56.1% · expectancy +0.14R · net (sum of R) +33R · 244 trades
Change: win rate +15.7 pts · expectancy negative → positive

Raising the bar trades quantity for quality: win rate and per-trade expectancy climb, but fewer signals clear the gate — so total R can actually fall as the book thins out. Production runs the ≥0.50 gate to keep enough trades working while staying clearly positive.
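The win rate, expectancy and net R figures above all follow from the same bookkeeping: a loss counts as −1R, a win counts as its reward-to-risk, and the gate simply drops trades below the confidence threshold. A minimal sketch, with a hypothetical tuple layout for each trade:

```python
from typing import Dict, Iterable, Optional, Tuple

Trade = Tuple[float, bool, float]  # (model confidence, won?, reward-to-risk)


def summarize(trades: Iterable[Trade],
              min_confidence: float = 0.0) -> Optional[Dict[str, float]]:
    """Equal-weighted results in R for trades at or above the confidence gate."""
    results = [rr if won else -1.0
               for conf, won, rr in trades if conf >= min_confidence]
    if not results:
        return None
    n = len(results)
    net = sum(results)
    return {
        "trades": n,
        "win_rate": sum(1 for r in results if r > 0) / n,
        "expectancy_r": net / n,   # average R per trade
        "net_r": net,              # sum of R
    }


sample = [(0.7, True, 1.5), (0.6, False, 1.2), (0.4, False, 1.0), (0.3, True, 1.0)]
scanner_only = summarize(sample)
gated = summarize(sample, min_confidence=0.50)
```

In this toy sample the gate halves the trade count and doubles the expectancy while net R stays the same, a small-scale version of the quantity-for-quality trade described above.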

The current filter

The control above is the actual production lever: signals must score at or above 0.50 confidence to be traded — that single gate is the entire difference between the “scanner only” and “scanner + ML” columns. Live, the same score then feeds the regime filter and EV-based risk tiers that size each position.

The flipped trades

The filter rejects ~52% of raw signals, but the lowest-confidence ones aren't discarded. When the model is confident a setup will fail, the system flips to the contrarian side (the opposing book). Backtested over the held-out set, those flips earn +0.23R per trade (opposing short) and +0.39R per trade (opposing long), a second edge drawn from the model's negative conviction. Full mechanics are in the opposing book section above.

Held-out test sets: 700 trades (long, exp 155), 498 (short, exp 147). Returns in R-multiples (loss = −1R, win = its reward-to-risk), booked at the scanner's primary 50% target (up to 200% on strong-breakout “adjusted” entries) — the 75% and 100% partials are tracked but not counted here. Equal-weighted, before commissions and slippage. Test-set performance is not a guarantee of live results.

Position management

How a trade scales in and out

Every position starts at a defined 1R and is managed from there — risk comes off fast, size is added only on confirmation, and the exit is one trailed bracket, never a discretionary guess.

01
Size on entry
Position size comes from the EV risk tier; the distance from entry to the stop is exactly 1R.
02
+0.5R → breakeven
Once the trade is half a unit of risk onside, the stop jumps to entry — the loss tail is cut early.
03
Day-2 add-on
If the move holds into the third session, a second entry scales the position up — the only place size is added.
04
Trail the stop
The stop ratchets behind price as the trade works, locking in more of the move on every favorable day.
05
Exit as one bracket
Closed at the take-profit or the trailed stop, whichever hits first. Expired longs are auto-closed by the pipeline.
Same playbook, different knobs
Setting | Primary (breakout) | Opposing (fade)
Initial stop | entry − 2 × ATR | the 50% (target-50) line
Add-on price | Day+2 close | signal entry price
Resize toward | 1.5 × original $ risk | 1.0 × original $ risk
Trail stop to | low/high of 2 pre-entry bars | ⅓ of the original risk
Take-profit | 100% target longs · 75% shorts (→1.5×R after add) | toward the original stop (1×R)

One honest detail: the book scales in on strength, but it exits as a single managed bracket — a trailed stop and one take-profit — not in tranches. The 75% and 100% targets are tracked for context and feed the Day-2 resize, but the live exit is one order. Rules from move_stops_to_breakeven.py and day2_stop_trail.py.
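The +0.5R breakeven rule from step 02 can be sketched as a pure function. This is not the code in move_stops_to_breakeven.py; the signature is hypothetical, and the guard against loosening an already-trailed stop is an added assumption.

```python
def breakeven_stop(entry: float, initial_stop: float, current_stop: float,
                   last_price: float, trigger_r: float = 0.5) -> float:
    """Return the stop to use now.

    Once price is trigger_r (default 0.5R) onside, the stop moves to entry.
    1R is the distance from entry to the initial stop.
    """
    one_r = abs(entry - initial_stop)
    is_long = initial_stop < entry
    move = last_price - entry if is_long else entry - last_price
    if move / one_r < trigger_r:
        return current_stop
    # ASSUMPTION: never loosen a stop that has already been trailed past entry.
    return max(current_stop, entry) if is_long else min(current_stop, entry)
```

For a long entered at 100 with a stop at 96, the stop stays at 96 until price reaches 102 (+0.5R), then jumps to 100.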

Daily automation

One command runs the whole day

A single orchestrator walks the market from data to executed trades: logged, re-runnable, and human-approved by default.

Engineering

What makes it hold up

Training–serving parity
Live scoring reuses the exact feature calculator from training, so the model sees identical inputs in production — no silent drift.
Idempotent, fault-tolerant pipeline
Ten steps, re-runnable from any point. Broker failures are non-fatal and logged; the day never half-breaks.
Risk decided first
Position size is a pure function of expected value and stop distance — caps, a margin-cushion gate, and tiers, never gut feel.
Safety by default
Dry-run and human approval are the defaults; a kill-switch file halts trading instantly. Unattended mode is explicit.
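One way to express these defaults is a single gate that every order path consults before touching the broker. The sketch below is an assumption about the shape of such a check; the file name, flag names and mode strings are hypothetical.

```python
from pathlib import Path


def trading_mode(kill_switch: Path, dry_run: bool = True,
                 unattended: bool = False, approved: bool = False) -> str:
    """Decide what the auto-trader may do on this run.

    Defaults are the safe ones: dry-run on, unattended off, nothing approved.
    """
    if kill_switch.exists():
        return "halted"            # the kill-switch file overrides everything
    if dry_run:
        return "dry-run"           # plan orders, send nothing
    if not unattended and not approved:
        return "needs-approval"    # wait for a human to confirm
    return "live"


# Hypothetical location; the real kill-switch path is not given in the text.
mode = trading_mode(Path("KILL_SWITCH_DOES_NOT_EXIST"))
```

Checking the file on every call, rather than once at start-up, is what lets creating it halt trading immediately.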
Dual-model design
High-confidence setups trade directly; low-confidence ones flip into a contrarian mean-reversion play with a dual entry.
Defined-risk options overlay
The same signals can be expressed as IBKR debit spreads, sized so the net debit is the entire risk.
What's next

Where it goes from here

The stock book is live; the next frontier is expressing the same signals as defined-risk options, then tightening the loop between live results and the models.

In paper · forward-testing

Defined-risk options overlay

The same ML signals can be expressed as IBKR debit spreads — buy a call near the 50% target, sell one near the 100% target. The net debit paid is the entire risk: the position can't lose more than it cost, by construction.

  • Long ~ target 50% · short ~ target 100% (vertical debit spread)
  • Net debit = maximum loss — risk capped up front
  • Auto-planned from the live IBKR option chain, sized ~1% of account
  • One combo order in; GTC conditional close at the side's target — long at 100%, short at 75%
  • Running in IBKR paper now — same kill-switch + approval safety as the stock book
Debit-spread payoff
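[Payoff diagram at expiry. x-axis: underlying at expiry; y-axis: profit and loss around $0. Marked: long strike ~ T50, short strike ~ T100, max loss = net debit, breakeven, max gain.]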
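The payoff of a call debit spread at expiry follows directly from the two strikes and the net debit. The sketch below works per share and ignores commissions; the strikes and debit in the example are made-up numbers, and the function names are illustrative.

```python
from typing import Dict


def call_debit_spread_pnl(underlying: float, long_strike: float,
                          short_strike: float, net_debit: float) -> float:
    """Profit or loss per share at expiry: long call minus short call minus debit."""
    long_leg = max(underlying - long_strike, 0.0)
    short_leg = max(underlying - short_strike, 0.0)
    return long_leg - short_leg - net_debit


def spread_summary(long_strike: float, short_strike: float,
                   net_debit: float) -> Dict[str, float]:
    """Key levels of the spread, per share."""
    width = short_strike - long_strike
    return {
        "max_loss": net_debit,               # the debit paid is the entire risk
        "max_gain": width - net_debit,       # reached at or above the short strike
        "breakeven": long_strike + net_debit,
    }


# Hypothetical spread: buy the 105 call, sell the 110 call, pay 2.00 net.
levels = spread_summary(105.0, 110.0, 2.0)
```

Below 105 the spread loses the 2.00 debit and no more; at 107 it breaks even; at 110 or higher it earns the 3.00 maximum.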
Paper → live promotion
The options book trades IBKR paper today. It graduates to real capital only after a clean forward-test — the same discipline the stock models passed before going live.
Continuous retraining & drift watch
Live calibration is tracked against the held-out curve; a drift monitor flags when the market regime shifts, triggering a retrain-and-repromote through the same experiment pipeline.
Modelling real costs
Today's backtest is gross. Next is folding commissions and slippage into the EV math, so position sizing reflects net edge rather than gross — especially important for multi-leg option fills.
Built with

The stack

Every layer of the system was built from scratch, from SQL feature jobs to live IBKR bracket orders.

ML & modelling: Python, XGBoost 3.1, Optuna 4.6 (TPE), scikit-learn, scipy, NumPy, pandas, pickle
Data & database: PostgreSQL, Supabase, SQLAlchemy, psycopg2, SQL feature jobs, Polygon API, yfinance
Broker & execution: Interactive Brokers, ib_insync, IBKR Flex reports, IBKR paper trading, Bracket orders, BAG combo orders
Automation & ops: Windows Task Scheduler, GitHub Actions, PowerShell, smtplib + Jinja2, openpyxl, pyyaml
Operator UI: Streamlit, FastAPI + uvicorn, React + Vite, Recharts, Plotly, Altair, Matplotlib
Portfolio site: Next.js 14, Tailwind CSS, Framer Motion