AlphaTRADER Academy
Backtesting Wyckoff
Wyckoff is interpretive. Backtesting needs mechanical rules. To verify your edge with data, you have to translate subjective Spring/UTAD/LPS recognition into deterministic conditions a script can evaluate. This lesson shows how — and where the translation introduces approximation error.
"In God we trust. All others must bring data." — W. Edwards Deming
The Subjectivity Problem
Read 5 Wyckoff books and you'll find 5 different definitions of "Spring". "Price wicks below support and immediately reverses on absorption" — try coding that. What's "support"? Last swing low? Lowest of last N bars? "Immediately" = same bar? Next bar? "Absorption" = high volume? Low spread on heavy volume?
Every interpretation is defensible. Every interpretation produces different trades. Backtesting forces you to commit to ONE specific mechanical version. The discipline of writing the rules clarifies your own thinking — even if you never run the backtest.
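As an illustration of what "committing to one version" looks like, here is a minimal sketch of a single mechanical Spring rule. Every choice in it is an assumption made for the example, not a canonical Wyckoff definition: support is the lowest low of the previous `lookback` bars, "immediately" means the same bar closes back above support, and absorption means volume of at least `vol_mult` times the average volume of those bars. The names `Bar`, `is_spring`, `lookback` and `vol_mult` are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Bar:
    open: float
    high: float
    low: float
    close: float
    volume: float


def is_spring(bars, i, lookback=20, vol_mult=1.5):
    """Return True if bar i satisfies one specific, assumed Spring rule.

    Assumed choices (each one is debatable):
      support       = lowest low of the previous `lookback` bars
      "immediately" = the same bar closes back above support
      absorption    = volume >= vol_mult * mean volume of those bars
    """
    if i < lookback:
        return False  # not enough history to define support
    window = bars[i - lookback:i]
    support = min(b.low for b in window)
    avg_vol = mean(b.volume for b in window)
    bar = bars[i]
    wicked_below = bar.low < support
    closed_back = bar.close > support
    absorbed = bar.volume >= vol_mult * avg_vol
    return wicked_below and closed_back and absorbed
```

Swapping any one of the three conditions (for example, allowing the close back above support on the next bar) gives a different rule and a different trade list, which is exactly the approximation error the translation introduces.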
★ Wyckoff-to-Rules Translator
Interactive: choose a setup. The translator shows the subjective Wyckoff definition against a mechanical rule-based version, plus a pseudo-code condition and an acknowledgment of the tradeoff.
Walk-Forward Analysis — The Right Way
Naive backtest = optimize on history → fit perfectly → fail forward. Walk-forward simulates real trading conditions.
Naive Backtest (WRONG)
- 1. Take 5 years of data
- 2. Optimize 10 parameters until backtest is perfect
- 3. Get 80% win rate, 4R avg
- 4. Deploy live → 40% win rate, -0.5R avg
- 5. "Market changed" 😭
Result: curve-fit fantasy. Edge that never existed outside your spreadsheet.
Walk-Forward (CORRECT)
- 1. Split 5 years into IS (in-sample) + OOS (out-of-sample) windows
- 2. Optimize parameters on IS window 1 (e.g., 2020 H1)
- 3. Test those params on OOS window 1 (2020 H2)
- 4. Roll forward: optimize on 2020 H2, test on 2021 H1
- 5. Aggregate ALL OOS results — that's your real edge
Result: only OOS performance counts. Honest assessment of forward viability.
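The rolling procedure above can be sketched as an index generator plus a small driver. This is a sketch under assumptions: windows are expressed as bar counts rather than calendar halves, each step slides forward by the OOS length, and `optimize` and `evaluate` are hypothetical callables you would supply yourself.

```python
def walk_forward_windows(n_bars, is_len, oos_len):
    """Yield (is_start, is_end, oos_start, oos_end) tuples, ends exclusive.

    Each step slides forward by oos_len, so OOS segments are contiguous,
    never overlap, and each one starts where its IS window ends.
    """
    start = 0
    while start + is_len + oos_len <= n_bars:
        is_end = start + is_len
        yield (start, is_end, is_end, is_end + oos_len)
        start += oos_len


def walk_forward(data, is_len, oos_len, optimize, evaluate):
    """Optimize on each IS slice, then evaluate on the following OOS slice.

    optimize(is_data) -> params and evaluate(oos_data, params) -> list of
    trade results are caller-supplied. Only OOS results are collected.
    """
    oos_results = []
    for a, b, c, d in walk_forward_windows(len(data), is_len, oos_len):
        params = optimize(data[a:b])
        oos_results.extend(evaluate(data[c:d], params))
    return oos_results
```

With `is_len` equal to `oos_len`, this reproduces the half-year example: each OOS window becomes the next IS window. The parameters chosen on an IS slice never see the OOS slice they are judged on.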
Rule of thumb: If your IS Sharpe is 3.0 and OOS Sharpe is 0.5, you have a curve-fit. If IS is 2.0 and OOS is 1.5, you have an edge. The OOS number is the only one that matters.
Edge Degradation — When Does Your System Stop Working?
Every edge eventually dies. Detecting the decay early is how you survive the regime change with capital intact.
Rolling Performance Window
Track win rate + avg R + Sharpe across rolling 30-trade windows. Plot the trend.
Trigger: If 30-trade rolling Sharpe drops below 50% of its historical average for 3+ consecutive windows → degradation likely.
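A minimal sketch of this trigger, assuming trades are recorded as R-multiples, that "Sharpe" here means the per-trade mean divided by the sample standard deviation (not annualized), and that windows advance one trade at a time. The window length, `baseline`, `frac` and `consecutive` are parameters you set; the function names are hypothetical.

```python
from statistics import mean, stdev


def rolling_sharpe(r_multiples, window=30):
    """Per-trade Sharpe (mean / sample stdev of R) for each rolling window."""
    out = []
    for end in range(window, len(r_multiples) + 1):
        chunk = r_multiples[end - window:end]
        sd = stdev(chunk)
        # A window with zero variance has no defined Sharpe; report 0.0.
        out.append(mean(chunk) / sd if sd > 0 else 0.0)
    return out


def degradation_flag(sharpes, baseline, frac=0.5, consecutive=3):
    """True if `consecutive` windows in a row fall below frac * baseline."""
    run = 0
    for s in sharpes:
        run = run + 1 if s < frac * baseline else 0
        if run >= consecutive:
            return True
    return False
```

Because adjacent one-trade-step windows share most of their trades, three consecutive breaches are weaker evidence than three non-overlapping ones. Stepping by a full window is the stricter alternative.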
Per-Setup Decay
Some setups decay before others. Spring may still work while UTAD stops. Track per-setup win rate trends separately.
Trigger: Specific setup win rate drops >15% from historical baseline → drop that setup from playbook.
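One way to track this per setup is sketched below. It assumes the 15% drop is measured in percentage points (0.15 on a 0 to 1 scale), that a trade with positive R counts as a win, and that "recent" means the last `recent_n` trades of each setup. All names are hypothetical.

```python
def setups_to_drop(trades, baseline_win_rate, recent_n=30, max_drop=0.15):
    """Return setups whose recent win rate fell more than max_drop below baseline.

    trades: chronological list of (setup_name, r_multiple) tuples.
    baseline_win_rate: dict of setup_name -> historical win rate (0..1).
    """
    by_setup = {}
    for name, r in trades:
        by_setup.setdefault(name, []).append(r > 0)

    flagged = []
    for name, wins in by_setup.items():
        recent = wins[-recent_n:]
        if len(recent) < recent_n or name not in baseline_win_rate:
            continue  # too few trades or no baseline: no verdict yet
        recent_rate = sum(recent) / len(recent)
        if baseline_win_rate[name] - recent_rate > max_drop:
            flagged.append(name)
    return flagged
```

If you read the 15% as a relative drop instead, compare `recent_rate` against `baseline * (1 - max_drop)`. The two readings flag different setups, so pick one and write it down.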
Regime Change Indicators
Macro regime shifts often precede edge degradation. New Fed cycle, new volatility regime, new market structure (algorithm changes).
Trigger: VIX 20MA crosses major threshold (e.g., 15 → 25 sustained) → re-validate strategy.
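The sketch below shows one possible reading of "sustained": the 20-period moving average of VIX closes has stayed above the threshold for the last `sustain` readings. The `sustain` value of 10 is an assumption for illustration, as are the function names. The source only gives the 15 → 25 example.

```python
def sma(values, period):
    """Simple moving average; one value per full window."""
    return [sum(values[i - period:i]) / period
            for i in range(period, len(values) + 1)]


def regime_shift(vix_closes, period=20, threshold=25.0, sustain=10):
    """True if the last `sustain` moving-average readings are all above threshold."""
    ma = sma(vix_closes, period)
    if len(ma) < sustain:
        return False  # not enough history to call anything sustained
    return all(v > threshold for v in ma[-sustain:])
```

A `True` result is a prompt to re-run validation, not a signal to trade or stop trading by itself.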
The hardest call: distinguishing degradation from normal drawdown. Net-losing 30-trade stretches happen even at a 50%+ win rate. Use adaptive, ATR-style thresholds scaled to the system's own variability, not absolute ones. When in doubt, halve size; don't stop entirely.
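To put a number on how ordinary variance can look like degradation, the sketch below computes the exact probability that a block of trades ends net negative. It assumes independent trades with fixed win and loss sizes in R, which real trade sequences only approximate. The function name and parameters are hypothetical.

```python
from math import comb


def prob_net_losing_window(win_rate, n_trades=30, win_r=1.0, loss_r=1.0):
    """Exact binomial probability that n independent trades sum to a negative R total."""
    p = 0.0
    for wins in range(n_trades + 1):
        total_r = wins * win_r - (n_trades - wins) * loss_r
        if total_r < 0:
            p += (comb(n_trades, wins)
                  * win_rate ** wins
                  * (1 - win_rate) ** (n_trades - wins))
    return p
```

With a 50% win rate and symmetric 1R wins and losses, `prob_net_losing_window(0.5)` comes out at roughly 0.43, so a losing 30-trade block by itself says very little about whether the edge is gone.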
Reading Backtest Results — What Numbers Matter
Win rate alone is meaningless. These are the metrics pros judge a strategy by.
| Metric | What It Tells You | Healthy | Suspicious |
|---|---|---|---|
| Sharpe Ratio (annualized) | Risk-adjusted return — return per unit volatility | 1.5-2.5 | >3 (curve-fit) or <0.5 (no edge) |
| Profit Factor | Gross profit / Gross loss | 1.5-2.5 | >3 (rare/suspect) or <1.2 (thin) |
| Max Drawdown | Largest peak-to-trough loss | <25% (psychological survival zone) | >40% (hard to trade live) |
| Recovery Factor | Net profit / Max drawdown | >3 (resilient) | <1 (drawdown ate everything) |
| SQN (Van Tharp) | System Quality Number | 2.0-2.5 (excellent) | >3 (verify) or <1.5 (marginal) |
| Total Trades | Statistical significance of results | 200+ trades | <50 (not significant) |
| Max Consecutive Losses | Worst losing streak — psychological test | <7 | >12 (most can't survive emotionally) |
| Win Rate | Frequency of winners | Style-dependent (40-65% normal) | >75% (verify samples) or <30% |
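Most of the metrics in the table can be computed from a plain list of per-trade R-multiples. The sketch below is a minimal version under assumptions: drawdown is measured in R on the cumulative-R curve instead of in percent of equity, SQN uses the commonly quoted form sqrt(N) × mean(R) / stdev(R), and annualized Sharpe is left out because it needs time-stamped returns. The function name and dictionary keys are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def backtest_metrics(r_multiples):
    """Summary statistics from a chronological list of per-trade R-multiples."""
    wins = [r for r in r_multiples if r > 0]
    losses = [r for r in r_multiples if r < 0]
    gross_profit, gross_loss = sum(wins), -sum(losses)

    # Max drawdown on the cumulative-R equity curve (in R, not percent).
    equity = peak = max_dd = 0.0
    for r in r_multiples:
        equity += r
        peak = max(peak, equity)
        max_dd = max(max_dd, peak - equity)

    # Longest run of consecutive losing trades.
    streak = max_streak = 0
    for r in r_multiples:
        streak = streak + 1 if r < 0 else 0
        max_streak = max(max_streak, streak)

    n = len(r_multiples)
    net = sum(r_multiples)
    return {
        "total_trades": n,
        "win_rate": len(wins) / n,
        "profit_factor": gross_profit / gross_loss if gross_loss else float("inf"),
        "max_drawdown_r": max_dd,
        "recovery_factor": net / max_dd if max_dd else float("inf"),
        "sqn": sqrt(n) * mean(r_multiples) / stdev(r_multiples),
        "max_consecutive_losses": max_streak,
    }
```

Feed it the aggregated OOS trades, not the IS ones. It needs at least two trades with differing results so that the standard deviation is defined and non-zero.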
Common Backtesting Traps
Where backtests lie convincingly. Avoid these or your "proven system" is fiction.
Backtesting Tools — Pick Per Skill Level
No "best" tool — depends on your coding skills and rigor needs.
| Tool | Skill Required | Pros | Cons |
|---|---|---|---|
| TradingView Pine Script | Beginner to beginner+ | Visual, fast iteration, integrated charts | Limited stats, no walk-forward natively |
| Python + backtesting.py / vectorbt | Intermediate | Free, flexible, full stats, walk-forward easy | Slower iteration, charts only after an explicit plotting step |
| QuantConnect / Backtrader | Intermediate-Advanced | Industry-grade, real broker API, Monte Carlo | Steeper learning curve, paid tiers for premium data |
| Custom (numpy + pandas) | Advanced | Total control, scriptable, max performance | Reinvent every wheel, manual stats calc |
| Excel / Spreadsheet | Anyone | Free, transparent, manual control | Tedious for >100 trades, no automation |
Recommended for Wyckoff traders: TradingView Pine Script for initial idea exploration → Python (backtesting.py) for serious validation with walk-forward. Avoid spreadsheets for any sample above 50 trades.