Backtesting Methodology: How to Avoid Lying to Yourself

Most backtests are garbage. They use future data, ignore transaction costs, test on survivorship-biased data, and curve-fit parameters until the equity curve looks perfect. Then traders risk real money and wonder why the strategy fails. Here's how to backtest properly—and why most people won't.

🚨 The Brutal Truth

A backtest is a hypothesis, not a guarantee. Even a perfectly executed backtest can fail going forward because:

  • Markets change (strategies stop working when discovered)
  • Regimes shift (what worked in low interest rates fails in high rates)
  • You'll execute imperfectly (emotions, slippage, missed signals)

The goal of backtesting: Eliminate obviously broken strategies and increase confidence in robust ones. Not to predict the future.

The Seven Deadly Sins of Backtesting

Sin #1: Look-Ahead Bias (Using Future Data)

What it is: Using information that wouldn't have been available at the time of the trade.

Example (wrong):

# WRONG: Using today's close to calculate indicator that triggers today's trade
rsi = calculate_rsi(df['Close'])
df['Signal'] = (rsi < 30).astype(int)  # Buy when RSI < 30

# Problem: Today's RSI includes today's close, but you'd need to decide
# to buy BEFORE the close. You're peeking into the future!

Example (correct):

# CORRECT: Use yesterday's close to generate today's signal
rsi = calculate_rsi(df['Close']).shift(1)  # Lag by 1 day: today's row sees yesterday's RSI
df['Signal'] = (rsi < 30).astype(int)

# Now today's signal is based on data available yesterday

Other look-ahead bias traps:

  • Calculating volatility with today's data: ATR, standard deviation must use yesterday's close
  • Using earnings data: Earnings are released after market close—don't trade the same day unless using after-hours data
  • Rebalancing on wrong date: Index rebalances are announced weeks before execution—trade on execution date, not announcement
  • Dividends: Ex-dividend date vs payment date (price drops on ex-div, not payment)
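The snippets above assume a `calculate_rsi` helper. Here is a self-contained sketch using a simple moving-average RSI variant (note: classic Wilder RSI uses exponential smoothing, so values will differ slightly):

```python
import pandas as pd

def calculate_rsi(close: pd.Series, period: int = 14) -> pd.Series:
    """Simple moving-average RSI (one common variant; Wilder's uses smoothed averages)."""
    delta = close.diff()
    gain = delta.clip(lower=0).rolling(period).mean()
    loss = (-delta.clip(upper=0)).rolling(period).mean()
    rs = gain / loss
    return 100 - 100 / (1 + rs)

# Toy price series, just to exercise the lagging logic
close = pd.Series([100, 101, 99, 102, 103, 101, 104, 105, 103, 106,
                   107, 105, 108, 109, 107, 110, 111, 109, 112, 113])

rsi = calculate_rsi(close).shift(1)    # lag by one bar: today's signal uses yesterday's RSI
signal = (rsi < 30).astype(int)        # buy when yesterday's RSI was oversold
```

The `.shift(1)` is the entire look-ahead fix: every row of `signal` now depends only on data available at the prior close.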

💡 Real-World Example: The Moving Average Disaster

A trader backtests a 50-day SMA crossover system and gets spectacular results:

  • Backtest CAGR: 28.4% (wow!)
  • Sharpe ratio: 1.8 (amazing!)
  • Max drawdown: Only -12% (too good to be true!)

The bug: He calculated the 50-day SMA using today's close, then bought at today's close. In reality, you calculate SMA at yesterday's close, decide to trade overnight, then execute tomorrow at open.

After fixing look-ahead bias:

  • CAGR drops to 9.2% (still decent)
  • Sharpe falls to 0.61 (mediocre)
  • Max DD increases to -31% (ouch)

Lesson: One line of code (.shift(1)) turns a "holy grail" into an average strategy.

Sin #2: Survivorship Bias (Testing Only Survivors)

What it is: Testing only stocks/assets that still exist today, ignoring those that went bankrupt or were delisted.

The impact:

| Dataset | CAGR | Max DD | Win Rate |
|---|---|---|---|
| Survivors only (current S&P 500 members) | 18.2% | -22.1% | 64% |
| All stocks (including delistings/bankruptcies) | 11.4% | -38.7% | 52% |
| Survivorship bias impact | +6.8% | +16.6% | +12% |

Why it matters:

  • Roughly 40% of the stocks trading in 2000 are no longer listed (bankrupt, acquired, or delisted)
  • Your backtest on "current S&P 500" excludes Lehman Brothers, Bear Stearns, Enron, WorldCom—all massive losses
  • You're testing on a portfolio of winners, not reality

How to fix:

  • Use point-in-time data: Databases that include delistings (Norgate, Sharadar, Quandl)
  • For ETFs/indices: Use actual ETF price data (the fund's own price history already reflects index changes, so survivorship is handled for you)
  • If stuck with Yahoo Finance: Acknowledge results are optimistic by 3-7% annually
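A toy illustration with entirely made-up numbers (five hypothetical tickers, two failures), purely to show the mechanics of the bias:

```python
import pandas as pd

# Hypothetical decade total returns; two names were delisted at a loss
returns = pd.DataFrame({
    'ticker':    ['AAA', 'BBB', 'CCC', 'DDD', 'EEE'],
    'total_ret': [1.50, 0.80, 0.40, -0.95, -1.00],   # -1.00 = bankrupt, total loss
    'survived':  [True, True, True, False, False],
})

survivors_only = returns.loc[returns['survived'], 'total_ret'].mean()
all_stocks     = returns['total_ret'].mean()

print(f"Survivors only: {survivors_only:+.0%}")   # +90%
print(f"All stocks:     {all_stocks:+.0%}")       # +15%
```

Filtering to survivors drops exactly the observations that hurt most, which is why survivor-only backtests always flatter the strategy.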

Sin #3: Data Snooping / Data Mining (Testing 100 Parameters)

What it is: Testing hundreds of parameter combinations until you find one that works, then claiming you "discovered" a great strategy.

Example: You test a moving average crossover system with:

  • Fast MA: 5, 10, 15, 20, 25, 30, ..., 100 (20 options)
  • Slow MA: 50, 75, 100, 125, ..., 300 (12 options)
  • Total combinations: 20 × 12 = 240 tests

After testing all 240 combinations, you find that MA(17, 143) produced a 24% CAGR with only -8% max drawdown. Eureka!

The reality: You just found random noise. With 240 tests, you'd expect 12 combinations to work purely by chance (5% false positive rate). MA(17, 143) has no logical reason to work—it's curve-fit to historical quirks that won't repeat.

🎰 The Multiple Testing Problem

Statistical significance breaks down when you test multiple hypotheses:

  • 1 test: 5% chance of false positive (1 in 20)
  • 20 tests: 64% chance of at least one false positive
  • 100 tests: 99.4% chance of at least one false positive
  • 240 tests: You're guaranteed to find something that looks great but is pure noise

Bonferroni correction: If testing N hypotheses, use significance level of 0.05/N. For 240 tests, you'd need p < 0.0002 to be confident—almost impossible.
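The false-positive arithmetic above is one line of code; a minimal sketch:

```python
def family_wise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """P(at least one false positive) across n independent tests at level alpha."""
    return 1 - (1 - alpha) ** n_tests

for n in (1, 20, 100, 240):
    print(f"{n:>4} tests -> {family_wise_error_rate(n):.1%} chance of a false positive")

# Bonferroni: per-test threshold needed to keep the family-wise rate at 5%
print(f"Bonferroni threshold for 240 tests: p < {0.05 / 240:.5f}")
```

This assumes independent tests; correlated parameter sets (e.g. MA(50,200) vs MA(51,200)) make the effective number of tests smaller, but the qualitative conclusion stands.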

How to avoid data snooping:

  1. Use theory-based parameters: the 50/200 SMA pair is conventional because 50 trading days is roughly a quarter and 200 is most of a trading year. Round, logical numbers, not optimized.
  2. Limit optimization: Test 3-5 parameter sets max, not 240
  3. Out-of-sample testing: Optimize on 1990-2010, test on 2011-2024 (see Sin #4)
  4. Robustness testing: Does MA(45, 190) work almost as well as MA(50, 200)? If yes, robust. If only MA(50, 200) works, curve-fit.
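Point 4 (robustness testing) can be sketched as a small grid scan around the chosen parameters. The prices below are a synthetic random walk, so the printed numbers are illustrative only; on real data you would look for a neighborhood of parameters that all perform similarly:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
# Synthetic geometric random walk, ~10 years of daily prices
prices = pd.Series(100 * np.exp(np.cumsum(rng.normal(0.0003, 0.01, 2500))))

def crossover_cagr(prices: pd.Series, fast: int, slow: int) -> float:
    """CAGR of a long/flat fast/slow SMA crossover (signal lagged one bar)."""
    sig = (prices.rolling(fast).mean() > prices.rolling(slow).mean()).astype(int).shift(1)
    rets = prices.pct_change() * sig.fillna(0)
    growth = (1 + rets.fillna(0)).prod()
    years = len(prices) / 252
    return growth ** (1 / years) - 1

# Scan the neighborhood of (50, 200): a robust edge degrades gracefully
for fast in (40, 45, 50, 55, 60):
    row = [f"{crossover_cagr(prices, fast, slow):+6.1%}" for slow in (180, 190, 200, 210, 220)]
    print(f"fast={fast:3d}: " + "  ".join(row))
```

If only one cell of the grid stands out and its neighbors collapse, that cell is almost certainly curve-fit noise.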

Sin #4: No Out-of-Sample Testing

What it is: Using all your data to build the strategy, then testing on the same data you used to build it.

The proper approach:

| Dataset | Period | Purpose | Usage |
|---|---|---|---|
| Training (60%) | 1990-2008 | Build strategy, optimize parameters | You can look at this data, tweak rules, adjust stops—go wild |
| Validation (20%) | 2009-2016 | Test different parameter sets, choose best | Used to select final parameters from candidates |
| Test (20%) | 2017-2024 | Final reality check—LOOK ONCE ONLY | Do not touch until strategy is finalized. Run once, accept results. |

The discipline required:

  • Never look at test data until your strategy is 100% finalized
  • If test results are bad, you CANNOT go back and tweak the strategy (that's data snooping on test set)
  • Accept the results: If test shows 8% CAGR vs training's 15%, that's reality speaking
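The 60/20/20 chronological split is a few lines of pandas. A minimal sketch (the date boundaries below follow the split described above; adjust to your own data):

```python
import pandas as pd

def split_by_date(df: pd.DataFrame):
    """Chronological 60/20/20 split. .loc date slicing is inclusive on both ends."""
    train = df.loc['1990-01-01':'2008-12-31']   # build and optimize freely
    val   = df.loc['2009-01-01':'2016-12-31']   # choose among candidate parameter sets
    test  = df.loc['2017-01-01':'2024-12-31']   # touch exactly once, at the very end
    return train, val, test

# Dummy daily frame just to demonstrate the split
idx = pd.date_range('1990-01-01', '2024-12-31', freq='B')
df = pd.DataFrame({'Close': range(len(idx))}, index=idx)

train, val, test = split_by_date(df)
print(len(train), len(val), len(test))
```

A chronological split is essential here: a random shuffle would leak future regimes into the training set and defeat the whole exercise.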

Sin #5: Ignoring Transaction Costs

Transaction costs are not just commissions. The full cost includes:

| Cost Type | Typical Range | Notes |
|---|---|---|
| Commissions | $0-$1 per trade | Free on Robinhood, Fidelity, Schwab. Ignore this. |
| Bid-Ask Spread | 0.01-0.50% | SPY: 0.01%, small-cap: 0.20-0.50%. Paid twice per round-trip. |
| Slippage | 0.05-0.30% | Market orders during volatility. Worse for illiquid stocks. |
| Market Impact | 0.10-1.00% | Large orders (>$100k) move the market. Worse for small caps. |
| Total (Round-Trip) | 0.10-2.00% | Conservative estimate: 0.15% for liquid ETFs, 0.50% for individual stocks |

Annual impact examples:

| Strategy | Trades/Year | Cost per Trade | Annual Cost |
|---|---|---|---|
| Dual Momentum (monthly rebalance) | 12 | 0.10% | 1.2% |
| Trend Following (Donchian) | 40 | 0.15% | 6.0% |
| Mean Reversion (daily signals) | 120 | 0.20% | 24.0% (strategy is dead!) |
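The annual-cost column is simply trades per year times round-trip cost; as a quick sketch:

```python
def annual_cost_drag(trades_per_year: int, cost_per_trade: float) -> float:
    """Approximate annual performance drag from round-trip trading costs."""
    return trades_per_year * cost_per_trade

for name, n_trades, cost in [("Dual Momentum (monthly)", 12, 0.0010),
                             ("Trend Following (Donchian)", 40, 0.0015),
                             ("Mean Reversion (daily)", 120, 0.0020)]:
    print(f"{name:<28} {annual_cost_drag(n_trades, cost):.1%} / year")
```

Run your own strategy's trade count through this before believing any gross-of-costs equity curve.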

💡 Real Example: The 6% Reality Check

A trader backtests a trend-following strategy:

  • Backtest (no costs): 18.2% CAGR, 0.95 Sharpe, looks amazing
  • After adding 0.15% per trade (40 trades/year): 12.2% CAGR, 0.68 Sharpe (still decent)
  • Performance drop: 6.0% annually from costs alone

Lesson: High-frequency strategies look great on paper but die from costs. Monthly rebalancing beats weekly in real-world trading.

Sin #6: Ignoring Market Impact (Scalability)

What it is: Your $10k backtest shows great results, but scaling to $1M breaks the strategy because your orders move the market.

Example: Small-cap momentum strategy

  • Stock: XYZ, market cap $200M, average daily volume $2M
  • Your order: $100k (5% of daily volume)
  • Impact: Bid-ask spread widens from 0.20% to 0.80%, you get filled 0.60% worse than backtest
  • Annual cost (40 trades): 40 × 0.60% = 24% drag

Rule of thumb for market impact:

  • Safe: Your order < 0.5% of average daily volume (minimal impact)
  • Moderate: 0.5-2% of volume (add 0.10-0.30% slippage)
  • Large: 2-10% of volume (add 0.50-2.00% slippage)
  • Impossible: >10% of volume (you'll move the market significantly, strategy won't scale)
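The rule of thumb above can be encoded as a small helper (the slippage add-ons are rough midpoints of the stated ranges, an assumption for illustration):

```python
def impact_bucket(order_size: float, avg_daily_volume: float):
    """Classify expected market impact by order size as a fraction of ADV.
    Thresholds follow the rule of thumb above; slippage figures are midpoints."""
    ratio = order_size / avg_daily_volume
    if ratio < 0.005:
        return "safe", 0.0
    if ratio < 0.02:
        return "moderate", 0.002    # add ~0.10-0.30% slippage
    if ratio < 0.10:
        return "large", 0.0125      # add ~0.50-2.00% slippage
    return "impossible", None       # strategy won't scale at this size

print(impact_bucket(100_000, 2_000_000))   # 5% of ADV -> ("large", 0.0125)
```

For the small-cap example above ($100k order into $2M daily volume), the helper lands squarely in the "large" bucket, which is why the backtest fills were 0.60% optimistic.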

Sin #7: Curve-Fitting (Overfitting to Noise)

What it is: Creating a strategy with so many rules and conditions that it perfectly fits historical data but has zero predictive power.

Example of overfitting:

# Overfit strategy stacking a dozen conditions (illustrative pseudocode)
def overfit_strategy(data):
    # Buy if ALL conditions are true:
    if (RSI(14) > 32 and RSI(14) < 47 and
        MACD > signal_line and
        SMA(50) > SMA(200) and
        Volume > SMA_volume(20) * 1.43 and
        Close > Open and
        High - Low < ATR * 1.67 and
        Day_of_week in [1, 3, 5] and  # Monday, Wednesday, Friday
        Month not in [6, 7, 8] and    # Avoid summer
        VIX < 22.4 and VIX > 11.8 and
        Dollar_index < 104.2):
        return 'BUY'

Why this is garbage:

  • A dozen arbitrary conditions (why RSI > 32, not 30 or 35?)
  • Hyper-specific thresholds (VIX < 22.4? Really?)
  • Too many degrees of freedom (fits noise, not signal)
  • In-sample performance: 42% CAGR (wow!)
  • Out-of-sample performance: -2.8% (disaster)

Signs of overfitting:

  1. Too many parameters: >5 inputs = likely overfit
  2. Hyper-specific thresholds: "RSI must be between 33.7 and 41.2" (nonsense)
  3. Perfect equity curve: No losing years, <10% max DD over 20 years (impossible without hindsight)
  4. Huge in-sample vs out-of-sample gap: 30% CAGR in training, 5% in validation (curve-fit)
  5. No logical explanation: Can't explain WHY the rule works (just that it does in backtest)

The Proper Backtesting Framework

Step 1: Hypothesis-Driven Development

Start with a theory, not data mining:

Good hypothesis: "Momentum persists because investors under-react to news and trends continue. I'll buy 12-month winners and hold until momentum reverses."

Bad hypothesis: "I noticed that when RSI crosses 37.4 on a Tuesday in months ending in 'r', stocks go up. Let me test this."

Theory-based strategies:

  • Momentum: Behavioral (herding, under-reaction)
  • Mean reversion: Overreaction to news, regression to mean
  • Value: Market misprices fundamentals, corrects over time
  • Carry: Compensation for providing liquidity (interest rate differentials)

Step 2: Walk-Forward Analysis

What it is: Simulating real-world strategy development by continuously training on past data, testing on future data, then rolling forward.

Example (20-year backtest, 2000-2020):

| Iteration | Training Period | Testing Period |
|---|---|---|
| 1 | 2000-2004 (5 years) | 2005 (1 year) |
| 2 | 2001-2005 (5 years) | 2006 (1 year) |
| 3 | 2002-2006 (5 years) | 2007 (1 year) |
| ... | ... | ... |
| 15 | 2014-2018 (5 years) | 2019 (1 year) |
| 16 | 2015-2019 (5 years) | 2020 (1 year) |

Benefits:

  • Simulates real-world conditions (you optimize on past data, trade on future data)
  • Tests robustness across different market regimes
  • Reveals if parameters need constant re-optimization (bad sign)

Step 3: Monte Carlo Simulation

What it is: Resampling your historical trades to test whether your results are robust or just lucky.

How it works:

  1. Take your historical trades (say, 200 trades over 10 years)
  2. Resample them with replacement (bootstrap) 1,000 times; a plain shuffle is not enough, because reordering multiplications leaves the compounded final return unchanged and only varies path metrics like drawdown
  3. Calculate performance for each resampled sequence
  4. Plot the distribution of outcomes

Example results:

  • Your backtest CAGR: 15.2%
  • Monte Carlo mean: 12.8% (your result is above average)
  • 95% confidence interval: 6.4% to 18.9%
  • % of simulations profitable: 87% (good robustness)

Red flags from Monte Carlo:

  • Your backtest is in the top 5% of simulations (its result leans on a lucky handful of trades)
  • Wide confidence intervals (15.2% ± 12%: too much luck involved)
  • Fewer than 70% of simulations profitable (not robust)

Python Backtesting Framework

Here's a complete framework that avoids all seven deadly sins:

"""
Proper Backtesting Framework
Author: Plan My Retire Finance University
Date: 2026-02-22

Features:
- No look-ahead bias (all indicators shifted)
- Transaction costs included
- Walk-forward analysis
- Monte Carlo simulation
- Out-of-sample testing
"""

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import yfinance as yf
from datetime import datetime
import warnings
warnings.filterwarnings('ignore')

class BacktestFramework:
    """
    Rigorous backtesting framework avoiding common pitfalls

    Parameters:
    -----------
    transaction_cost : float
        Cost per trade as percentage (0.001 = 0.1% per trade)
    slippage : float
        Slippage as percentage (0.0005 = 0.05%)
    """

    def __init__(self, transaction_cost=0.001, slippage=0.0005):
        self.transaction_cost = transaction_cost
        self.slippage = slippage
        self.total_cost = transaction_cost + slippage

    def backtest(self, data, signals, initial_capital=100000):
        """
        Run backtest with proper cost modeling

        Parameters:
        -----------
        data : DataFrame
            Price data with OHLC columns
        signals : Series
            Trading signals (1 = long, 0 = flat; shorts are not modeled here)
            MUST be shifted to avoid look-ahead bias
        initial_capital : float
            Starting capital

        Returns:
        --------
        results : dict
            Performance metrics
        """
        df = data.copy()
        df['Signal'] = signals
        df['Position'] = df['Signal'].diff()

        # Track portfolio
        cash = initial_capital
        shares = 0
        portfolio_values = []
        trades = []

        for i in range(len(df)):
            row = df.iloc[i]
            date = df.index[i]
            price = row['Close']

            # Calculate portfolio value
            portfolio_value = cash + (shares * price)
            portfolio_values.append({
                'Date': date,
                'Portfolio_Value': portfolio_value,
                'Cash': cash,
                'Shares': shares
            })

            # Execute trades
            if row['Position'] == 1:  # Buy signal
                if cash > 0:
                    # Apply slippage and costs
                    execution_price = price * (1 + self.slippage)
                    shares = int((cash * 0.99) / execution_price)
                    cost = shares * execution_price
                    commission = cost * self.transaction_cost
                    cash -= (cost + commission)

                    trades.append({
                        'Date': date,
                        'Type': 'BUY',
                        'Price': execution_price,
                        'Shares': shares,
                        'Cost': cost + commission
                    })

            elif row['Position'] == -1:  # Sell signal
                if shares > 0:
                    # Apply slippage and costs
                    execution_price = price * (1 - self.slippage)
                    proceeds = shares * execution_price
                    commission = proceeds * self.transaction_cost
                    cash += (proceeds - commission)

                    trades.append({
                        'Date': date,
                        'Type': 'SELL',
                        'Price': execution_price,
                        'Shares': shares,
                        'Proceeds': proceeds - commission
                    })

                    shares = 0

        # Convert to DataFrames
        equity_df = pd.DataFrame(portfolio_values).set_index('Date')
        trades_df = pd.DataFrame(trades)

        # Calculate metrics
        metrics = self.calculate_metrics(equity_df, initial_capital)

        return {
            'metrics': metrics,
            'equity_curve': equity_df,
            'trades': trades_df
        }

    def calculate_metrics(self, equity_df, initial_capital):
        """Calculate comprehensive performance metrics"""
        returns = equity_df['Portfolio_Value'].pct_change().dropna()

        final_value = equity_df['Portfolio_Value'].iloc[-1]
        total_return = (final_value - initial_capital) / initial_capital

        # CAGR
        days = (equity_df.index[-1] - equity_df.index[0]).days
        years = days / 365.25
        cagr = (final_value / initial_capital) ** (1 / years) - 1

        # Risk metrics (0.04 = assumed 4% risk-free rate for Sharpe/Sortino)
        volatility = returns.std() * np.sqrt(252)
        sharpe = (cagr - 0.04) / volatility if volatility > 0 else 0

        # Downside deviation (for Sortino)
        downside = returns[returns < 0].std() * np.sqrt(252)
        sortino = (cagr - 0.04) / downside if downside > 0 else 0

        # Drawdown
        cumulative = (1 + returns).cumprod()
        running_max = cumulative.expanding().max()
        drawdown = (cumulative - running_max) / running_max
        max_drawdown = drawdown.min()

        # Calmar ratio
        calmar = cagr / abs(max_drawdown) if max_drawdown != 0 else 0

        # Win rate
        win_rate = (returns > 0).sum() / len(returns) if len(returns) > 0 else 0

        return {
            'final_value': final_value,
            'total_return': total_return,
            'cagr': cagr,
            'volatility': volatility,
            'sharpe_ratio': sharpe,
            'sortino_ratio': sortino,
            'max_drawdown': max_drawdown,
            'calmar_ratio': calmar,
            'win_rate': win_rate
        }

    def walk_forward_analysis(self, data, strategy_func, train_period=252*5,
                             test_period=252, step=252):
        """
        Walk-forward analysis

        Parameters:
        -----------
        data : DataFrame
            Price data
        strategy_func : function
            Function that takes data and returns signals
        train_period : int
            Training window (days)
        test_period : int
            Testing window (days)
        step : int
            Step size (days) for rolling window

        Returns:
        --------
        results : dict
            Walk-forward results
        """
        results = []
        start = train_period

        while start + test_period < len(data):
            # Split data
            train_data = data.iloc[start - train_period:start]
            test_data = data.iloc[start:start + test_period]

            # In a parameterized strategy you would fit/optimize on train_data
            # here; this simple example uses fixed rules, so the training window
            # only positions the rolling split.

            # Generate signals on the out-of-sample window
            test_signals = strategy_func(test_data)

            # Backtest
            backtest_result = self.backtest(test_data, test_signals)

            results.append({
                'train_start': train_data.index[0],
                'train_end': train_data.index[-1],
                'test_start': test_data.index[0],
                'test_end': test_data.index[-1],
                'cagr': backtest_result['metrics']['cagr'],
                'sharpe': backtest_result['metrics']['sharpe_ratio'],
                'max_dd': backtest_result['metrics']['max_drawdown']
            })

            start += step

        return pd.DataFrame(results)

    def monte_carlo_simulation(self, trades_df, n_simulations=1000, initial_capital=100000):
        """
        Monte Carlo simulation by bootstrap-resampling the trade sequence

        Parameters:
        -----------
        trades_df : DataFrame
            Historical trades with PnL
        n_simulations : int
            Number of simulations
        initial_capital : float
            Starting capital

        Returns:
        --------
        results : dict
            Distribution of outcomes
        """
        # Calculate PnL per trade
        buy_trades = trades_df[trades_df['Type'] == 'BUY'].reset_index(drop=True)
        sell_trades = trades_df[trades_df['Type'] == 'SELL'].reset_index(drop=True)

        if len(buy_trades) != len(sell_trades):
            print("Warning: Unmatched trades, adjusting...")
            min_len = min(len(buy_trades), len(sell_trades))
            buy_trades = buy_trades.iloc[:min_len]
            sell_trades = sell_trades.iloc[:min_len]

        trade_returns = (sell_trades['Price'].values - buy_trades['Price'].values) / buy_trades['Price'].values

        # Run simulations
        final_values = []

        for _ in range(n_simulations):
            # Bootstrap: resample trades WITH replacement. A pure shuffle would
            # leave the compounded final value unchanged (multiplication is
            # commutative), so shuffling alone only tests path-dependent metrics.
            sampled_returns = np.random.choice(trade_returns, size=len(trade_returns), replace=True)

            # Compound the resampled returns
            portfolio_value = initial_capital
            for ret in sampled_returns:
                portfolio_value *= (1 + ret)

            final_values.append(portfolio_value)

        final_values = np.array(final_values)
        cagr_values = (final_values / initial_capital) ** (1 / 10) - 1  # Assuming 10 years

        return {
            'mean_final_value': final_values.mean(),
            'median_final_value': np.median(final_values),
            'std_final_value': final_values.std(),
            'percentile_5': np.percentile(final_values, 5),
            'percentile_95': np.percentile(final_values, 95),
            'mean_cagr': cagr_values.mean(),
            'prob_profitable': (final_values > initial_capital).sum() / n_simulations,
            'distribution': final_values
        }


# Example: Dual Momentum Strategy (properly implemented)
def dual_momentum_strategy(data):
    """
    Dual momentum with NO look-ahead bias

    Returns signals shifted to avoid peeking into future
    """
    df = data.copy()

    # Calculate 12-month return (252 trading days)
    df['Return_12M'] = df['Close'].pct_change(252)

    # Signal: 1 if 12-month return positive, 0 otherwise
    # CRITICAL: Shift by 1 to avoid look-ahead bias
    df['Signal'] = (df['Return_12M'] > 0).astype(int).shift(1)

    return df['Signal'].fillna(0)


# Example usage
if __name__ == "__main__":
    # Download data
    print("Downloading SPY data...")
    data = yf.download('SPY', start='2000-01-01', end='2024-12-31', progress=False)

    # Generate signals (properly shifted)
    signals = dual_momentum_strategy(data)

    # Initialize framework
    framework = BacktestFramework(transaction_cost=0.001, slippage=0.0005)

    # Run backtest
    print("\nRunning backtest...")
    results = framework.backtest(data, signals, initial_capital=100000)

    # Print results
    print("\n" + "="*60)
    print("BACKTEST RESULTS (Dual Momentum)")
    print("="*60)
    metrics = results['metrics']
    print(f"CAGR:              {metrics['cagr']:.2%}")
    print(f"Volatility:        {metrics['volatility']:.2%}")
    print(f"Sharpe Ratio:      {metrics['sharpe_ratio']:.2f}")
    print(f"Sortino Ratio:     {metrics['sortino_ratio']:.2f}")
    print(f"Max Drawdown:      {metrics['max_drawdown']:.2%}")
    print(f"Calmar Ratio:      {metrics['calmar_ratio']:.2f}")
    print(f"Win Rate:          {metrics['win_rate']:.2%}")
    print(f"Number of Trades:  {len(results['trades'])}")
    print("="*60)

    # Walk-forward analysis
    print("\nRunning walk-forward analysis...")
    wf_results = framework.walk_forward_analysis(
        data, dual_momentum_strategy,
        train_period=252*5,  # 5 years training
        test_period=252,     # 1 year testing
        step=252             # Roll forward 1 year
    )
    print("\nWalk-Forward Results:")
    print(f"Average CAGR:      {wf_results['cagr'].mean():.2%}")
    print(f"Std Dev CAGR:      {wf_results['cagr'].std():.2%}")
    print(f"Best Year:         {wf_results['cagr'].max():.2%}")
    print(f"Worst Year:        {wf_results['cagr'].min():.2%}")

    # Monte Carlo simulation
    if len(results['trades']) > 10:
        print("\nRunning Monte Carlo simulation...")
        mc_results = framework.monte_carlo_simulation(
            results['trades'],
            n_simulations=1000,
            initial_capital=100000
        )
        print("\nMonte Carlo Results (1,000 simulations):")
        print(f"Mean Final Value:  ${mc_results['mean_final_value']:,.0f}")
        print(f"Median Final Value: ${mc_results['median_final_value']:,.0f}")
        print(f"5th Percentile:    ${mc_results['percentile_5']:,.0f}")
        print(f"95th Percentile:   ${mc_results['percentile_95']:,.0f}")
        print(f"Prob. Profitable:  {mc_results['prob_profitable']:.1%}")

Key Takeaways

✅ The Proper Backtesting Checklist

Before you trust any backtest (yours or others'), verify:

  1. ✅ No look-ahead bias: All indicators shifted by at least 1 period
  2. ✅ Survivorship-free data: Includes delisted/bankrupt stocks
  3. ✅ Limited parameter optimization: <5 parameters, theory-driven
  4. ✅ Out-of-sample test: 20% of data held back, tested once only
  5. ✅ Transaction costs: 0.10-0.20% per round-trip minimum
  6. ✅ Market impact considered: Strategy won't work at $10M scale? Document it.
  7. ✅ Simple rules: Can explain strategy in 2-3 sentences
  8. ✅ Walk-forward analysis: Performance consistent across rolling periods
  9. ✅ Monte Carlo tested: >70% of simulations profitable

If any of these fail: The backtest is unreliable. Fix it before risking capital.

Final Thoughts: The 50% Rule

Assume your real-world returns will be 50% of your backtest CAGR.

Why?

  • Hidden costs: You underestimated slippage, spreads, market impact
  • Execution errors: Missed signals, emotional overrides, platform issues
  • Regime changes: Markets evolve, strategies decay
  • Unknown biases: Your backtest has bugs you haven't found yet

Example:

  • Backtest shows 16% CAGR
  • Expect 8% CAGR in reality
  • If you achieve 10-12%, you outperformed expectations
  • If you achieve 4-6%, you underperformed but within normal degradation

💡 The Honest Approach

Professional quant funds assume:

  • 50-70% of backtest CAGR in live trading
  • Max drawdown will be 1.5× backtest (if backtest shows -20%, expect -30%)
  • Strategy will degrade 1-2% per year as others discover it

If you're more optimistic than this, you're lying to yourself.

Next Steps

You now have the tools to backtest properly. Before trading real money:

  1. Backtest your strategy using this framework
  2. Run walk-forward analysis (does it work across different regimes?)
  3. Monte Carlo test (are you just lucky, or is it robust?)
  4. Paper trade 6-12 months (track every signal in real-time)
  5. Start small (trade 25% of intended size for first year)
  6. Accept degradation (if backtest showed 14% and you get 8%, that's success)

Recommended reading:

  • Evidence-Based Technical Analysis by David Aronson (the bible of proper testing)
  • Advances in Financial Machine Learning by Marcos López de Prado
  • Quantitative Trading by Ernest Chan

⚠️ Risk Disclosure

Trading involves substantial risk of loss. Most traders lose money. Past performance does not guarantee future results. Backtests are hypothetical and do not represent actual trading. You should never trade with money you can't afford to lose, always use proper position sizing and risk management, and thoroughly test any strategy before risking capital. Consult with a licensed financial advisor before making investment decisions. The authors are not responsible for trading losses.