Backtesting With Synthetic and Resampled Market Histories

We’re all backtesters to some degree, but not all backtested strategies are created equal. One of the more common (and dangerous) mistakes is to 1) backtest a strategy on the historical record; 2) document an encouraging performance record; and 3) assume that you’re done. Rigorous testing, however, requires more. Why? Because relying on one sample, even if it’s a real-world record, doesn’t usually pass the smell test. The problem: your upbeat test results could be a random outcome.

The future’s uncertain no matter how rigorous your research, but a Monte Carlo simulation is well suited for developing a higher level of confidence that a given strategy’s record isn’t a spurious byproduct of chance. This is a critical issue for short-term traders, of course, but it’s also relevant for portfolios with medium- and even long-term horizons. The increased focus on risk management in the wake of the 2008 financial crisis has convinced a broader segment of investors and financial advisors to embrace a variety of tactical overlays. In turn, it’s important to look beyond a single path in history.

Research such as Meb Faber’s influential paper “A Quantitative Approach to Tactical Asset Allocation” and scores of like-minded studies have convinced former buy-and-holders to add relatively nimble risk-management overlays to the toolkit of portfolio management. The results may or may not be satisfactory, depending on any number of details. But to the extent that you’re looking to history for guidance, as you should, it’s essential to look beyond a single run of data in the art/science of deciding if a strategy is the genuine article.

The problem, of course, is that the real-world history of markets and investment funds is limited, particularly with ETFs, most of which arrived within the past 10 to 15 years. We can’t change this obstacle, but we can soften its capacity for misleading us by running alternative scenarios via Monte Carlo simulations. The results may or may not change your view of a particular strategy. But if the stakes are high, which is usually the case with portfolio management, why wouldn’t you go the extra mile? Ignoring this facet of analysis leaves you vulnerable to being fooled by randomness. At the very least, it’s valuable to have additional support for thinking that a given technique is the real deal. And sometimes Monte Carlo simulations can avert a crisis by steering you away from a strategy that appears productive but in fact is anything but.

As one simple example, imagine that you’re reviewing the merits of a 50-day/100-day moving average crossover strategy with a one-year rolling-return filter. This is a fairly basic setup for monitoring risk and/or exploiting the momentum effect, and it has shown encouraging results in some instances, such as when applied to the ten major US equity sectors. Let’s say that you’ve analyzed the strategy’s history via the SPDR sector ETFs and you like what you see. But here’s the problem: the ETFs have a relatively short history overall… not much more than 10 years’ worth of data. You could look to the underlying indexes for a longer run of history, but here too you’ll run up against a standard hitch: the results reflect a single run of history.

Monte Carlo simulations offer a partial solution. Two applications I like to use: 1) resampling the existing history by way of reordering the sequence of returns; and 2) creating synthetic data sets with specific return and risk characteristics that approximate the real-world funds that will be used in the strategy. In both cases, I take the alternative risk/return histories and run the numbers through the Monte Carlo grinder, as in the sketch below.
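Here’s a minimal sketch of both approaches in R. The `returns` vector is a hypothetical stand-in for a real fund’s daily return history, and the parameter values are illustrative assumptions, not figures from any actual fund:

```r
set.seed(1)

# Hypothetical stand-in for a fund's actual daily return history
returns <- rnorm(2520, mean = 0.0003, sd = 0.01)  # ~10 years of daily data

# 1) Resample the existing history by reordering the sequence of returns.
#    sample() without replacement permutes the observed returns; set
#    replace = TRUE for a bootstrap that draws with replacement instead.
resampled <- sample(returns)

# 2) Create a synthetic data set with return and risk characteristics
#    that approximate the real-world fund.
synthetic <- rnorm(length(returns), mean = mean(returns), sd = sd(returns))

# Either alternative history can then be indexed and fed to the backtest
wealth <- function(r) 100 * cumprod(1 + r)
```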

Using R to generate the analysis offers the opportunity to re-run tens of thousands of alternative histories. This is a powerful methodology for stress-testing a strategy. Granted, there are no guarantees, but deploying a Monte Carlo-based analysis in this way offers a deeper look at a strategy’s possible outcomes. It’s the equivalent of exploring how the strategy might have performed over hundreds of years during a spectrum of market conditions.

As a quick example, let’s consider how a 10-asset portfolio stacks up in 100 runs based on normally distributed returns over a simulated 20-year period of daily results. If this were a true test, I’d generate tens of thousands of runs, but for now let’s keep it simple so that we have some eye candy to illustrate the concept. The chart below reflects 100 random results for a strategy over 5,040 days (20 years) based on the following rules: go long when the 50-day exponential moving average (EMA) is above the 100-day EMA and the trailing one-year return is positive. If only one of those conditions holds, the position is neutral, in which case the previous buy or sell signal applies. If both conditions are negative (i.e., the 50-day EMA is below the 100-day EMA and the one-year return is negative), the position is sold and the assets are placed in cash, with zero return until a new buy signal is triggered.
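For reference, here’s a minimal sketch of those signal rules in R for a single simulated asset. The TTR and zoo packages (for `EMA()` and `na.locf()`) and the return parameters are my assumptions, not specifics from the original analysis:

```r
library(TTR)   # EMA()
library(zoo)   # na.locf() to carry the last signal forward

set.seed(1)
ret   <- rnorm(5040, mean = 0.0003, sd = 0.01)  # one 20-year daily history
price <- 100 * cumprod(1 + ret)

ema50  <- EMA(price, n = 50)
ema100 <- EMA(price, n = 100)
yr.ret <- c(rep(NA, 252), price[-(1:252)] / head(price, -252) - 1)

# 1 = buy, 0 = sell to cash, NA = neutral (prior signal carries over)
raw <- ifelse(ema50 > ema100 & yr.ret > 0, 1,
              ifelse(ema50 < ema100 & yr.ret < 0, 0, NA))
sig <- na.locf(raw, na.rm = FALSE)
sig[is.na(sig)] <- 0                      # in cash until the first buy signal

strat.ret <- c(0, head(sig, -1)) * ret    # act on yesterday's signal; cash = 0
equity    <- 100 * cumprod(1 + strat.ret)
```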

[Chart: 100 simulated 20-year paths of the 10-asset strategy, indexed to starting values of 100]

Note that each line reflects these rules applied to a 10-asset strategy, so we’re looking at 100 different aggregated portfolio outcomes (all with starting values of 100). The initial results look encouraging, in part because the median return is moderately positive (+22%) over the sample period and the interquartile performance ranges from roughly +10% to +39%. The worst performance is a loss of a bit more than 7%.

The question, of course, is how this compares with a relevant benchmark. Also, we could (and probably should) run the simulations with various non-normal distributions to consider how fat-tail risk influences the results. In fact, the testing outlined above would be only the first step in a true analytical project.
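As one illustration of the fat-tail point, the normal draws in the sketches above could be swapped for scaled Student’s t draws. The degrees-of-freedom choice here is arbitrary, picked only to show the mechanics:

```r
# Fat-tailed alternative: Student's t with 4 degrees of freedom,
# rescaled so the draws keep the same target daily volatility.
# (Variance of a t with df degrees of freedom is df / (df - 2).)
df.t    <- 4
mu      <- 0.0003
sigma   <- 0.01
fat.ret <- mu + sigma * rt(5040, df = df.t) / sqrt(df.t / (df.t - 2))
```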

The larger point is that it’s practical and prudent to look beyond the historical record for testing strategies. The case for doing so is strong for both short-term trading tactics and longer-term investment strategies. Indeed, the ability to review the statistical equivalent of hundreds of years of market outcomes, as opposed to a decade or two, is a powerful tool. The one-sample run of history is an obvious starting point, but there’s no reason why it should have the last word.

8 thoughts on “Backtesting With Synthetic and Resampled Market Histories”

  1. Govind

    Monte Carlo simulations can be useful when you have a limited amount of data. However, they work best when the data is independent. When there is serial correlation, as there is with many financial returns, it is more difficult to get meaningful simulation results using randomized data.

    Even though ETF data only goes back 10-15 years, there are indices with hundreds of years of available data. A better way to validate a trading approach is to apply it to as many years of out-of-sample data as possible. There are several papers now that have tested momentum and trend-following methods on over 200 years of data, and even a book by Greyserman and Kaminski that looks at 800 years of out-of-sample data!

  2. rene

    Thanks for an insightful post.

    Can you provide any relevant literature so I can explore the concepts you outline further?

  3. James Picerno Post author

    Rene,
    The literature is quite vast but, alas, there’s nothing I can think of that serves as one solid primer. The good news is that you can find a lot of insight on the Internet as well as by searching through Amazon or the equivalent. The key terms are “backtesting” and “Monte Carlo”. Run those through Google and you’ll find quite a lot of reading material. The challenge is finding the right focus. Because MC is sort of a generic term for a process rather than a specific recipe, there are many, many variations and applications. This is a topic you’ll have to spend some time researching to find content that’s applicable to your situation. Best of luck!

  4. James Picerno Post author

    Govind,
    Agreed. There are some time series that go way back. The US stock market, for instance, has been documented back to the late 1700s. The bigger challenge is finding historical results when it comes to multi-asset-class strategies, which is my focus. That’s a tougher nut to crack. Keep in mind, too, that backtesting indexes is one thing, but it’s not always equivalent to backtesting funds, which are a closer approximation of real-world results. When/if there are holes in the historical record, simulation can help. I’d argue that it’s always a good idea to run simulated tests on strategies. It’s relatively painless and virtually cost-free (other than the time), so there’s really no reason not to kick the tires.

  5. Rene

    Hi,

    Thanks for your answer – it makes sense. Summer is a good time to catch up on MC 🙂

  6. Pingback: Quantocracy's Daily Wrap for 06/29/2015 | Quantocracy

  7. d

    hi james
    can you elaborate on this “As a quick example, let’s consider how a 10-asset portfolio stacks up in 100 runs based on normally distributed returns over a simulated 20-year period of daily results. “?

    did you take the daily return and daily sd of those 10 asset classes from the historical record and then simulate each of them as a normal distribution with those same parameters at a daily time step for the 20-year period?

    many thanks!

  8. James Picerno Post author

    d,
    I simulated daily returns for 20 years for 10 assets based on a normal distribution. Ideally this should be done with fat-tailed distributions, but this is just a toy example. I then replicated that 10-asset portfolio with 100 random outcomes, also using a normal distribution.
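    A toy sketch of that setup in R, with the trading rules omitted and an equal-weight aggregation assumed (both are simplifications for illustration):

    ```r
    set.seed(42)
    n.days <- 5040; n.assets <- 10; n.runs <- 100

    # Each run: 10 assets with normally distributed daily returns,
    # aggregated into one equal-weighted portfolio path (start = 100).
    # A full test would apply the EMA/return rules to each asset first.
    paths <- replicate(n.runs, {
      rets <- matrix(rnorm(n.days * n.assets, 0.0003, 0.01), ncol = n.assets)
      100 * cumprod(1 + rowMeans(rets))
    })

    matplot(paths, type = "l", lty = 1, col = "grey40",
            xlab = "Day", ylab = "Portfolio value (start = 100)")
    ```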
