Developing confidence about a portfolio strategy's track record (or throwing it onto the garbage heap), whether it's your own design or a third party's model, is a tricky but essential chore. There's no single solution, but a critical piece of the analysis for estimating return and risk, including the potential for drawdowns and fat tails, is generating synthetic performance histories with a process called bootstrapping. The idea is to simulate returns by resampling actual results, creating thousands of alternative histories that suggest how the future may unfold. The dirty little secret in this corner of Monte Carlo analysis is that there's more than one way to execute bootstrapping tests. To cut to the chase, block bootstrapping is a superior methodology for asset pricing because it factors in the reality that market returns exhibit autocorrelation. The bias for momentum, positive and negative, in the short run, in other words, can't be ignored, as it is in standard bootstrapping.
There's a tendency for gains and losses to persist—bear and bull markets are the obvious examples, although shorter, less extreme runs of persistence also mark the historical record. Conventional bootstrapping ignores this fact by effectively assuming that returns are independently distributed. They're not, which is old news. The empirical literature demonstrates rather convincingly a strong bias for autocorrelation in asset returns. Designing a robust bootstrapping test on historical performance demands that we integrate autocorrelation into the number crunching to minimize the potential for generating misleading results.
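Before choosing a bootstrap design, it's worth verifying the autocorrelation claim on the data at hand. Here's a minimal sketch using R's built-in `acf()`; it uses a simulated AR(1) series as a stand-in for actual market returns so the example is self-contained.

```r
# Sketch: measuring short-run persistence in a return series with acf().
# A real analysis would use actual market returns; we simulate an AR(1)
# series here as a self-contained stand-in.
set.seed(1)
ret <- as.numeric(arima.sim(model = list(ar = 0.3), n = 240))  # ~20 years of "monthly returns"

# acf()$acf[1] is lag 0 (always 1); acf()$acf[2] is the lag-1 autocorrelation
lag1 <- acf(ret, plot = FALSE)$acf[2]
round(lag1, 2)  # positive: gains/losses tend to persist month to month
```

A materially positive lag-1 value is the statistical fingerprint of the momentum bias described above, and the signal that an iid resampling scheme will misrepresent the data.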
The key point is recognizing that sampling historical returns for analysis should focus on multiple periods. Let's assume that we're looking at monthly performance data. A standard bootstrap would reshuffle the sequence of actual results and generate alternative return histories randomly, drawing monthly returns in isolation from one another. That would be fine if asset returns weren't highly correlated in the short run. But as we know, positive and negative returns tend to persist for a stretch, sometimes in the extreme. The solution is sampling actual histories in blocks of time (in this case several months) to preserve the autocorrelation bias.
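To make the block idea concrete, here's a minimal hand-rolled moving-block bootstrap sketch. The block length of 3 and the simulated returns are arbitrary illustrative choices, not recommendations; the `block_boot` helper is hypothetical, written for this example only.

```r
# Minimal moving-block bootstrap sketch: resample overlapping blocks of
# consecutive months rather than individual months, so that short-run
# autocorrelation within each block is preserved.
set.seed(1)
ret <- rnorm(12, mean = 0.005, sd = 0.04)  # stand-in for 12 monthly returns

block_boot <- function(x, block_len) {
  # draw enough random block start points to cover the series
  starts <- sample(seq_len(length(x) - block_len + 1),
                   size = ceiling(length(x) / block_len), replace = TRUE)
  out <- unlist(lapply(starts, function(s) x[s:(s + block_len - 1)]))
  out[seq_along(x)]  # trim to the original length
}

synthetic <- block_boot(ret, block_len = 3)  # one synthetic 12-month history
```

Each synthetic history is stitched together from three-month runs of actual data, so any persistence inside those runs carries over, unlike a month-by-month reshuffle.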
The question is how to choose the length for the blocks, along with some other parameters. Much depends on the historical record, the frequency of the data, and the mandate for the analysis. There’s a fair amount of nuance here. Fortunately, R offers several practical solutions, including the meboot package (“Maximum Entropy Bootstrap for Time Series”).
As an illustration, let's use a couple of graphics to compare a standard bootstrap to a block bootstrap, based on monthly returns for the US stock market (S&P 500). To make this illustration clear in the charts, we'll ignore the basic rules of bootstrapping and focus on a ridiculously short period: the 12 months through March 2016. If this were an actual test, I'd crunch the numbers as far back as history allows, which runs across decades. I'm also generating only ten synthetic return histories; in practice, it's prudent to create thousands of data sets. But let's dispense with common sense in exchange for an illustrative example.
The first graph below reflects a standard bootstrap—resampling the historical record with replacement. The actual monthly returns for the S&P (red line) are shown in context with the resampled returns (light blue lines). As you can see, the resampled performances represent a random mix of results via reshuffling the sequence of actual monthly returns. The problem is that the tendency for autocorrelation is severed in this methodology. In other words, the bootstrap sample is too random—the returns are independent from one another. In reality, that’s not an accurate description of market behavior. The bottom line: modeling history through this lens could, and probably will, lead us astray as to what could happen in the future.
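The standard bootstrap behind a chart like the first one boils down to a one-liner with `sample()` and replacement. A minimal sketch, using simulated returns as a stand-in for the actual S&P 500 series:

```r
# Standard (iid) bootstrap sketch: reshuffle monthly returns with
# replacement, ten times, one synthetic history per column.
set.seed(1)
ret <- rnorm(12, mean = 0.005, sd = 0.04)     # stand-in for 12 monthly returns
paths <- replicate(10, sample(ret, replace = TRUE))  # 12 x 10 matrix

# Each column is a synthetic history; the sequence of draws is fully
# random, which is exactly the flaw: any month-to-month persistence in
# the original series is destroyed.
dim(paths)
```

Plotting `ret` against the columns of `paths` reproduces the red-line-versus-light-blue-lines comparison described above.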
Let's now turn to block bootstrapping for a more realistic profile of market history. Note that the meboot package does most of the hard work here in choosing the length of the blocks. The details on the algorithm are outlined in the vignette. For now, let's just look at the results. As you can see in the second chart below, the resampled returns resemble the actual performance history. It's obvious that the synthetic performances aren't perfectly random. Depending on the market under scrutiny and the goal of the analytics, we can adjust the degree of randomness. The key point is that we have a set of synthetic returns that are similar to, but don't quite match, the actual data set.
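A minimal meboot sketch, assuming the meboot package is installed and again using simulated returns as a stand-in for the actual data:

```r
# Maximum-entropy bootstrap via the meboot package. meboot() takes a
# time series and returns, among other things, an $ensemble matrix with
# one resampled series per column.
library(meboot)
set.seed(1)
ret <- ts(rnorm(12, mean = 0.005, sd = 0.04), frequency = 12)  # stand-in returns

boot <- meboot(ret, reps = 10)    # ten synthetic histories
dim(boot$ensemble)                 # 12 rows x 10 columns
```

Unlike the iid reshuffle, each column of `boot$ensemble` hugs the shape of the original series, which is why the second chart's light-blue lines track the red line rather than scattering randomly around it.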
Note that no amount of financial engineering can completely wipe away uncertainty. The future can and probably will deliver surprises, for good and ill, no matter how clever our analytics. Nonetheless, bootstrapping historical data (or in-sample returns via backtests) can help separate the wheat from the chaff when looking into the rearview mirror as a preview of what lies ahead. But the details of how you run a bootstrap test are critical for developing comparatively high-confidence test results. In short, we can't ignore a simple fact: market returns have an annoying habit of exhibiting non-random behavior.
* * *
Previous articles in this series:
Portfolio Analysis in R: Part I | A 60/40 US Stock/Bond Portfolio
Portfolio Analysis in R: Part II | Analyzing A 60/40 Strategy
Portfolio Analysis in R: Part III | Adding A Global Strategy
Portfolio Analysis in R: Part IV | Enhancing A Global Strategy
Portfolio Analysis in R: Part V | Risk Analysis Via Factors
Portfolio Analysis in R: Part VI | Risk-Contribution Analysis