Outlier Risk, Part II

In the first article in this series we looked at a basic technique for identifying outliers – extreme data points – in a time series. The tool of choice: slicing the numbers into quartiles and using the interquartile range (IQR) to define “normal.” It’s a useful starting point, but it may not always suffice. Fortunately, there are alternative methodologies that can be run for additional perspective. For comparison, let’s focus on one and review how outlier risk stacks up using Z-scores.

Before we dive in, a quick recap. Detecting outliers is useful in economics and finance, along with many other disciplines, because it indicates when trends are extreme and therefore at higher risk of reversing. In the previous article we used rolling 1-year returns for the US stock market (S&P 500) and so we’ll continue with this example.

First, a brief definition of Z-scores, a.k.a. standard scores. The basic takeaway: converting data into Z-scores is a process of rescaling the numbers. Simply put, a Z-score tells you how far away any one data point is from the population mean. The divergences are presented as standard deviations.
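To make the rescaling concrete, here is a minimal sketch in Python. The function name `z_scores`, the sample numbers, and the choice of the population standard deviation are assumptions made for illustration; they are not taken from the original analysis.

```python
from statistics import fmean, pstdev

def z_scores(values):
    """Rescale values to distances from the mean, measured in standard deviations."""
    mu = fmean(values)
    sigma = pstdev(values, mu)  # assumption: population SD rather than sample SD
    if sigma == 0:
        raise ValueError("all values are identical, so Z-scores are undefined")
    return [(v - mu) / sigma for v in values]

# Hypothetical one-year returns in percent (illustrative only, not S&P 500 data)
returns = [12.0, -5.0, 8.0, 30.0, 3.0, -20.0, 15.0, 9.0]
scores = z_scores(returns)

for r, z in zip(returns, scores):
    print(f"return {r:6.1f}%  ->  Z-score {z:+.2f}")
```

By construction the rescaled series has a mean of zero and a standard deviation of one, so a value of +2 always means "two standard deviations above average," whatever the original units.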

Transforming data into deviations from the mean via standard deviations (SD) provides an objective yardstick to search for outliers. As with any statistical application, there are specific pros and cons. Leaving that aside for now, basic statistics tells us that for normally distributed data roughly 68% of observations fall within 1SD of the mean, ~95% within 2SD and ~99.7% within 3SD. In many applications 2SD is a common line for deciding if a data point is “normal” or not.
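Those percentages come from the normal distribution and can be checked in a few lines. The sketch below uses the standard identity that the share of a normal distribution within k standard deviations of the mean equals erf(k/√2); the function name `normal_coverage` is a hypothetical label. Real-world return data may not be normally distributed, so treat these figures as a reference point rather than a guarantee.

```python
import math

def normal_coverage(k):
    """Share of a normal distribution lying within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k}SD: {normal_coverage(k):.4f}")
```

This prints approximately 0.6827, 0.9545 and 0.9973, which matches the 68%, 95% and 99.7% rule of thumb.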

Analyzing data through the lens of Z-scores removes much of the ambiguity and subjectivity in deciding what constitutes “extreme.” Yes, IQR offers this lens too, but that process simply divides up the results, using the full range of the numbers as a guide. Z-scores do the same but add another filter – standard deviation vs. raw numbers – and so this method is arguably superior.

At the very least, Z-scores provide a check on IQR results. If both methods align, that’s a much stronger signal than relying on either one in isolation. On the other hand, if there’s a conflict, that may be an indication that you need to go deeper in your outlier research.
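One way to run that cross-check is to flag outliers with each method and compare the two sets. The sketch below is illustrative only: the function names, the sample data, and the IQR rule are assumptions. In particular, the multiplier `k` is a hypothetical parameter: `k=1.5` gives the widely used Tukey fences, while `k=0` treats anything outside the middle 50% as extreme, and the rule used in the first article may differ from both.

```python
from statistics import fmean, pstdev, quantiles

def z_outliers(values, threshold=2.0):
    """Indices of values more than `threshold` standard deviations from the mean."""
    mu = fmean(values)
    sigma = pstdev(values, mu)
    return {i for i, v in enumerate(values) if abs(v - mu) > threshold * sigma}

def iqr_outliers(values, k=1.5):
    """Indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return {i for i, v in enumerate(values) if v < low or v > high}

# Hypothetical returns in percent (illustrative only)
returns = [4.0, 7.0, 9.0, 10.0, 11.0, 12.0, 13.0, 15.0, 18.0, 60.0]

by_z = z_outliers(returns)
by_iqr = iqr_outliers(returns)

print("flagged by both methods:", sorted(by_z & by_iqr))
print("flagged by only one method:", sorted(by_z ^ by_iqr))
```

On this sample both methods flag only the 60% return, so the signals align. Rerunning with `iqr_outliers(returns, k=0)` flags several more points, which shows how much the verdict depends on the parameters chosen.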

As a simple example, let’s review the S&P 500’s rolling one-year returns in Z-scores.

As the chart above shows, there are several instances where returns exceed 2SD, on the upside and downside. If we use 2SD as an indication of “extreme”, the Z-score history shows that about 1.5% of the return population (calculated daily) since 1961 are outliers. On the upside, those extreme returns range from about 40% to 75%.
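For readers who want to reproduce this kind of count, the following sketch computes rolling one-year returns from a daily price series and reports the share that falls beyond 2SD. Everything specific here is an assumption: the prices are synthetic (randomly generated, not S&P 500 data), the one-year window is taken to be 252 trading days, and the function names are hypothetical. The output will not match the 1.5% figure cited above.

```python
import random
from statistics import fmean, pstdev

TRADING_DAYS = 252  # assumed length of a one-year window, in daily observations

def rolling_returns(prices, window=TRADING_DAYS):
    """Percent change between each price and the price `window` observations earlier."""
    return [
        (prices[i] / prices[i - window] - 1.0) * 100.0
        for i in range(window, len(prices))
    ]

def share_beyond(values, threshold=2.0):
    """Fraction of values more than `threshold` standard deviations from the mean."""
    mu = fmean(values)
    sigma = pstdev(values, mu)
    extreme = sum(1 for v in values if abs(v - mu) > threshold * sigma)
    return extreme / len(values)

# Synthetic daily prices: a random walk with a small upward drift (illustrative only)
rng = random.Random(42)
prices = [100.0]
for _ in range(5000):
    prices.append(prices[-1] * (1.0 + rng.gauss(0.0003, 0.01)))

returns = rolling_returns(prices)
print(f"{share_beyond(returns):.1%} of rolling one-year returns lie beyond 2SD")
```

To apply the same steps to actual index data, replace the synthetic `prices` list with a list of daily closing values.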

By comparison, the IQR results reflect a range of 0% to 19%, which means that “extreme” via IQR on the upside is defined as returns above 19% — a much lower bar vs. the Z-score results we’re using. (Note that the current 1-year return is roughly +32%, as of Nov. 9, 2021.)
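The gap between the two bars is easier to see when both are expressed in return terms. The sketch below converts a 2SD threshold back into a return level (mean plus two standard deviations) and sets it beside the third quartile. Treating the third quartile as the IQR upside cutoff is an assumption based on the description above, and the sample data and function name are hypothetical.

```python
from statistics import fmean, pstdev, quantiles

def upside_cutoffs(returns, z_threshold=2.0):
    """Return level above which each method would call a return 'extreme'."""
    mu = fmean(returns)
    sigma = pstdev(returns, mu)
    _, _, q3 = quantiles(returns, n=4, method="inclusive")
    return {
        "z_score": mu + z_threshold * sigma,  # return level equal to +2SD
        "iqr": q3,                            # assumption: top of the interquartile range
    }

# Hypothetical one-year returns in percent (illustrative only)
sample = [-20.0, -5.0, 0.0, 5.0, 8.0, 10.0, 12.0, 15.0, 19.0, 25.0, 30.0, 45.0]
cutoffs = upside_cutoffs(sample)

print(f"Z-score upside cutoff: {cutoffs['z_score']:.1f}%")
print(f"IQR upside cutoff:     {cutoffs['iqr']:.1f}%")
```

On this sample, as in the S&P 500 results, the Z-score cutoff sits well above the IQR cutoff, so the two methods label very different numbers of observations as extreme.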

Which method is right? Or wrong? Neither. We’re simply running statistical tests and beauty (and statistical truth) is in the eye of the beholder. With two sets of results in hand, we need to step back and review our assumptions and parameters. Is 2SD appropriate for this Z-score analysis? Is IQR inappropriate? What, exactly, are we trying to analyze/anticipate when looking at 1-year stock market performance?

The point is that there are no simple rules for defining outliers. Much depends on what we’re trying to achieve. This process isn’t just numbers — it’s about us. Two investors running the same analysis may come to two different conclusions due to different expectations, objectives, risk tolerance, etc.

It’s clear that even for this simple example there’s more work to be done. We’ve made a good start with developing basic results, but additional testing is needed to define “outlier” with a higher degree of confidence.

In the next installment we’ll go deeper. As part of the process we’ll also lay out some expectations in terms of what we’re trying to achieve. Statistical tools are a means to an end. To avoid spinning our wheels and going down rabbit holes, it’s essential to clearly lay out why we’re running the numbers.

Statistics without guidance is like driving without a destination: you may see some amazing scenery, but if you don’t know where you’re going the trip may bring more confusion than clarity.

By James Picerno