Forecasting has a battered reputation in finance and economics, and for all the obvious reasons. But that doesn’t change the simple fact that you can’t avoid prediction when it comes to macro and markets. Thinking otherwise is delusional. Investing is inherently an act of forecasting. The only reason to buy (or sell) an asset is the presumption that the price will rise (or fall) at some point in the future. That leaves us to ponder why prices might rise or fall. A reasonable place to start: the business cycle. Either way, you’ll need a model of one form or another to develop some intuition about the key question: Will prices rise or fall (or will the state of macro change)? There are lots of models to choose from, but the challenge is interpreting the raw data they generate. That’s where a probit regression (or its close cousin, logit regression) can help.

Regular readers know that I routinely use a probit model to transform the raw data of the Economic Trend Index (a measure of the US business cycle) into recession-risk probabilities (see Monday’s update, for instance). I often receive questions about how a probit model works. Here’s a brief tour of this quantitative tool, which I use quite a lot for interpreting the signals from a variety of models.

As a simple example, let’s review the Treasury yield curve as a predictor of recessions. In particular, I’ll follow the outline in Estrella and Trubin’s 2006 paper (“The Yield Curve as a Leading Indicator: Some Practical Issues”). The premise is that the yield curve tends to go negative (short rates above long rates) in advance of or in the early stages of recessions. That, at least, is what history tells us, although one might wonder if the relationship will hold in the years ahead. In any case, Estrella and Trubin study the connection between a negative curve and the state of the business cycle 12 months later. With some basic programming in R (see the code below), we can easily see how this relationship translates into recession risk probabilities:

The yield-curve model’s estimates of business cycle risk tend to rise sharply during periods that NBER subsequently labels as “recessions,” as shown by the gray bars. A real-time data series such as the yield curve, which is immune to revision, is helpful because it’s timely and therefore serves as a rough guide for anticipating NBER’s future decision on dating the start of a new downturn.

The econometric machinery that produced the graph above, as per Estrella and Trubin (2006), is a regression of the current month’s NBER recession signal (0 = no recession, 1 = recession) on the yield curve spread from 12 months earlier. The probit model fitted to this historical relationship spits out the probability that the recession signal is 1 rather than 0. This is valuable because at any given point in time we have Treasury yield data, but the true state of the business cycle is unknown. In time, NBER will declare the start of a recession, but only well after the fact. Therein lies the power of using a probit model to interpret the yield curve signals: a real-time *estimate* of what NBER will tell us down the road.
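
To make the mechanics concrete, here’s a minimal sketch on simulated data (not the FRED series). In probit terms, the model estimates P(recession = 1) = Φ(a + b × lagged spread), where Φ is the standard normal cumulative distribution function. The variable names and the “true” coefficients in the sketch are assumptions chosen purely for illustration.

```r
# Simulated illustration of the probit mechanics (made-up data, not FRED data)
set.seed(42)

n <- 600                                       # 600 hypothetical months
spread.lag <- rnorm(n, mean = 1.5, sd = 1.2)   # stand-in for the 12-month-lagged spread

# Assumed "true" relationship, for illustration only:
# a lower (or negative) spread raises the chance of recession
a.true <- -0.5
b.true <- -0.8
recession <- rbinom(n, size = 1, prob = pnorm(a.true + b.true * spread.lag))

# Probit regression of the 0/1 signal on the lagged spread
fit <- glm(recession ~ spread.lag, family = binomial(link = "probit"))

# Linear predictor (a z-score) and its conversion to a probability
z <- predict(fit, type = "link")
p.manual <- pnorm(z)
p.direct <- predict(fit, type = "response")    # same numbers in one step

# Estimated recession probability for a few hypothetical spread readings
new.spreads <- data.frame(spread.lag = c(-1, 0, 1, 3))
p.new <- predict(fit, newdata = new.spreads, type = "response")
print(round(cbind(new.spreads, prob = p.new), 3))
```

The two-step route (linear predictor, then pnorm) and the one-step route (type = "response") should agree, which is a handy sanity check on what the model is actually doing.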

All the usual caveats apply, of course, starting with the garbage-in-garbage-out warning. And let’s also recognize that a probit model is only useful for analyzing binary outcomes. In the example above, the goal is deciding if the recession signal is likely to be 0 or 1.

Can you use a probit model for analyzing expected returns for an investment strategy? Yes, but only in the sense that you’re looking for a binary result. The standard example is asking if a given investment model is projecting a positive or negative return for some future time horizon. Meb Faber’s widely cited tactical asset allocation strategy is an obvious candidate, for instance (“A Quantitative Approach to Tactical Asset Allocation”).
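
As a rough sketch of that binary framing, the code below fits a probit to the sign of next month’s return, using a trend signal loosely inspired by a 10-month moving average rule. Everything here is an assumption for illustration: the returns are simulated, the signal definition is hypothetical, and this is not Faber’s strategy itself. Because the simulated returns are random, the fitted probabilities will hover near the base rate; the point is the setup, not the result.

```r
# Simulated sketch: probit for the sign of next month's return (made-up data)
set.seed(7)

n <- 480
ret <- rnorm(n, mean = 0.005, sd = 0.04)   # hypothetical monthly returns
price <- 100 * cumprod(1 + ret)            # hypothetical price index

# Hypothetical trend signal: percentage gap between price and its
# trailing 10-month simple moving average (first 9 values are NA)
sma10 <- stats::filter(price, rep(1 / 10, 10), sides = 1)
gap <- as.numeric(price / sma10 - 1)

# Binary outcome: 1 if the following month's return is positive, else 0
up.next <- c(as.integer(ret[-1] > 0), NA)

# Drop months with no moving average yet or no following month
d <- na.omit(data.frame(up.next, gap))

# Probit regression of the up/down outcome on the trend signal
fit.ret <- glm(up.next ~ gap, data = d, family = binomial(link = "probit"))

# Estimated probability of a positive return in the following month
p.up <- predict(fit.ret, type = "response")
print(summary(p.up))
```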

There’s a wide array of possible uses for a probit model in macro and finance, but keep in mind that you still can’t get blood out of a stone. Even a relatively robust foundation like the Treasury yield curve is suspect because the resulting recession risk probabilities rely on one factor. If that factor’s relevance fades, even temporarily, the probit model’s output becomes useless. For that reason, I prefer to use a multi-factor design for estimating recession risk probabilities via the Economic Trend and Momentum indexes. As such, I’m diversifying the risk that any one factor will stumble. Nonetheless, the yield curve offers a clear illustration of how the probit model works and why it can help us interpret the raw signals generated by an investment or economic model.
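
For a sense of what a multi-factor design looks like in code, here’s a hedged sketch that adds two more predictors to the probit. The indicators are simulated stand-ins with hypothetical names; this is not the construction of the Economic Trend and Momentum indexes, just the general idea of letting several factors share the load.

```r
# Simulated sketch of a multi-factor probit (made-up indicators, not the ETI or EMI)
set.seed(123)

n <- 600
spread <- rnorm(n, mean = 1.5, sd = 1.2)   # hypothetical yield-curve spread
jobs   <- rnorm(n, mean = 0.1, sd = 0.3)   # hypothetical payrolls growth
output <- rnorm(n, mean = 0.2, sd = 0.7)   # hypothetical industrial production growth

# Assumed data-generating process, for illustration only
z.true <- -0.5 - 0.5 * spread - 1.5 * jobs - 0.6 * output
recession <- rbinom(n, size = 1, prob = pnorm(z.true))

# One-factor model versus multi-factor model
one.factor   <- glm(recession ~ spread,
                    family = binomial(link = "probit"))
multi.factor <- glm(recession ~ spread + jobs + output,
                    family = binomial(link = "probit"))

# Compare the fits; a lower AIC suggests the extra factors add information
print(AIC(one.factor, multi.factor))

# Recession probabilities from the multi-factor model
p.multi <- predict(multi.factor, type = "response")
```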

Finally, here’s the basic code for running the yield curve example above in R. You can crunch the data in Excel if you prefer (see page 3 in Estrella and Trubin for details), although the analysis is far easier (and faster) in R.

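# R code for generating the Treasury Spread Estimated Recession Probabilities chart

# Load packages
library(quantmod)  # getSymbols() for downloading FRED data
library(TTR)
library(zoo)
library(ggplot2)   # charting
library(reshape2)  # melt()

# List FRED tickers: NBER recession indicator, 10-year Treasury yield, 3-month T-bill rate
fred.tickers <- c("USREC", "GS10", "TB3MS")

# Download tickers (each series is assigned to an object named after its ticker)
getSymbols(fred.tickers, src = "FRED")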

# Generate monthly yield curve data (10-year yield minus 3-month T-bill rate)
GS10.TB3MS <- GS10 - TB3MS

# Identify last monthly date for yield curve data
last.date <- tail(index(GS10.TB3MS), 1)

# Generate lagged yield curve data: each month is paired with the spread from 12 months earlier
GS10.TB3MS.1yr.lag <- lag(GS10.TB3MS, 12)
GS10.TB3MS.a <- tail(as.numeric(window(GS10.TB3MS.1yr.lag, end = last.date)), 600)

# Generate NBER recession signal data (0 = no recession, 1 = recession) for the same 600 months
USREC.A <- tail(as.numeric(window(USREC, end = last.date)), 600)

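# Generate probit model data: regress the recession signal on the 12-month-lagged spread
treas.sp.p <- glm(USREC.A ~ GS10.TB3MS.a, family = binomial(link = "probit"))

# Fitted values on the linear-predictor (z-score) scale
treas.sp.p.f <- predict(treas.sp.p)

# Convert the fitted values to probabilities with the standard normal CDF
treas.sp.pct <- pnorm(treas.sp.p.f)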

# Generate data for chart

# Treasury yield curve: attach dates to the fitted probabilities
date <- as.xts(index(tail(window(GS10.TB3MS, end = last.date), 600)))
treas.probit.1 <- merge(date, treas.sp.pct)
probit.dates <- tail(treas.probit.1, 600)
dates.1 <- index(probit.dates)
b.1 <- data.frame(dates.1, treas.probit.1)
c.1 <- melt(b.1, id.vars = "dates.1")

# US Recession Signal: attach dates to the 0/1 NBER indicator
date.a <- as.xts(index(tail(window(USREC, end = last.date), 600)))
usrec.1 <- merge(date.a, USREC.A)
probit.dates.rec <- tail(usrec.1, 600)
dates.1.rec <- index(probit.dates.rec)
b.1.rec <- data.frame(dates.1.rec, usrec.1)
c.1.rec <- melt(b.1.rec, id.vars = "dates.1.rec")

# Generate chart: gray bars mark NBER recessions, blue line is the probit estimate
p1 <- ggplot(c.1.rec, aes(x = dates.1.rec, y = value)) +
  theme_bw(12) +
  geom_bar(stat = "identity", colour = "gray", alpha = 0.1) +
  ggtitle("Treasury Spread Estimated Recession Probabilities") +
  theme(legend.position = "none") +
  theme(axis.title.x = element_blank()) +
  theme(axis.title.y = element_blank())

p1 <- p1 + geom_line(data = c.1, aes(x = dates.1, y = value), colour = "blue")

p1

# End