September 10, 2012
A Model For GDP Forecasting
The government's first estimate of the nation's economic growth for the third quarter doesn't hit the streets until October 26. In the meantime, the burning question: Does the sluggish 1.7% annualized pace in Q2 for GDP imply more of the same for Q3? Or will we see a stronger reading in the next quarterly report? The world is awash in guesses, and now there's one more. Today marks the start of a new feature on The Capital Spectator: a regularly updated "nowcast" of the next quarter's GDP using standard econometric techniques.
I call it nowcasting because as the variables behind the forecast are updated, the forecast will be revised too. Over time, I'll publish a chart that tracks the evolving forecasts in real time against the actual data, so you'll be able to see for yourself how the forecasts compare with reality.
There's more to evaluating the economy than dissecting GDP, of course. But as the broadest measure of U.S. economic conditions in a single data point, and one that receives a fair amount of attention, it's only natural to develop a statistically robust estimate of this influential number for the period ahead. Like every other economic report, GDP should be considered in context with a range of indicators. But the case for starting with this metric has obvious appeal. It's only a beginning, however. In the weeks ahead, I'll introduce a broader suite of data estimates for developing additional perspective on risk premia and the broad economic trend. But I'm getting ahead of myself.
The Capital Spectator's model currently projects real annualized GDP growth of 2.3% in Q3, up moderately from Q2's 1.7% rise. The forecast draws on an analysis of the last 40 years of data, looking at how each individual factor correlates with quarterly GDP changes. The model uses those historical relationships to make predictions, based on the latest data points for each of the underlying variables. I'll get into a few of the details of how the model works, but first some graphical perspective on how the forecast stacks up vs. recent history:
In other words, the model's predicting an improvement in economic growth relative to the last published report for Q2. What's behind this forecast? The engine is a multiple regression of the quarterly percentage changes in GDP on the changes in 10 key economic and financial variables:
• nonfarm private payrolls
• real personal income excluding current transfer receipts
• real personal consumption expenditures
• industrial production
• ISM Manufacturing PMI Composite Index
• housing starts
• initial jobless claims
• the stock market (S&P 500)
• crude oil prices (spot price for West Texas Intermediate)
• the Treasury yield curve spread (10-year Note less 3-month T-bill)
Each of these data series plays a role in economic growth, or the lack thereof, of course. If we're looking to model the ebb and flow of the economy, the 10 indicators above are a reasonable short list. For running the regressions and estimating the coefficients, I used the historical period 1971 through the present. As a complete set of quarterly data becomes available, I'll re-estimate the relationships, although the changes will likely be small from quarter to quarter.
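For readers who want to see the mechanics, here is a minimal sketch of a multiple linear regression fitted by ordinary least squares. It is written in plain Python rather than R, so it is not the code behind the model. The two predictors, the data, and the function names (solve, ols, predict) are made up for illustration; the actual model uses the 10 indicators listed above.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # prepend a constant for the intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coefs, x_row):
    """Apply fitted coefficients to the latest values of the predictors."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], x_row))


# Made-up data: two hypothetical predictors, with y = 1 + 2*x1 - 0.5*x2
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]]
y = [1 + 2 * a - 0.5 * b for a, b in X]

coefs = ols(X, y)                      # estimate on the historical sample
forecast = predict(coefs, [6.0, 5.0])  # plug in the latest readings
print(coefs, forecast)
```

The same two steps, estimating coefficients on history and then plugging in the latest readings, are what the forecast amounts to. In practice a statistics package would handle the fitting and the diagnostics.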
There are, obviously, many more indicators to consider. Alternatively, one could argue that a shorter list is a better way to go. The search for the optimal tradeoff of parsimony vs. a richer read on the economy is a struggle with no obvious answer, but the list above, I believe, is a sensible compromise.
I didn't choose the list lightly. After spending a considerable amount of time crunching the numbers, I settled on the indicators above. The list isn't terribly surprising; any basic economics text will make the case for each of the data series for econometric and theoretical reasons.
Here's a quick overview of what's going on behind the scenes. First, for those who are interested in the technical details, I'm modeling the data via R, the statistical software environment. The first task is transforming the daily, weekly, and monthly data into a quarterly dataset so that it's directly comparable to GDP.
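As a rough illustration of that first task, the sketch below collapses dated observations into calendar-quarter averages. It is plain Python, not the R code used for the model, and it assumes simple averaging within each quarter; some series may be better summed or taken at quarter-end. The function names and the sample readings are made up.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def quarter_of(d):
    """Map a date to its (year, quarter) pair."""
    return (d.year, (d.month - 1) // 3 + 1)


def to_quarterly(observations):
    """Average (date, value) observations within each calendar quarter."""
    buckets = defaultdict(list)
    for d, value in observations:
        buckets[quarter_of(d)].append(value)
    return {q: mean(vals) for q, vals in sorted(buckets.items())}


# Made-up monthly readings, for illustration only
monthly = [
    (date(2012, 4, 1), 100.0),
    (date(2012, 5, 1), 101.0),
    (date(2012, 6, 1), 102.0),
    (date(2012, 7, 1), 103.0),
    (date(2012, 8, 1), 105.0),
]
quarterly = to_quarterly(monthly)
print(quarterly)
```

The same function works for daily or weekly data, since it only looks at the date attached to each observation.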
Next, there's the decision of how to measure the changes in the data. Is a quarter-over-quarter change superior to looking back two quarters vs. four quarters in searching for a "good fit" with quarterly changes in GDP? In the end, no one can really say for sure when it comes to deciding what'll work best in the future. As a result, I take an average of the percentage changes for the past one-, two-, three-, and four-quarter comparisons. Two exceptions: the yield curve and ISM Manufacturing Index. The former is calculated as the average spread for each quarter; the ISM Index is evaluated in terms of its difference relative to a neutral reading of 50 and translated into quarterly averages.
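To make the averaging concrete, here is a small Python sketch (again, not the model's R code). It assumes the one- to four-quarter changes are simple, non-annualized percentage changes weighted equally, which the description above doesn't spell out. The function names and numbers are hypothetical.

```python
def pct_change(series, t, lag):
    """Percentage change of series[t] versus series[t - lag]."""
    return (series[t] / series[t - lag] - 1.0) * 100.0


def blended_change(series, t, lags=(1, 2, 3, 4)):
    """Equal-weighted average of the 1-, 2-, 3- and 4-quarter percentage changes."""
    if t < max(lags):
        raise ValueError("not enough history for the longest lag")
    return sum(pct_change(series, t, lag) for lag in lags) / len(lags)


def ism_signal(monthly_readings, neutral=50.0):
    """Quarterly average of the ISM index's distance from the neutral 50 mark."""
    return sum(r - neutral for r in monthly_readings) / len(monthly_readings)


# Made-up quarterly levels for one indicator
levels = [100.0, 102.0, 101.0, 104.0, 106.0]
print(blended_change(levels, 4))

# Made-up monthly ISM readings for one quarter
print(ism_signal([49.0, 50.0, 54.0]))
```

Blending the four look-back windows is a way of not betting on any single one, in the same spirit as the equal weighting discussed further on.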
Some of the inspiration for this comes from the literature on nowcasting and mixed data sampling (for example, see Evans (2005), and Lahiri and Monokroussos (2011) and their bibliographies). To be sure, the model I'm using is relatively simple, but that's by design. The goals here include transparency and parsimony.
On that note, you may be wondering at this point if I tested various mixes of the ten datasets above. I did. The results? Without going into too much statistical detail, the 10-factor model above compares quite well with a dozen or so alternative combinations of the ten factors. Still, the 10-factor model doesn't exhibit the best fit. But simply choosing the model that did the best job of predicting in the past (i.e., the model with the lowest in-sample errors) isn't necessarily going to yield the best model going forward. In fact, there's a strong case for not simply picking what worked best in the past when searching for an economically robust model for the future.
The pitfalls of overfitting and other statistical traps are well known, but the key issue comes down to uncertainty. It's never clear which indicators will be relevant (or irrelevant) in the period ahead. As a first approximation of choosing a superior model, there's a strong case for favoring equal weighting and its equivalent in the design of forecasting models. The true optimal set of parameters is always unknown, and it's not obvious that analyzing all the various permutations of variables and model types will lead to better results on a regular basis.
One quick example in the extreme. After analyzing more than a dozen combinations of indicators for the model, I found that one of the strongest possibilities is simply using personal consumption expenditures by itself to forecast GDP. Its F-stat, for instance, is far higher than those of the multi-factor models I reviewed, and its median residual is considerably smaller. This is hardly a shock, considering that consumer spending represents about 70% of GDP.
But here's the question: do you want to bet the farm on developing intuition about GDP based exclusively on consumer spending? If you could only choose one factor, that's probably the first choice. History reminds us, however, that other factors sometimes play a role, perhaps a big role, in driving economic fluctuations. Consumer spending is important, but its influence varies from quarter to quarter. No one knows exactly which factor (or set of factors) will dominate next week, next month, etc., and so there's some logic to modeling a broader set of indicators that are known to have a strong relationship with the economy. The assumption here is that a broad set of relevant indicators will, in the aggregate, dispense useful information about where GDP is headed. The tool for extracting this information, which may not be obvious in any one indicator, is the workhorse of econometrics: the multiple linear regression model.
Still, let's be realistic: the perfect forecasting model in macro doesn't exist, or if it does, its designer is keeping it to himself. As for the Capital Spectator's 10-factor model, it has a reasonably good fit with GDP: the adjusted R-squared is 0.588, for example. But a look at a wider array of statistical metrics is a reminder that balancing simplicity against complexity in the search for a strong model is as much art as it is science.
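For anyone unfamiliar with the statistic, adjusted R-squared is ordinary R-squared with a penalty for each added predictor, which is why it's the fairer yardstick when comparing models with different numbers of factors. The Python sketch below shows the standard formulas. The inputs are made up and are not the model's actual sample size or fit.

```python
def r_squared(actual, fitted):
    """Share of the variance in actual values explained by the fitted values."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    ss_tot = sum((a - mean_y) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_obs, n_predictors):
    """Penalize R-squared for the number of predictors (intercept not counted)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Made-up example: raw fit first, then the penalty for a hypothetical
# 10-predictor model estimated on only 21 observations
r2 = r_squared([1.0, 2.0, 3.0, 4.0], [1.5, 1.5, 3.5, 3.5])
print(r2)
print(adjusted_r_squared(0.75, n_obs=21, n_predictors=10))
```

With few observations the penalty is steep. With a long quarterly history, the gap between the two measures narrows.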
Remember too that the goal here is the process rather than a single point forecast. As the 10 indicators are updated with new data, the GDP forecast will change. Once the official estimate is published, we'll move on to the next quarter. Monitoring how these estimates change, and how they ultimately compare with actual data, will provide valuable information for thinking about GDP going forward. But you can't learn much from one forecast at a single point in time.
For now, the obvious question to ponder: Will the 2.3% GDP forecast hold up as new data comes in? Stay tuned....
Posted by jp at September 10, 2012 10:12 AM
1) Macroeconomic Advisers do a similar thing and they usually have a nice accompanying chart showing how their forecasts change over time.
2) It may be better to split things up so that you forecast the major components of GDP and then aggregate them into a GDP growth number.
3) Consumer spending might be 70% of GDP, but most of the variability in GDP growth comes from business and residential investment and changes in inventories.
4) In modeling consumer spending, retail sales come out before monthly consumer spending does. This provides earlier insight.
5) In modeling business investment, core capital goods shipments is cointegrated with business investment.
Posted by: John Hall at September 10, 2012 12:41 PM