In Search of a Post-Pandemic Modeling Paradigm

The COVID-19 saga has caused real difficulties for risk modelers. Loss projections made using pre-pandemic models soared in mid-2020, as global economic data spiraled downward. Portfolio performance, however, has held up very well under extremely difficult circumstances.

The deviation between predicted and actual credit performance has been analyzed in detail by many commentators. Most cite generous government income-support schemes for furloughed workers and struggling small businesses as the reason for the surprisingly strong performance of most credit portfolios.

Normally, a sharp increase in economic slack is associated with higher delinquency and default. It therefore makes sense to believe that well-timed government stimulus helped mitigate the acute effects of a rapid-onset recession.

Tony Hughes

In my view, however, loss predictions would have been too high, even without government support programs. Analysis of historical recessions suggests that credit performance is always worse when underwriting quality erodes during the preceding expansion, and this did not happen for almost all lending products in the lead-up to COVID-19.

Put simply, recessions with causes external to the financial sector are much safer for banks than those triggered by the misdeeds of borrowers and lenders. Given that the pre-pandemic models were primarily trained on a credit-fueled recession, and then applied to a downturn whose origin was microbial, it is little wonder that model predictions were found to be too pessimistic.

While it is important to consider specific problems with 2019-era models, it is unrealistic to believe that issues like the impact of government support could have been anticipated by modelers. Instead, the question we should address is whether the pre-pandemic model building-and-validation process was lacking in some fundamental way.

The models that emerged from the 2019 vintage certainly missed badly. But would a different modeling paradigm have yielded analytics more suitable for the COVID-19 era and beyond?

Modeling Mentalities: Forecasting vs. Tail-Risk

Credit modeling is, and was, dominated by a forecasting mentality, but a tail-risk mindset is more appropriate. What do I mean by these mysterious terms, and how would this shift to a tail-risk mindset affect conventional model build-and-validation processes?

When your aim is baseline forecasting, you are trying to identify the most likely path for the target variable, irrespective of possible upside and downside risks. You primarily assess models by comparing their forecast accuracy over recent holdout samples. Centuries of practice have taught us that extremely parsimonious models typically produce the most accurate predictions.

When we apply this structure to scenario analysis, a key component of stress testing, we are faced with a dilemma. We are now trying to predict behavior conditional on a specific economic path, normally a recession of some form, as opposed to the unconditional forecasts previously considered. We would like to assess model performance during historical episodes of a similar ilk to the scenario being analyzed, but this is typically not practical given the available data.

To resolve this dilemma, stress testers validate their models using baseline forecast accuracy as the primary criterion. This is not the correct measure, but, alas, there are no realistic alternatives.

The models used for scenario analysis, unsurprisingly, then tend to mimic baseline prediction tools. The specifications are usually very tight, with only a few explanatory variables. Indeed, it is common to see stress testing models with only two or three economic drivers, and the unemployment rate often takes one of the available seats.

Take this to the logical extreme and imagine a stress testing model that only includes the unemployment rate. Under this hypothetical, two very different scenarios that happen to have the same unemployment dynamics will yield identical loss predictions. If COVID-19-style income support were offered in one scenario, but not the other, the difference would not be apparent in the model projections.

By specifying the model more liberally – perhaps by including some household income drivers, stock market indexes, inflation or government spending, for example – we would be able to paint a far more nuanced picture of different alternative scenarios. We wouldn’t necessarily hit the COVID-19 nail on the head, but it would give us a chance of analyzing different features of the data that may turn out to be relevant to future situations.
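The contrast between a tight and a liberal specification can be sketched with a toy example. Everything below is an assumption made for illustration: the linear form, the coefficients, the function names `tight_model` and `liberal_model`, and the two scenario paths are invented and are not estimated from any portfolio.

```python
# Illustrative only: every coefficient and scenario value below is invented.

def tight_model(scenario):
    """Projected loss rate (%) driven by the unemployment rate alone."""
    return [0.5 + 0.30 * u for u in scenario["unemployment"]]


def liberal_model(scenario):
    """Same structure plus a hypothetical household income growth driver."""
    return [
        0.5 + 0.30 * u - 0.10 * g
        for u, g in zip(scenario["unemployment"], scenario["income_growth"])
    ]


# Two scenarios with identical unemployment paths (%), differing only in
# household income growth (%), which stands in for income support here.
no_support = {"unemployment": [4.0, 13.0, 9.0], "income_growth": [2.0, -6.0, -2.0]}
with_support = {"unemployment": [4.0, 13.0, 9.0], "income_growth": [2.0, 5.0, 1.0]}

tight_same = tight_model(no_support) == tight_model(with_support)
liberal_same = liberal_model(no_support) == liberal_model(with_support)

print("Tight model gives identical projections:", tight_same)
print("Liberal model gives identical projections:", liberal_same)
```

The tight model cannot distinguish the two scenarios because its only input is the same in both. The liberal model separates them as soon as the income paths diverge. Whether the extra driver improves baseline forecast accuracy is a separate question, which is exactly the tension described above.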

A forecasting mindset yields very tight models, whereas a tail-risk mindset demands a far more liberal approach to model specification.

A Tragic Modeling Lesson

Every summer, a major offshore yacht race, the Sydney-to-Hobart, is held in Australia. In 1998, the race was met with tragedy when part of the fleet sailed headlong into the worst storm in the race’s history.

This situation illustrates a number of interesting risk-modeling dilemmas. Here our focus is on the weather forecasters and their communications with race participants.

The forecasters were using eight models, each of which told a different story about the timing and intensity of the storm. Two days prior to the race, the preponderance of predictions suggested strong winds, but did not indicate the formation of a dangerous storm complex. One historically reliable model, however, was more certain that an intense storm would coalesce directly across the race path.

Every race participant, obviously, was interested in tail risk. So, the fact that the eighth model provided a “something fishy is going on here” moment (i.e., that a dangerous storm was likely to develop) was very important.

One lesson here is that it is often very useful to have many models available when assessing risk. This is especially true if the models are all well-built and attack the problem from different angles or with different datasets. If eight such models agreed that an intense storm was likely, the race would probably have been cancelled by organizers.

Challenge the Prevailing Wisdom

The tragic tale also illustrates the importance of dissent in risk management. Seven sound models suggested that a storm would not occur, but, for risk managers, this would not have been enough. The eighth model was vital, because it rang a warning bell about looming dangers that would have otherwise been missed.

If you have a single model, or several models that are effectively the same, you will never see such a rogue prediction. In risk management, there should always be room for heterodox analytical approaches – so long as they are defensible – that challenge prevailing conventions. I suspect, however, that such models would have failed validation processes (in most jurisdictions) in 2019 – and may have run into the same problem this year.

If a tail-risk mindset were adopted by the industry, risk modeling would involve scores of models, all seeking to cast light on different aspects of tail behavior. Some of these would have a narrow focus, considering only certain aspects of borrower behavior. Others – probably most – would be specified very liberally. The models would then be validated collectively, rather than individually.
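One way to picture reading a collection of models together, rather than one at a time, is sketched below. This is only an assumed illustration: the function `summarize_ensemble`, the `dissent_ratio` threshold, the model names and the projected values are all hypothetical, and a real collective validation process would involve far more than a median and a cutoff.

```python
from statistics import median

# Illustrative only: model names, projections and the threshold are invented.

def summarize_ensemble(predictions, dissent_ratio=1.5):
    """Report the consensus view and flag models that warn of a worse outcome.

    predictions maps a model name to its projected severity (for example,
    a stressed loss rate in percent). A model counts as a dissenter when its
    projection exceeds dissent_ratio times the ensemble median.
    """
    consensus = median(predictions.values())
    worst_model = max(predictions, key=predictions.get)
    dissenters = sorted(
        name for name, value in predictions.items()
        if value > dissent_ratio * consensus
    )
    return {
        "consensus": consensus,
        "worst_model": worst_model,
        "worst_case": predictions[worst_model],
        "dissenters": dissenters,
    }


# Seven models broadly agree; the eighth tells a very different story.
predictions = {
    "model_1": 2.1, "model_2": 2.4, "model_3": 1.9, "model_4": 2.2,
    "model_5": 2.6, "model_6": 2.0, "model_7": 2.3, "model_8": 6.5,
}

summary = summarize_ensemble(predictions)
print("Consensus projection:", summary["consensus"])
print("Worst case:", summary["worst_case"], "from", summary["worst_model"])
print("Dissenting models:", summary["dissenters"])
```

With these invented numbers, only `model_8` is flagged. If every model in the collection were effectively the same, the list of dissenters would be empty and the warning would never surface. A forecaster would naturally focus on the consensus figure, while a risk manager would start with the dissenters.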

Risk managers should have a very wide range of opinions at their disposal. Ideally, they should be of high quality. Coming from different angles and using different datasets, they should all add to the story of what’s unfolding.
