Sometimes you perceive something as a major risk, but the reality exposed by the data just doesn’t live up to your expectations. This is, in fact, what many modelers (myself included) experienced during the pandemic: expectations were constantly challenged by data.
If your prior views are heartfelt or help to define your professional persona, this can be a very confronting experience.
When building a model, the data sometimes forces you to challenge your prior expectations. In some cases, this is fairly trivial – you may expect, for instance, a particular variable to be significant or for a model to pass a particular diagnostic test. Sometimes, in contrast, empirical research forces you to change your entire world view – a shibboleth is built around a particular concept and the data categorically explodes the myth.
Data science can be exhilarating and discombobulating in equal measure, and one enlightening example comes immediately to mind.
A few years ago, I built a model to forecast used vehicle prices. I explained the structure of the model I wanted to my team and asked them to also build a challenger, so we could conduct a robust validation.
To my chagrin, my team demonstrated that my precious modeling approach had failed. In fact, the model that I had put my heart and soul into formulating performed significantly worse than the humble challenger – built with a far simpler, more classical model structure.
My champion had been thoroughly routed!
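A champion-challenger comparison of this kind boils down to scoring both models on the same holdout data. The sketch below is a minimal illustration in Python, not the validation my team actually ran: the prices and forecasts are made-up numbers, and the function names (`rmse`, `compare_models`) and the choice of RMSE as the yardstick are assumptions for the example.

```python
import numpy as np


def rmse(actual, predicted):
    """Root mean squared error between observed and forecast values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


def compare_models(actual, champion_pred, challenger_pred):
    """Score both models on the same holdout sample and name the winner.

    Ties go to the champion, so the challenger must be strictly better.
    """
    champion_rmse = rmse(actual, champion_pred)
    challenger_rmse = rmse(actual, challenger_pred)
    winner = "challenger" if challenger_rmse < champion_rmse else "champion"
    return {
        "champion_rmse": champion_rmse,
        "challenger_rmse": challenger_rmse,
        "winner": winner,
    }


# Hypothetical holdout prices and forecasts (illustrative numbers only).
holdout_prices = [10000.0, 12000.0, 9000.0, 15000.0]
champion_forecast = [11000.0, 13500.0, 8000.0, 13000.0]
challenger_forecast = [10200.0, 11800.0, 9100.0, 14700.0]

result = compare_models(holdout_prices, champion_forecast, challenger_forecast)
print(result)
```

The important design choice is that both models are judged on data that neither saw during fitting. A single error metric on a single holdout sample is a starting point, not a full validation.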
Adjusting to the Unexpected
Let’s now explore the choices available to a risk modeler when these types of situations arise. The most humbling aspect of science is the recognition that almost everything you think you currently know will one day be proven wrong. Under such circumstances, the most obvious course of action involves abandoning your prior views and going wherever the data happens to lead.
Instead, let’s imagine that you remain certain that, despite recent experience, future pandemics represent a dire threat to banks. Perhaps we were saved from disaster this time by a particular feature of the economy that is unlikely to be present in future. Alternatively, maybe the next pandemic will be subtly different (not necessarily worse), making it much more dangerous to the financial system.
All this is certainly plausible, but an evidence-based, data-driven approach to risk cannot measure correlations that have yet to be observed. Simply put, the current evidence suggests, on its face, that pandemics are benign for credit behavior. We can either accept this reality or commit to doing the work needed to demonstrate, using hard data, the threat that future possibilities pose.
The only other option is to wait for the next pandemic and see whether the world responds differently.
Taking up the empirical challenge is a hard, but potentially very rewarding, row to hoe. It involves sourcing new data and designing alternative modeling tools to try to tease out the impact of the current crisis on the behavior of lenders and borrowers.
Though your goal in this process is to find hidden evidence that your world view is actually correct, the journey you take will invariably throw up new insights about the behavior of the financial system during these strange days. It’s fair to say, moreover, that these insights would not have been available had you not set out to slay the white whale.
If you can find some kind of kink in the data, it may even be possible to prove (insofar as anything can be proved with statistics) that the pandemic really was a huge threat and that we were just lucky that the financial system dodged the bullet. Of course, nonlinearities are famously difficult to evidence, which is why linear relationships still dominate the empirical discourse.
Indeed, it’s extremely hard to break the hegemony of linear models, because you need either a mountain of data or a history of successful prediction to justify rejecting simple models in favor of specific, nonlinear formulations.
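To make the linear-versus-kink comparison concrete, here is a minimal sketch in Python. It fits a straight line and a line with one hinge term at an assumed, pre-specified knot, then computes the F statistic for the single extra parameter. The data are synthetic, the knot location is taken as known, and the function names (`residual_sum_of_squares`, `kink_f_statistic`) are hypothetical; none of this is drawn from actual credit data.

```python
import numpy as np


def residual_sum_of_squares(design, y):
    """Fit ordinary least squares and return the residual sum of squares."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)


def kink_f_statistic(x, y, knot):
    """F statistic for adding one hinge term, max(x - knot, 0), to a line.

    Assumes the knot is fixed in advance rather than chosen from the data.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    ones = np.ones_like(x)
    linear = np.column_stack([ones, x])
    kinked = np.column_stack([ones, x, np.maximum(x - knot, 0.0)])
    rss_linear = residual_sum_of_squares(linear, y)
    rss_kinked = residual_sum_of_squares(kinked, y)
    dof = len(y) - kinked.shape[1]  # residual degrees of freedom, kinked model
    f_stat = (rss_linear - rss_kinked) / (rss_kinked / dof)
    return f_stat, rss_linear, rss_kinked


# Synthetic data with a genuine kink at x = 6 (illustrative only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
noise = rng.normal(0.0, 1.0, size=x.size)
y = 1.0 + 0.5 * x + 2.0 * np.maximum(x - 6.0, 0.0) + noise

f_stat, rss_linear, rss_kinked = kink_f_statistic(x, y, knot=6.0)
print(f"F statistic for the hinge term: {f_stat:.1f}")
```

Here the kink is large and the sample is clean, so the hinge term is easy to detect. With a subtler kink, noisier data, or a knot that is itself chosen by searching the data (in which case the usual F critical values no longer apply), the evidence becomes far weaker, which is exactly why the simple linear model is so hard to dislodge.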
One day soon, risk analysts will probably be asked to formulate pandemic stress tests. This will be a strange experience, because the evidence suggests that the level of stress should, in general, be mild or nonexistent.
In a sense, though, the level of stress should be irrelevant – the journey should be more important than the destination. A data-driven stress test with few prior expectations should give risk departments a chance to fully reflect on everything that happened in 2020/21 and explore interesting data, wherever it leads.
That’s the thing with prior expectations. Without them, our research would be haphazard and purposeless – but if they’re too strong, they can blunt our ability to learn all the lessons offered by the data.
Ultimately, of course, it’s the data that always decides.