The flash crash of May 6, 2010 has gone down in history as a half-hour in which U.S. stocks cratered and, for a very short moment, nearly a trillion dollars in market value evaporated and it looked like the world was heading into a depression worse than 1929.
And it was all because a computer made a mistake.
The problem lay in the models behind high-frequency traders, who saw aggressive selling of E-Mini S&P 500 futures contracts and followed suit, exacerbating the decline. Except the signal they acted on was completely misinterpreted, and the reasons are fairly complicated.
Ultimately, it comes down to the models these HFTs use to predict what will happen in the markets. The assumptions behind those models were wrong in this particular case, and for a few minutes the result looked like financial catastrophe.
A similar glitch happened on January 25 this year, although the reasons why it happened haven't been fully determined yet. No matter: in both cases there is tremendous room for these math-heavy models to reach badly mistaken conclusions about market dynamics.
Smaller and less complicated models can have the same problem. A bank with a million retail credit accounts can end up losing a lot of money if its model suggests that certain borrowers won't default when they actually will. This is why banks invest so heavily in constantly improving their risk management protocols, models, software, and so on.
Managing risk ultimately involves acquiring data from the past and using that to provide probabilistic forecasts for future results. This works if the method of analyzing that data is good and there are no exogenous shocks that would make forecasting impossible. Theoretically, such a closed system could be near 100% predictable with the best model.
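To make that concrete, here is a minimal sketch of "past data in, probabilistic forecast out." The bucketing scheme, the function name, and all the account data are invented for illustration; real bank models are far richer, but the logic is the same: observed historical frequencies become the forecast for similar future accounts.

```python
# Hypothetical sketch: forecasting a probability of default (PD) from
# historical account data by bucketing accounts on a credit score.
# All names and data here are made up for illustration.
from collections import defaultdict

def historical_pd(accounts, bucket_size=100):
    """Map each score bucket to its observed default frequency."""
    defaults = defaultdict(int)
    totals = defaultdict(int)
    for score, defaulted in accounts:
        bucket = score // bucket_size
        totals[bucket] += 1
        defaults[bucket] += int(defaulted)
    # The observed frequency becomes the probabilistic forecast for new
    # accounts in the same bucket -- valid only if the future resembles
    # the past, i.e. no exogenous shocks.
    return {b: defaults[b] / totals[b] for b in totals}

history = [(520, True), (540, True), (560, False),   # 500s: 2 of 3 defaulted
           (680, False), (650, False), (690, True),  # 600s: 1 of 3 defaulted
           (720, False), (750, False), (780, False)] # 700s: 0 of 3 defaulted

pd_by_bucket = historical_pd(history)
print(round(pd_by_bucket[5], 2))  # 0.67 -- forecast PD for a new 500s-score account
```

The caveat in the comment is exactly the article's point: the forecast is only as good as the assumption that tomorrow's borrowers behave like yesterday's.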
But even if you have that, you can still end up making huge mistakes. The best model possible will still result in disaster if the data put into the model is not good. How do you know what data is good and what isn’t? That is the realm of statistics, and improving data quality is a rabbit hole one should be paid a lot of money to go down.
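A toy demonstration of the point, with invented data: the forecaster below is unchanged between the two runs; only the input quality differs. A hypothetical ingestion bug that flips labels turns a correct forecast into one that overstates risk fourfold.

```python
# Hypothetical GIGO sketch: the same frequency-based forecaster, fed
# clean data versus corrupted data. The data and the bug are invented.

def default_rate(outcomes):
    """Forecast the future default rate as the historical frequency."""
    return sum(outcomes) / len(outcomes)

clean = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]   # true default rate: 20%
# Garbage in: imagine a data-feed bug that flips every label on ingestion.
garbage = [1 - y for y in clean]

print(default_rate(clean))    # 0.2 -- a sound forecast
print(default_rate(garbage))  # 0.8 -- garbage out: 4x the real risk
```

The model itself is blameless in the second run; no amount of modeling skill recovers from inputs like these, which is why the data-quality work described above is worth paying for.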
This is ultimately the GIGO problem, the "garbage in, garbage out" principle at the heart of computer programming. The best code cannot rescue garbage inputs, and neither can the best financial model. That is why this very simple principle can save banks billions of dollars.