In recent decades, a set of distinctive rituals has emerged in finance around the phenomenon known as “Fedspeak”. Whenever a central banker makes a comment, economists (and journalists) rush to parse it while traders place investment bets.
But if economists at the Richmond Fed are correct, this ritual could soon change. They recently asked the ChatGPT generative AI tool to parse Fed statements, and concluded that it “demonstrate[s] a strong performance in classifying Fedspeak sentences, especially when fine-tuned.” Moreover, “the performance of GPT models surpasses that of other popular classification methods”, including the so-called “sentiment analysis” tools now used by many traders (which crunch through media reactions to predict markets).
Yes, you read that right: robots might now be better at decoding the mind of Jay Powell, Fed chair, than other available systems, according to some of the Fed’s own human staff.
Is this a good thing? If you are a hedge fund hunting for a competitive edge, you might say “yes.” So too if you are a finance manager hoping to streamline your staff. The Richmond paper stresses that, for now, ChatGPT should be used only with human oversight: while it can correctly answer 87 per cent of questions in a “standardized test of economics knowledge”, it is “not infallible [and] may still misclassify sentences or fail to capture nuances that a human evaluator with domain expertise might capture”.
This message is echoed in the torrent of other finance AI papers now tumbling out, which analyse tasks ranging from stock picking to economics teaching. Although these note that ChatGPT could have potential as an “assistant”, to cite the Richmond paper, they also stress that relying on AI can sometimes misfire, partly because its data set is limited and imbalanced.
However, this could all change as ChatGPT improves. So, unsurprisingly, part of this new research also warns that some economists’ jobs might soon be threatened. That, of course, will delight cost cutters (albeit not the human economists themselves).
But if you want another perspective on the implications of this, it is worth looking at a prescient paper on AI that Gary Gensler, now chair of the Securities and Exchange Commission, co-wrote with Lily Bailey back in 2020, while he was an academic at MIT.
The paper did not cause a huge splash at the time but it is striking, since it argues that while generative AI could deliver amazing benefits for finance, it also creates three big stability risks (quite apart from the current concern that intelligent robots might want to kill us, which the authors do not address).
One is opacity: AI tools are utterly mysterious to everyone except their creators. And while it might be possible, in theory, to rectify this by requiring AI creators and users to publish their internal guidelines in a standardised way (as the tech luminary Tim O’Reilly has sensibly proposed), this seems unlikely to happen soon.
And many investors (and regulators) would struggle to understand such data, even if it did emerge. Thus there is a rising risk that “unexplainable results may lead to a decrease in the ability of developers, boardroom executives, and regulators to anticipate model vulnerabilities [in finance],” as the authors wrote.
The second issue is concentration risk. Whoever wins the current battles between Microsoft and Google (or Facebook and Amazon) for market share in generative AI, it is likely that just a couple of players will dominate, along with a rival (or two) in China. Numerous services will then be built on that AI base. But the commonality of any base could create a “rise of monocultures in the financial system due to agents optimizing using the same metrics,” as the paper observed.
That means that if a bug emerges in that base, it could poison the entire system. And even without this danger, monocultures tend to create digital herding, or computers all acting alike. This, in turn, will raise pro-cyclicality risks (or self-reinforcing market swings), as Mark Carney, former governor of the Bank of England, has noted.
“What if a generative AI model listening to Fedspeak had a hiccup [and infected all the market programs]?” Gensler tells me. “Or if the mortgage market is all relying on the same base layer and something went wrong?”
The third issue revolves around “regulatory gaps”: a euphemism for the fact that financial regulators seem ill-equipped to understand AI, or even to know who should monitor it. Indeed, there has been remarkably little public debate about these issues since 2020, even though Gensler says that the three risks he identified are now becoming more, not less, serious as generative AI proliferates, creating “real financial stability risks”.
This will not stop financiers from rushing to embrace ChatGPT in their bid to parse Fedspeak, pick stocks or anything else. But it should give investors and regulators pause for thought.
The collapse of Silicon Valley Bank provided one horrifying lesson in how tech innovation can unexpectedly change finance (in this case by intensifying digital herding). Recent flash crashes offer another. However, these are probably a small foretaste of the future of viral feedback loops. Regulators must wake up. So must investors, and so must Fedspeak addicts.