The rapid rise of ChatGPT exposed OpenAI to a myriad of scaling issues and security threats, leading to some unique mitigation approaches.
When OpenAI decided to make its generative AI-powered ChatGPT tool available to the public, the company had an inkling that it would be popular, but not that it would quickly become the world’s fastest-adopted consumer software application, spurring the likes of Google and Meta to quickly create competing products.
But OpenAI’s meteoric rise was not without issues: chief among them, a global bottleneck in GPU availability and a steady stream of cybercriminal attention. Speaking at LeadDev West Coast, Evan Morikawa, engineering manager at OpenAI, shared how ChatGPT handled these early scaling challenges, and the steps the team took to ensure user safety.
LLM risks: Security, abuse, and misinformation
ChatGPT has over 100 million users, and its website receives over one billion visits per month. Not all of those visitors are innocently using the tool to summarize an essay or write a proposal.
The opportunity to use large language models (LLMs) as a weapon is a known risk, but ChatGPT quickly attracted bad actors looking to weaponize the AI-powered conversational model.
There has been early evidence of cybercriminals using these chat-based systems for developing malware code and creating convincing spear phishing emails. Research has also uncovered attempts to “jailbreak” the safety measures built into popular LLMs.
Morikawa says that OpenAI is dedicated to preventing the amplification of misinformation campaigns and abuse in any form, but recognizes the difficulty of staying one step ahead of bad actors. “If we [OpenAI] flat out blocked those attackers, they’d immediately know and adapt,” he said.
Instead, OpenAI has explored other avenues to lessen the threat of criminal behavior, such as turning to external researchers to find ways the system could be taken advantage of, then adding the appropriate guardrails before releasing its latest version.
Fig.1. “Bootleg” responses to user prompts that may be hostile in nature
In response to one set of attackers, members of the OpenAI security team served cat-themed API responses so that the attackers would know they had been caught out. Some took “CatGPT” in their stride, with one even giving the OpenAI team pointers on how to respond next time.
Fig.2. A disgruntled attacker with a suggestion for OpenAI’s next bootleg responses
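The pattern at work here, serving a plausible decoy instead of a hard block so that flagged clients get no clear signal to adapt to, can be sketched in a few lines. The sketch below is purely illustrative: every name in it is hypothetical, and OpenAI has not published how its version works.

```python
# Hypothetical sketch of the "decoy instead of block" pattern described
# above. None of these names come from OpenAI's codebase; the idea is
# simply that a hard error tells an attacker they've been caught, while
# a plausible-looking decoy response does not.
import random

CAT_FACTS = [
    "Cats sleep for around 13 to 16 hours a day.",
    "A group of cats is called a clowder.",
    "Most cats have no eyelashes.",
]

def is_flagged(api_key: str, flagged_keys: set[str]) -> bool:
    """Stand-in for whatever abuse-detection pipeline flags a client."""
    return api_key in flagged_keys

def run_model(prompt: str) -> str:
    """Stand-in for the normal inference path."""
    return f"(model output for: {prompt!r})"

def handle_completion(api_key: str, prompt: str, flagged_keys: set[str]) -> str:
    if is_flagged(api_key, flagged_keys):
        # Serve a harmless decoy instead of an error: the request still
        # "succeeds", so the attacker gets no clear signal to adapt to.
        return random.choice(CAT_FACTS)
    return run_model(prompt)

if __name__ == "__main__":
    flagged = {"sk-abuse-123"}
    print(handle_completion("sk-abuse-123", "Write a phishing email", flagged))
    print(handle_completion("sk-normal-456", "Summarize this essay", flagged))
```

The design choice mirrors Morikawa’s point above: a flat rejection tells an attacker exactly when detection kicked in, while a decoy keeps them guessing.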
OpenAI believes that an iterative approach with users has been one of the most important ways to identify and fix safety concerns. But that could all change. “While ChatGPT can alleviate the abuses of yesterday, the future abuse, safety, and alignment challenges are going to be a lot harder,” Morikawa said. “Our vigilance here does need to increase exponentially. That being said, we know it’s impossible for only us and our researchers to identify all possible avenues of abuse and misuse.”
You can’t fight an enemy you can’t see, so, for now, ChatGPT will continue with its safety efforts. And as the risks grow in complexity down the line, which Morikawa is certain they will, OpenAI will invest more to combat them. “The safety mission [at OpenAI] is so core though that this will be a continued area of focus going forward.”
GPUs: The answer or the problem?
In the early days of its release, ChatGPT had another big issue to grapple with: graphics processing units (GPUs), whose finite supply and intricate performance characteristics exacerbated the company’s scaling challenges.
“Asking ChatGPT to summarize an essay has vastly different performance characteristics than asking it to write one,” Morikawa explained. These variations led to bottlenecks that popped up in Whack-a-Mole fashion, without a pattern, making it difficult for GPU manufacturers to meet OpenAI’s needs. When designing chips, it can be difficult to optimize for memory bandwidth, a measure of how quickly data can be moved between a GPU’s memory and its compute cores, which drastically affects computation performance and “limits the value of the gains of these new GPUs,” Morikawa said.
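To see why memory bandwidth, rather than raw compute, so often sets the ceiling, consider a rough back-of-envelope calculation. The sketch below assumes that generating each token requires streaming every model weight from GPU memory once, a common simplification for the token-by-token decoding phase; all numbers are illustrative, not figures OpenAI has published.

```python
# Back-of-envelope illustration of why memory bandwidth caps generation
# speed: to produce each token, the GPU must stream (roughly) every model
# weight from memory once. All numbers below are illustrative.

def max_tokens_per_second(n_params: float, bytes_per_param: float,
                          mem_bandwidth_gb_s: float) -> float:
    """Upper bound on decode speed for a single memory-bound request."""
    bytes_per_token = n_params * bytes_per_param   # weights read per token
    bandwidth_bytes = mem_bandwidth_gb_s * 1e9     # GB/s -> bytes/s
    return bandwidth_bytes / bytes_per_token

# e.g. a 70B-parameter model in fp16 on a GPU with ~2 TB/s of memory bandwidth
print(max_tokens_per_second(70e9, 2, 2000))  # ~14 tokens/s per request
```

Under these assumptions, a single request tops out at roughly 14 tokens per second no matter how fast the GPU’s arithmetic units are, which is why batching many requests together, amortizing each weight read across them, is such a central efficiency lever.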
This side effect impeded certain launches and product features because the GPU capacity wasn’t there to sustain them. “The growth that ChatGPT saw could have been even bigger and faster if we weren’t actually limiting usage due to a finite supply of GPUs,” Morikawa said.
Fig.3. GPU progression
There are predictions that ChatGPT will require over 30,000 Nvidia graphics cards in the future, but for now, the company is focusing on how to use its GPUs more efficiently.
Having a team that can “jump across the stack and adaptively focus on the constraints of the system has proved incredibly useful,” Morikawa said. Crucially, so has a focus on the lowest levels of implementation. “For as much as I like to think of this as a black box that takes text in and spits slightly smarter text out the other side, in reality, the more people that dove really deep into the details of the box, the better we became.” These approaches allow OpenAI to stay on top of product decision changes and architectural shifts.
Small teams, big wins
When ChatGPT launched, the engineering team consisted of 30 people. Morikawa said, “We’ve been trying to stay small for as long as possible to maintain that iterative, scrappy, get-stuff-done culture.”
The ChatGPT team has intentionally modeled itself on a 10-month-old start-up. This has been an effective approach thus far: engineers feel a high sense of ownership, while interdependencies and process overhead stay low. Overall, the approach has made the team nimble, and the impact is far-reaching.
However, it also means that work that would occupy hundreds of engineers at another company is carried by a fraction of that number at OpenAI, and tech debt has been a heavy trade-off. Morikawa didn’t see this as a major problem, stating that they are “starting to invest more in these pan engineering platform teams to get ahead of some of this,” while also skewing heavily toward build over buy.
“I expect us to try and continue this kind of fractal start-up pattern as new product categories emerge going forward,” Morikawa said.