diff --git a/content/essays/scaling_outage.md b/content/essays/scaling_outage.md index 342d2b4..85224c6 100644 --- a/content/essays/scaling_outage.md +++ b/content/essays/scaling_outage.md @@ -10,7 +10,7 @@ tags: # optional; see Tags section - open # Epistemic profile — all optional; the entire section is hidden unless `status` is set -status: "Working model" # Draft | Working model | Durable | Refined | Superseded | Deprecated +status: "Draft" # Draft | Working model | Durable | Refined | Superseded | Deprecated confidence: 55 # 0–100 integer (%) importance: 3 # 1–5 integer (rendered as filled/empty dots ●●●○○) evidence: 1 # 1–5 integer (same) @@ -20,8 +20,10 @@ practicality: high # abstract | low | moderate | high | exceptional confidence-history: # list of integers; trend arrow derived from last two entries --- -Running a lab that developers frontier LLMs is somewhat like playing a game that, by all measurable metrics external, you are bound to lose. The amount of compute required to train a frontier LLM is unbelievably expensive. The expense of inference is even more astronomical. OpenAI claims at the time of this writing to have somewhere between 900 Million and 1 Billion active users, all of whom require some amount of inference cost, and some small subset of whom consume an enormous amount of compute - to use their words, this is ["commercial scale."](https://openai.com/index/accelerating-the-next-phase-ai/). This isn't to mention the immense amount of competition - there are many major players in the United States alone contributing models that push the boundaries. OpenAI may have been the first, but Anthropic, Google, Meta, xAI, and, yes, even Amazon and Bytedance are following right along. +Running a lab that develops frontier LLMs is somewhat like playing a game that, by every external measurable metric, you are bound to lose. The amount of compute required to train a frontier LLM is unbelievably expensive. 
The expense of inference is even more astronomical. OpenAI claims, at the time of this writing, to have somewhere between 900 million and 1 billion active users, all of whom incur some amount of inference cost, and some small subset of whom consume an enormous amount of compute - to use their words, this is ["commercial scale"](https://openai.com/index/accelerating-the-next-phase-ai/). And that's to say nothing of the immense amount of competition - there are many major players in the United States alone contributing models that push the boundaries. OpenAI may have been the first, but Anthropic, Google, Meta, xAI, and, yes, even Amazon and ByteDance are following right along. Then there's the news that the stock market doesn't want to hear. Ask yourself: who is deliberately left off the above list? If you're thinking of models like GLM, Qwen, MiniMax, and the notorious DeepSeek, then we're on the same page. These models are rapidly approaching the capabilities of the frontier models that remain behind intrusive "competitive moats"^[This phrasing is adopted from Jared James Grogan's 2026 paper ["The End of the Foundation Model Era"](https://arxiv.org/abs/2604.06217)] that do little more than violate the rights of their users. The advantages such models provide are immense, and the labs on the first list cannot ignore the likelihood that these open models' prominence will only grow in the weeks and months to come. In fact, I hypothesize that we are already seeing frontier labs react to these increasing capabilities, visible through the lens of juxtaposition: the jargon has remained constant, as if to negate any possibility of an "AI Bubble" bursting, but the companies' quieter actions - the ones that aren't loudly announced and decreed - have shifted. ## Inference is the Name of the Game +Very few users of an LLM have ever attempted to train an LLM. 
Even those users who are technical powerhouses - and there are many of these^[Per OpenAI's account, Codex [has reached](https://openai.com/index/accelerating-the-next-phase-ai/) 2,000,000 weekly active users, and while I could not find any specific numbers that Anthropic has released regarding Claude Code's weekly user count, I presume it is higher than Codex's.] - +likely are not intimately familiar with the inner workings of transformers. Even those who - perhaps from coursework, perhaps from curiosity, perhaps from [a chat](https://claude.ai/share/5282e1b8-24ce-4cf8-983e-55df95f5fbdc) with an LLM of choice - have enough technical prowess to write code that could, in theory, train a naive transformer are unlikely to be able to train any model of substance, due to computational constraints. Consider, for instance, that [over 200,000 GPUs](https://x.ai/news/grok-3) were used to train Grok 3, a model from early 2025.
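The gap between "can write the code" and "can train the model" is mostly arithmetic. A rough sketch using the common ~6·N·D FLOPs rule of thumb for dense-transformer training makes the point - note that the model size, token count, and per-GPU throughput below are illustrative assumptions of mine, not figures from any lab:

```python
# Back-of-the-envelope training cost via the ~6 * params * tokens FLOPs
# rule of thumb for dense transformers. All three inputs are hypothetical.

PARAMS = 70e9      # assumed model size: 70B parameters
TOKENS = 15e12     # assumed training corpus: 15T tokens
GPU_FLOPS = 4e14   # assumed sustained throughput per GPU: 0.4 PFLOP/s

total_flops = 6 * PARAMS * TOKENS  # ~6.3e24 FLOPs

# Wall-clock time on one GPU versus a 200,000-GPU cluster.
one_gpu_years = total_flops / GPU_FLOPS / (365 * 24 * 3600)
cluster_days = total_flops / (200_000 * GPU_FLOPS) / 86_400

print(f"total training compute: {total_flops:.1e} FLOPs")
print(f"one GPU: ~{one_gpu_years:.0f} years")
print(f"200,000-GPU cluster: ~{cluster_days:.1f} days")
```

Under these (generous) assumptions, a single GPU would need centuries while a Grok-3-scale cluster finishes in about a day - which is exactly why the hobbyist who can write a training loop still cannot train a model of substance.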