What do the Hyperscalers Want?

In the technology industry we tend to think about the hyperscalers as all-conquering giants. They have scale, negotiating leverage and vast troves of cash. Their power seems almost mythical and conversations about them often veer towards the monolithic. But these are just companies and if we think about them in those terms, we can learn a few important things for the future of semiconductors.

First, it is important to remember that the biggest data center operators are not invulnerable. Prior to the launch of ChatGPT, most people admired Microsoft’s return to relevance in enterprise software but assumed the company would eventually succumb to its own weight; it has since proven very capable of innovation. At the same time, the launch of ChatGPT set a fire under Google and Amazon, if not inspiring outright panic, forcing those companies to move incredibly quickly. These companies are vulnerable, and more importantly, they are acting that way. Perhaps the most impressive accomplishment of all three is that they were able to move that quickly in new directions. Everyone in technology is one bad product cycle away from obsolescence, and these companies seem to recognize this and are able to act on it.

Another important concept to remember is that these are companies, which means they have strategies, goals and incentives just like everyone else. When we talk to chip companies about their products, they talk a lot about the hyperscalers and about what features those customers want. They talk about speed, power and performance. They even talk about use cases. But we very rarely hear discussions of what the hyperscalers hope to achieve with all those things. The old adage in sales is that people do not buy products, they buy solutions to their problems. What problems are the hyperscalers actually facing?

Here we have to disentangle the hyperscalers. There is a tendency to conflate them into one group with a common set of interests, but this is not the case. Google controls its entire software stack, so it makes a lot of sense for them to design their own chips tailored to those software workloads, hence all those TPUs and VCUs. By contrast, AWS has to run everyone else’s workloads, over which it has no control. Microsoft sits somewhere in between: it needs to support applications like Office 365, which it does control, while Azure has to run a lot of software beyond its control. For those last two, designing custom silicon is much trickier. As we have been saying for a while, designing an internal chip just to save a bit of money paid to merchant chip vendors does not really make economic sense. Roll-your-own chips only make sense if the chip conveys some form of strategic advantage. That is why Google has done so well with its TPUs, while Amazon’s Graviton CPUs are positioned merely as cost-saving alternatives for general compute.

In this context, it is important to bear in mind those strategic goals. Take Amazon as an example. They do so much that it can be hard to piece together what it is they are trying to achieve. They do not give out much information on the status of all of AWS’s product offerings. And at times, we are not sure anyone at Amazon is entirely clear on the big picture direction. But there do seem to be some clear patterns. AWS enjoyed an early first mover advantage; exposing their internal compute processes and internal APIs to external customers got them to scale very quickly. AWS grew rapidly, but now seems to have become addicted to that growth. We can see a pretty tight correlation between growth in data centers and AWS revenue. Ultimately, this will prove unsustainable, as the company will run out of new markets in which to add availability zones. So over time, the nature of their products has begun to shift.

AWS began by selling raw compute and storage, Infrastructure as a Service (IaaS). Then they began to sell more complete software packages – running particular software applications – becoming a Platform as a Service (PaaS). This is the beauty of AWS’s scale. A small team of software engineers can build a package of software and then deploy it across all of AWS. With this, AWS provides a higher degree of service and so can charge a premium, while the underlying costs are essentially the same as for IaaS.

Ultimately, the company would like to go further and sell working applications, Software as a Service (SaaS), which brings even higher prices and thus margins. AWS has a long history of seeing what its customers are running and then launching their own version of that. The best known example of this is Netflix and Prime Video. We have to think that Amazon looks at companies like Snowflake, which make a lot of money on AWS infrastructure, and thinks “We can do that too…”

Of course, moving from PaaS to SaaS is not quite so easy. SaaS is already a crowded field, with complex sales mechanics. Moreover, AWS has to be careful not to tread too closely on its customers’ toes. Ideally, they would find a whole new, green field of software in which to compete.

It is 2024, so of course this brings us to AI. Here is a whole new class of software in which no one dominates; no one even has a clear picture of what dominance would look like. So this is a market where Amazon could compete without immediate risk of losing a major customer. It is also a product for which they could control their own software and thus design a highly performant chip, giving them some form of competitive or economic advantage. And even if all that is too hazy, the economics of diversifying away from Nvidia’s currently very expensive chips may be enough to justify their own chip. This could become a major strategic win for the company.

That all being said, there is a lot that could go wrong. For one, AWS is not alone in pursuing this. Microsoft’s Azure has a similar set of objectives and they also now have their own AI accelerator. A bigger problem is that there are still so many unknowns in the AI software realm. Today’s hot chip could be obsolete by the time it gets back from the fab and is in full production.

We have no idea what AWS is actually working on in this field, beyond their intentions to keep advancing their Trainium and Inferentia chips. We are really driving at the point that the industry could do with a better understanding of what the hyperscalers really need from their custom silicon.