There are many terms thrown around in cloud computing, such as cloud-based, cloud-first, cloud-agnostic, and, most recently, cloud-native. Unfortunately, different companies use these terms differently, and their meaning may change depending on whether they reference hardware or software. For several years, however, Ampere has used the term “Cloud Native Computing” to describe the change in software and the need for new types of processors to run that software, such as the Ampere Altra and Altra Max. So, I sat down with Ampere’s Chief Product Officer, Jeff Wittich, to better understand Ampere’s concept of cloud native computing. Jeff has been the company’s Chief Product Officer for the past four years and worked on cloud computing solutions for Intel for fifteen years before that.
(Note: Interview transcript edited for readability.)
JIM MCGREGOR: We’re going to talk today about cloud computing and what cloud computing really needs, or more importantly, what cloud native computing is. But before we get into the details of what cloud native computing is, let’s talk about the problem. What are the key issues Ampere is trying to solve in cloud computing, and why are they so important?
JEFF WITTICH: Well, over time the problems around how we continue to grow cloud computing have shifted. A couple of years ago, the biggest concerns were really around cost, and maybe performance. But over the last couple of years, with the growing number of data centers and with emerging workloads like AI inferencing and training, the amount of power being consumed by these data centers has become really, really important. Without sufficient power, you can’t build out enough compute capacity to serve all of these growing workloads, to supply all of the cool applications and websites to everyone around the world. So, when it comes to that future growth, we need to find ways to build out cloud computing that don’t consume as much power but still deliver as much, if not more, performance than what we’ve had over the last couple of years.
And data centers not only have limited power, but in some cases, data centers also have limited space. It’s becoming harder and harder to build out more data centers, especially in regions like Singapore or in London that are already very, very crowded today. So, power, space, and then even water becomes a key consideration. Water gets used for aspects of the cooling process to run the air conditioning, so even water can become a really scarce resource. We’ve got a big challenge on our hands. Increasing amounts of compute need to be built out, but we can’t continue to consume resources the way that we have over the last decade or so.
JIM MCGREGOR: Now let’s put this into perspective, because I didn’t realize how big some of the numbers were. There are at least 8,000 data centers around the world, and in some cases, on a country-by-country basis, they are consuming terawatt-hours of power annually. Even so, they still account for a relatively small percentage, around 1%, of total global power consumption. But that’s huge, and it’s growing, especially as data continues to grow exponentially. As a matter of fact, if you look at some of the hyperscalers like Facebook, 90 to 95% of the power they consume is all in their data centers. It’s not PCs, it’s not lighting, it’s not anything else. It’s just those huge data centers. And you mentioned cooling. Obviously, environmental systems just for the buildings are one thing, but liquid cooling for the systems is also getting really, really expensive. We’ve even seen some data centers being built close to power plants just so they have enough power. It used to be easy to look at that and say, okay, we build a building and we’ve got it for 20 to 30 years because we can keep putting new processor generations in there. But that’s running out of steam, isn’t it?
JEFF WITTICH: It is. You hit a lot there. If you look at overall global power consumption, as you mentioned, data centers are about 1%, and the industry has done a pretty good job of keeping that number in check over the last decade or so because Moore’s Law was still able to somewhat keep up and deliver some gains. There were also a lot of advancements in things like data center efficiency. People deployed really innovative ways of cooling data centers that were very environmentally friendly. But we’ve done all the easy things at this point, and left unchecked, without some sort of new technology, that power consumption is going to start to climb.
What’s also really alarming is that power consumption is generally very concentrated in very specific areas. It’s very concentrated in Northern Virginia, in Dublin, Ireland, or in the Bay Area, all places where network traffic flows through. It’s really stressing the power grids in these very specific places.
A really good example is West London. There’s a big housing shortage in London, and there were plans to build out additional multi-unit housing complexes there. When [building] applications started to be received about a year ago, the authorities looked at the available power on the grid and realized that there was actually no excess power left for housing. Data center operators had taken all the excess power. So they put off all the applications for a decade or more. It’s actually having a really, really big impact on people’s everyday lives.
And so, data centers are starting to become more and more unwelcome neighbors in these areas. And as you mentioned, people build these data centers to operate for decades, and if you need to go and overhaul your whole data center for some new exotic cooling solution in order to power CPUs that are increasingly 500 watts or higher, or your big, high-powered GPUs, that’s very, very disruptive. That’s not what was originally intended when the data centers were built. Overhauling a whole data center, especially one that’s running real-life workloads today, is really, really hard to do. So, we need to come up with solutions that can fit into existing spaces and save power.
JIM MCGREGOR: It’s interesting, because over the last couple of years we’ve seen some of the standard CPUs and GPUs used in data centers increasing their TDP, their thermal design power, their upper limits, which is stressing that even more. Now, Ampere has always been focused on cloud computing, and really efficient cloud computing. As a matter of fact, you coined the term “cloud-native processor.” I’ve seen some other companies kind of hop on that bandwagon, but they’re using it differently. So, can you give us a better definition of what Ampere means by the term “cloud-native processor”?
JEFF WITTICH: So, the term cloud native has been around for a decade or so, and people have typically used it to talk about what software is doing. Cloud-native software is software that has adapted to the way the cloud delivers compute. It’s software that is very distributed, that’s able to take advantage of hundreds, or thousands, or even more processors across a very large, distributed data center. It’s very elastic. It’s resilient. It scales really, really well. That’s a concept that’s been practiced in the software space for a while, but it’s not one that’s really made its way into the hardware space.
And that’s what we’ve set out to do at Ampere: build a processor that’s cloud native, a “Cloud Native Processor.” One designed from scratch to deliver compute the way the cloud actually consumes compute. So, compute that is very, very high performance, but also very predictable, giving you the same level of performance all the time, regardless of how many users are on the system. Compute that’s very scalable, allowing you to scale out to those very, very large, distributed workloads. But, in the context of this conversation especially, it also has to be compute that’s very power efficient, because at the end of the day, whether you care about OpEx [operating expense], or environmental goals, or just being able to expand your compute footprint, we’re running into these hard constraints around power. So, we need to think about the way we’re actually delivering that compute.
Unconstrained compute that continually takes higher and higher TDPs at the platform level to support performance gains that aren’t actually efficient, where you get less performance gain than the amount of additional power being provided, just doesn’t work. You know what that means. It means that if you are delivering less performance per watt over time, your overall data center capacity will go down over time, because you’re going to end up hitting your data center power constraints.
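To make that math concrete, here is a minimal sketch of how a fixed rack power budget turns performance per watt into rack capacity. The 15 kW budget and the per-server figures are hypothetical assumptions for illustration, not figures from Ampere:

```python
# Minimal sketch with hypothetical numbers: when per-server power grows
# faster than per-server performance, performance per rack drops, because
# the rack's power budget (not its physical space) caps how many servers
# you can install.

RACK_POWER_BUDGET_W = 15_000  # assumed rack power budget (hypothetical)

def perf_per_rack(server_power_w: int, server_perf: int) -> int:
    """Performance one rack can deliver under a fixed power budget."""
    servers_per_rack = RACK_POWER_BUDGET_W // server_power_w
    return servers_per_rack * server_perf

# Generation N: 300 W per server, 100 units of performance each.
gen_n = perf_per_rack(300, 100)         # 50 servers -> 5,000 units per rack

# Generation N+1: 20% more performance per server, but 67% more power.
gen_n_plus_1 = perf_per_rack(500, 120)  # 30 servers -> 3,600 units per rack

print(gen_n, gen_n_plus_1)  # rack capacity falls despite the "faster" chip
```

In this toy example, the newer, higher-TDP server is faster on its own, yet the rack delivers 28% less total performance, which is exactly the capacity trap Wittich describes.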
So, I think some other people have latched onto this concept of Cloud Native Processors, but what they’re doing is quite a bit different from what we’re doing. We took a clean-sheet-of-paper approach. We built a new processor that was specifically designed for this cloud use case. Now, they [our processors] are general-purpose processors, so they run all the applications that run in the cloud, but architecturally, the processor looks very, very different from the legacy processors that preceded ours, which are largely x86. It’s about taking that fresh approach, building everything from the ground up, and then building processors that don’t just have very high core counts and very high performance, but are also really, really power efficient, so that we can solve the real problem here, which is how to build capacity at scale.
JIM MCGREGOR: Now, one of the trends we’ve seen, especially in the data center, is the move towards what we call workload-specific processing. So, how does that concept of workload-specific processing fit with cloud native?
JEFF WITTICH: What we’ve set out to build is really that workhorse processor. There’s still a need for a general-purpose processor that’s running all the basic code, that’s running your operating system, that’s running all those C programs that you have. It’s running Java. It’s running your web services, databases, even AI inferencing. So, there’s a need for that general-purpose processor, and what we’ve set out to build is that core processor that’s going to sit right at the heart of your data center and run all of those general-purpose tasks, all those general-purpose applications in your data center.
Now, there are some spaces where people have chosen to build domain-specific accelerators. Some workloads do lend themselves to a very specific type of processor, and that might be because the data is structured in a certain way or the instructions are operated on in a specific way. GPUs are a great example. Graphics look very different from basic C code, for instance, and so GPUs are the preferred way to render graphics and encode video. AI training can be another example, where the way AI training models are run means there is a very specific data structure and a very specific type of instruction being run over and over and over again.
If you have a workload that’s very, very specific and has gotten large enough, it can make sense to attach an accelerator to it, and those accelerators can attach to our CPUs very, very easily. So, even in an environment where someone is running accelerators, you may as well pick a Cloud Native Processor to attach them to, because then you end up with less power consumption for all the general-purpose processing, leaving you plenty of capacity for everything else.
JIM MCGREGOR: So, when we think of cloud-native processing or cloud-native processors, combined with those obviously domain-specific accelerators, what type of increases in efficiency should we be seeing with each generation?
JEFF WITTICH: Well, since it focuses really on that compute capacity at scale, you’re going to see big increases in the metrics that matter. Maybe I’ll talk a little bit about what metrics really matter.
We talked about the fact that the cloud is running at scale. It’s not about how much performance you’re getting out of a single core, or even out of a single server. Those are details. There are a lot of ways to create the most optimal solution, and those single-node numbers aren’t actually the metrics that matter most. Those are legacy metrics that had a lot of meaning back in the pre-cloud world, where you weren’t running workloads at scale, where you were stuck running on a two-core or four-core processor. Now you have access to thousands and thousands, if not millions, of cores. So, when we think about performance, we think performance at scale.
Now, you could look at this in many ways, such as the total amount of performance you get at the data center level. We choose to look at performance per rack. It’s a nice, manageable number. It’s way bigger than a server; we might have 40 or 80 servers in a rack. You’ve certainly achieved scale, and it has real-life constraints, because a rack is always the same amount of space and, at least within a specific data center, a rack is always going to consume roughly the same amount of power. So, we like to think of it as performance per rack.
When you look at our Ampere Cloud Native Processors, they have up to 128 cores. So they’re very, very scalable, and they don’t consume very much power. Let’s say you had a web service that needed to service the requests of 1.3 million users at any given time. Servicing all those requests with our processors would consume 2.8 times less power than with legacy x86 processors. You could do it with three times less data center footprint, so three times fewer racks. And we’re delivering 2.5 times more performance per rack. So, we’re talking about big gains here, multiple-X gains from using Ampere Cloud Native Processors versus what you could do with legacy x86 processors from Intel and AMD.
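As a hedged back-of-the-envelope, the multipliers quoted above (2.8x less power, 3x fewer racks, 2.5x more performance per rack) can be applied to a hypothetical baseline. The baseline figures below are illustrative assumptions, not numbers from Ampere:

```python
# Back-of-the-envelope using only the ratios quoted in the interview.
# The legacy x86 baseline figures are hypothetical assumptions.

baseline_racks = 30        # assumed legacy racks for the 1.3M-user service
baseline_power_kw = 450.0  # assumed total power draw of that footprint

cloud_native_racks = baseline_racks / 3          # 3x smaller footprint -> 10 racks
cloud_native_power_kw = baseline_power_kw / 2.8  # 2.8x less power -> ~161 kW
perf_per_rack_gain = 2.5                         # 2.5x more performance per rack

print(f"{cloud_native_racks:.0f} racks at ~{cloud_native_power_kw:.0f} kW "
      f"({perf_per_rack_gain}x performance per rack)")
```

The point is less the absolute numbers than that the savings compound: fewer racks, each drawing less power, each delivering more performance.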
JIM MCGREGOR: So, if I had to combine that reduced number of servers or reduced number of racks, increased performance, and everything else, we’re probably in the 20s, 30s, maybe even hundreds [of performance gains], depending on what you’re running?
JEFF WITTICH: Yeah, there are some big savings here. This is going to allow people to either dramatically reduce the footprint of what they’re currently running today, which means they have a lot more capacity for emerging workloads, or to more rapidly expand for growing compute needs. Another good example: right now, there have been a lot of articles about ChatGPT and how much power it consumes. I think I saw estimates that GPT-3, the model on which ChatGPT is based, took 1,287 megawatt-hours of electricity just to train. So, imagine now, as those models get bigger and bigger, as you get into many hundreds of billions of model parameters, this type of approach would allow you to train those even bigger models without consuming any more data center space, power, and water than the current iterations. We need to find a different solution here, because power consumption is on a trend that could really explode.
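To put that 1,287 megawatt-hour figure in everyday terms, here is a rough comparison; the roughly 10.6 MWh-per-year average U.S. household figure is an assumption for illustration:

```python
# Rough scale check: GPT-3 training energy versus household consumption.
# The 1,287 MWh estimate is from the interview; the average annual U.S.
# household figure (~10.6 MWh) is an assumed round number.

gpt3_training_mwh = 1_287
household_mwh_per_year = 10.6

households_for_a_year = gpt3_training_mwh / household_mwh_per_year
print(f"~{households_for_a_year:.0f} households powered for a full year")  # ~121
```

And that is the energy to train one model once; retraining, larger successors, and inference at scale all add to the bill.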
JIM MCGREGOR: So, if you increase the data set by 100X, you need 100X or more improvement in performance efficiency. And GPT-3 is huge, but NASA just announced a couple of weeks ago that they’re going to be using their data set for training geospatial foundation models, and that data set is going to be 250 petabytes. I’m still struggling to wrap my mind around that, but it’s going to be phenomenal. All of this is leading down a new road for data centers. We really are charting new growth, not just because of the data we’re trying to process, but because we’re rearchitecting, for the most part, the data center. We have to rearchitect it; the old model doesn’t work. So, give me your ten-year view. What changes in the data center in 10 years?
JEFF WITTICH: Well, if you look at the data center level itself, obviously there are a lot of efficiencies already in there. I think we’ll see across-the-board improvements in PUE [power usage effectiveness], the ratio of the total power coming into the facility to the power that’s actually being used for useful compute. I think it will become pretty customary for PUE to be in the 1.1 to 1.2 range. Today, the world-class data centers are in that range, but not all data centers are. So, I think we’ll eke out all the efficiencies we can at the data center level. Of course, I think we’ll see more and more usage of renewable energy in data centers, and hopefully in a way where the data centers themselves are really consuming that renewable energy, versus it just being kind of a shell game with carbon credits. So, I think we’ll see more and more of that.
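For reference, PUE is a simple ratio; a short sketch with hypothetical numbers shows why the 1.1 to 1.2 range is considered world class:

```python
# PUE (power usage effectiveness) = total facility power / IT equipment power.
# A PUE of 1.0 would mean every watt goes to compute; anything above 1.0 is
# overhead (cooling, power conversion, lighting). Numbers are hypothetical.

def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    return total_facility_kw / it_equipment_kw

print(pue(2_000, 1_000))  # 2.0 -> a full watt of overhead per watt of compute
print(pue(1_200, 1_000))  # 1.2 -> 20% overhead
print(pue(1_100, 1_000))  # 1.1 -> world-class territory
```

At a PUE of 1.1, only about 9% of the facility’s power goes to overhead, which is why the remaining gains have to come from the compute itself.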
Hopefully we’ll see Cloud Native Processors deployed pervasively in all of these data centers. That’s going to bring down actual power consumption. We’ll see many-X gains in performance per rack that will help us build out in a really efficient way. And then, if I look at some more extreme spaces, I think we’ll start to see these same types of approaches used for some of the different compute paradigms that are emerging.
Quantum computing has been around for a while, but we may actually be using it in the next decade or so. Maybe we’ll get to a place where we achieve quantum supremacy. There are some problems that quantum computers can uniquely solve. We’ve partnered with Rigetti, for instance, in the quantum computing space, and we’ve created a hybrid classical/quantum computer that brings the best of both worlds. They have a very scalable approach to quantum computing, where they can easily deploy in the cloud, just like we have a really scalable approach to general-purpose cloud computing. We’ve been working together on how we can create this hybrid computer system. The net benefit is a lot of performance for a unique set of problems, handled in an extremely power-efficient way. The same types of approaches we’ve taken on the processor, let’s apply them to other types of compute across the data center.
The architecture of the data center is changing around the flow of data, the evolution of software around cloud infrastructure, and the types of workloads being processed. While there is no single way to slay the dragon, the hardware must adapt to the demands and constraints of cloud computing. Developing hardware around the software leads to the most efficient semiconductor and system solutions, which include cloud-native processors, workload accelerators, and even future technologies like quantum computing. Despite the challenges of advancing computing through Moore’s Law, the level of innovation in the technology industry has never been higher. TIRIAS Research expects processing architectures to continue evolving rapidly throughout the next decade to meet the ever-increasing demand for higher-performance, more efficient computing, with solutions like the cloud-native processors from Ampere.
The author and members of the TIRIAS Research staff do not hold equity positions in any of the companies mentioned. TIRIAS Research tracks and consults for companies throughout the electronics ecosystem from semiconductors to systems and sensors to the cloud. TIRIAS Research has consulted for Ampere and other semiconductor and technology companies providing products and services to data centers.