Chiplet Chat: Q&A With Intel's Tim Wilson, SoC Design Head for 'Meteor Lake' CPUs

At Intel Innovation, a two-day event held in San Jose, Calif., the chip maker officially unveiled its long-anticipated "Meteor Lake," the next-generation architecture for its client computer processors. We explained, in broad strokes, how Meteor Lake chips, which will debut Dec. 14 under the new Intel Core Ultra brand, are structured, and the general design decisions around the new chips. It's the biggest re-imagining of Intel's processors in decades. But our earlier stories are a mere distillation of many hours of briefings and deep dives, presented by Intel in the run-up to the launch, highlighting the design decisions and the underpinnings of the new silicon.

PCMag got a brief audience with Tim Wilson, vice president of the Design Engineering Group and General Manager of SoC Design at Intel. He has been key to the development of Meteor Lake, and we were able to run through some questions and clarifications around the genesis of Meteor Lake. Our conversation is relayed below, edited slightly for brevity and clarity.


PCMAG: Tim, maybe start with a brief introduction, and then we can start to ask some questions?

TIM WILSON: Okay, sure. So, happy to chat with you guys. My name's Tim Wilson. I have led the development of Meteor Lake for the past several years, from inception all the way to product launch here in the next couple of months, Dec. 14, as Pat [Gelsinger, Intel CEO] said this morning. Super excited that we're at that stage, super proud of the product, and happy to talk to you about it, answer any of your questions.

PCMAG: So, let's start with chiplets. The stated advantages of Meteor Lake's chiplet-based design, that we gathered from the Meteor Lake briefings, were these three things: (1) the ability to upgrade a portion of a processor without redesigning a whole monolithic die, (2) the ability to bin or pre-test components before integrating them into a larger die, and (3) the ability to relegate different tiles to different manufacturing processes, or even to different manufacturers. Our question is: Are there any other, less-obvious advantages that Intel will gain from moving to chiplet-based designs in Meteor Lake and future designs?

WILSON: You hit on the big three. We've got a lot of very diverse IP [intellectual property] in our modern client SoCs today. And so, co-optimizing all of them on a single piece of silicon with a single transistor and process-node characterization is starting to put tension on our ability to optimize them all simultaneously. So yeah, one big advantage is being able to partner the IP with the process node that has the characteristics that it values most, right? Call it a "product goodness" optimization ability for us to optimize all components of the SoC together, in a way that we couldn't previously without a tiled architecture.

The other big advantage for us, you alluded to: We fundamentally lowered the barrier to introduce new high-value IP on the latest process node in a way that we couldn't previously with a monolithic design. The third thing you mentioned, the ability to test them independently—that's more on the manufacturing side, less a customer-visible advantage, but certainly from a manufacturing test capability perspective that's an advantage to us. But in terms of how does it translate to goodness that our customers will see? I think it's optimization and our ability to move faster with our latest cool new stuff integrated into the PC.


PCMAG: So there really aren't any other major advantages that you're able or willing to share that, from a designer's point of view, accrue from using this design?

WILSON: I think those are the keys. I'll say from a designer, developer, architect perspective, those are the interesting things. There are a lot more what I'll call boring things around the economics and product cost and that kind of stuff, but in terms of building cool stuff, that's where I see the value proposition being.

PCMAG: Our next question actually touches on one of those things. How does the switch to using this production method affect manufacturing costs and production time? Will adding additional resources require a new interposer?

WILSON: Yeah, how will it impact? I would say there are puts and takes. Each tile is simpler than the overall SoC. That allows us to break down some of the test time, test program, manufacturability components of the SoC as we bring it together. The flip side is that the manufacturing does require our advanced packaging capabilities, Foveros, which I talked about and you guys have seen too, right? That's a more complicated, more advanced packaging technology than we had previously, and that adds a few extra steps to our assembly process. Overall, on net, I view it as a positive. A brand-new product like Meteor Lake, we're bringing everything together at once for the first time. That was a big lift. Now that we have that baseline, I can do that big lift all at once again on my next project generation, or I can do different components and offset that. And we can leverage that in many ways to net benefit, I think, in the end.

PCMAG: Given that all the discussions have been about laptops and mobile, is there anything intrinsic that Meteor Lake might mean for future desktop designs as distinct from mobile?

WILSON: Good question, good question. That's top of mind for a lot of people. I'll just say up front, this architecture will scale top to bottom in our client segment. Desktop all the way down to mobile. With Meteor Lake, we're launching mobile first. And that largely has to do with, if you think back to the keynote that I gave, the four key design principles around which we designed the product; those really go toward the mobile segments first. The most power-efficient SoC we ever built...we want that in mobile. That 2x integrated graphics capability—bringing discrete into an integrated form factor—that goes to mobile. AI applies to both equally, I would say. But because of the value proposition of the product, what we were trying to build, we targeted mobile first. But you will see this architecture scale top to bottom—up to desktop, as well.


PCMAG: You probably can't get into too much detail about this, but with the roadmap indicating Intel 3, Intel 20A, and Intel 18A not that far away, what kind of improvements can we expect from these chip nodes, and are they expected to attack different segments of the market? We're asking that because it sounds almost like you're gonna have a little bit of a process logjam...

WILSON: Five nodes in four years is an aggressive plan for sure.

PCMAG: Right, yep.

WILSON: So your question is, will the different process nodes be targeted at different segments and do we have a logjam? I would say to a first order—without diving into the details of course—there's nothing fundamental about any of these process nodes that leads us to target one segment versus another.

I think you will see—just like we're launching Meteor Lake in mobile first, while in the past we launched a different project in desktop first—that will be very much a design, architecture, and product-definition decision, not really tied to process node. So I wouldn't read anything into the process nodes from which segment we launch first on those process nodes.

Of course, with each one of them, you'll find we'll launch some segment first, because we don't launch our entire roadmap all at once, right? The markets can't consume that. Our customers can't consume it that fast. So you'll continue to see us, depending on what product we prioritize when, and on which segment of our roadmap [we are in], launch desktop first, or launch mobile first. It might be server first. But that really is just about the roadmap that we're building, not necessarily, "This process is tuned for this particular segment."

PCMAG: Fair enough.

WILSON: The other thing I would add: As you know, a couple of those nodes are foundry nodes with IDM 2.0, right? Intel 3, Intel 18A. So those aren't just Intel nodes, those are broad to...we're opening up to the world to come build on these foundry nodes. All the more reason why certainly we're not targeting them at any one segment. They'll be great nodes for the foundry customers, as well.

PCMAG: Just to follow up on that, what about, how does that apply to the Arrow Lake, Lunar Lake, Panther Lake that was mentioned today? Is there gonna be segmentation with those architectures?

WILSON: So those are all client products, right? That slide that Pat was showing was our client roadmap. Then we have a different server roadmap. Beyond that, I'll just say: I'll save some suspense for our launch events in the future about exactly what we'll target first and where in those.

PCMAG: So a little more about the architecture of Meteor Lake—a little more under the microscope here. In the new tile design, are there other tiles apart from the Compute Tile that can be binned to lower levels, or are the other tiles treated more or less as pass/fail? Is that actually how it works? Are we misinterpreting how the binning and QA process works?

WILSON: Good question. There are multiple ways to answer it. So...binning is not fundamental to any one of the tiles, I would say. It's more fundamental to the IP. The binning affinity follows the IP, not the tile. Maybe I should say it that way.

Of course, we haven't disclosed all the details about our SKU stack, but you'll see we have multiple configurations of Meteor Lake in the various segments as we launch it—and in future segments coming up, different configurations as well. So you'll certainly see binning configurations...The IA [Intel architecture] cores are the things that everybody bins most, and nothing's changing there. We'll continue to leverage that. But you've seen us release different graphics stacks in different segments in the past. Potentially media or imaging. All of the other blocks have segments that value them and segments that, maybe, don't. And we're always looking at that. But there's nothing fundamental to any of the tiles that says we can or cannot bin based on that tile. That's more a function of the content and our choices around whether it does or does not make sense to bin the IP or not.

"Intel 3, Intel 18A. So those aren't just Intel nodes...we're opening up to the world to come build on these foundry nodes."

PCMAG: Fair enough. So, on some of the individual tiles that are on Meteor Lake, the Compute and SoC Tiles, we had a few questions. What are the types of real-world applications that would be handled by the new low-power cores on the Low Power Island on the SoC Tile, versus the Efficient cores on the Compute Tile? Does that make sense?

WILSON: Yeah, good question. The way to think about it—maybe I'll start generically, and then I'll go into specifics on what we're hoping to land where—what we have landed where—on Meteor Lake. You would've seen some of this; we actually had a pretty cool demo at ITT Malaysia. We might have it out here too.

So the P-cores, the E-cores, think of those as, you know, intense computing cores designed for intense computing applications. P-cores, you know: unquestioned, single-threaded performance leadership. Call it lightly threaded workloads that just want the best single-threaded machine, that's our P-core, right?


We've got a lot of workloads now that are multi-core, many-core types of workloads. Certainly the data center migrating towards the PC. The E-cores in the Compute Tile, and the E-cores when we added them on "Alder Lake" with our first instantiation of hybrid [architecture], brought a big performance boost to those many-threaded workloads, right? One of our pillars on Meteor Lake was to build the most power-efficient SoC ever. And so, in that context, we stepped back and said, "Hey, there's a whole class of workloads today that need a certain amount of IA compute, but nowhere close to the intense computing that our E-cores or P-cores can provide." And that was what birthed the low-power E-cores in the SoC tile. We said, "Hey, we can achieve much better power efficiency for a whole class of workloads that need some compute, but just sufficient compute—and then, what is the lowest possible power in that compute?"

Take the common video-playback streaming scenario. That entire workload is contained and lands on the SoC E-cores in Meteor Lake. That workload is very media-intensive. That media block is in the SoC tile, and it's also memory-intensive—lots of reads and writes to memory. Display as well. Those are all in the SoC tile on Meteor Lake. Having those E-cores in the SoC tile allows us to leave the Compute Tile entirely powered off in that workload. Therefore, we're not spending the power to wake it up, power it down, wake it up, power it down, with a high-performance ring fabric and the LLC [last-level cache] and the cores and everything like that.

These E-cores are similar to—and architecturally, in fact, they're the same architectural E-cores as—what's in the Compute Tile. But we've optimized them, designed them, converged them for prioritizing low power at low voltage as opposed to compute for MT [multi-threading]. And that allows us to run them at a much more efficient point. Video playback, streaming is the one everyone understands and knows. But a whole class of workloads of that type will run on those SoC cores.

PCMAG: That leads into one of our other questions: How far can you push low-power video playback? Are all the resolutions and bit rates and refresh rates that are common today supported? Or is there, say, a break point at which you need to shift over to the Compute Tile?

WILSON: There is. So you'll get to test it yourself and see. I think what we demoed was 1080p. You wanna go push...if you've hooked up an 8K display and you've got 24-bit color, my guess is that workload's gonna migrate to something like the E-cores on the Compute Tile there. There's certainly a point at which the workload will migrate to the core to give it the computing power necessary. But the standard, "I've got my laptop, I'm talking to you with a 1080p, or even a 4K, display," I would expect that to run on the SoC cores.

PCMAG: Now, would there ever be a situation where the PC might spin up the P-cores, the E-cores, and the low-power E-cores for an "All engines, full throttle!" situation?

WILSON: It certainly can. I would say, yes, that's absolutely a valid workload. These are all IA cores. They show up to the OS as IA cores. You open up your Task Manager, you can see them all. And in the demo we showed with video playback, we actually did open up the Windows Task Manager and show that the two SoC cores were the only ones that had any activity on them, right? So from a software perspective, they look like a core, just like anything else.

Now that said, I would say, in your multi-threaded, many-threaded workloads, you'll do that a lot on the Compute die. Adding these two cores as your last two cores, which are optimized for low power and not for performance, you're not going to see a significant delta between the first, say, 20 threads you have and the last two threads here that aren't as compute-intensive as the first ones you powered up there. So don't view it as, "Hey, this is how I get another boost to MT performance!" That's not the purpose of these cores, right? We built the Compute Tile to deliver superior MT performance. These are really purpose-built. So they certainly can operate in parallel, but you're not going to see a fundamental performance difference if they are.
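
You can approximate that Task Manager demo yourself. Below is a minimal sketch of ours, not Intel's demo code, that polls per-core load using Python's psutil package; which logical CPU indices map to the SoC-tile LP E-cores is an assumption that depends on how the OS enumerates the cores.

```python
# A rough stand-in for the Task Manager demo described above: poll per-core
# utilization and print which logical CPUs are busy. Requires `pip install psutil`.
import psutil

for _ in range(5):
    # One-second sampling window per iteration; one percentage per logical CPU.
    loads = psutil.cpu_percent(interval=1.0, percpu=True)
    # Which indices correspond to the SoC-tile LP E-cores is OS-dependent; on
    # many systems they enumerate last (our assumption, not an Intel statement).
    busy = [f"cpu{i}: {pct:.0f}%" for i, pct in enumerate(loads) if pct > 10]
    print("active:", ", ".join(busy) if busy else "all cores mostly idle")
```

During lightweight video playback you would expect only a couple of entries to show sustained activity, mirroring what Intel showed on stage.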

PCMAG: Another question on the E-cores—or the low-power E-cores, I should say. What other changes were made to Thread Director apart from accommodating the new cores this time around? Does it allocate tasks to different cores more granularly, or are there any sort of high-level Thread Director enhancements that you could talk about, given you have this new class of cores?

WILSON: We definitely made several enhancements to Thread Director. I can give you the high-level overview, and then if you wanna go into detail, I'll probably have to pull in Rajshree [Rajshree Chabukswar, an Intel Fellow]. She had one of the presentations at ITT Malaysia and you could talk with her. She's the expert.

But fundamentally...yeah. Previously, you had a P-core and an E-core, so the OS had two choices. Well, we've now added a third choice for the OS, right? And so we absolutely need to provide scheduling hints, guidance to the OS to classify types of workloads, to say, hey, here's your class of workload that should land on the SoC E-cores. A different type should land on your Compute Tile E-cores or Compute Tile P-cores. The purpose of Thread Director is to look at the profile of the workload and help the OS ensure that it gets scheduled on the core that's going to give you the optimal power and performance trade-offs.

A lot of algorithms go into doing that, various trade-offs around at what point do we migrate a workload from the SoC E-cores to the Compute Tile E-cores or even to the P-core, and vice versa. You want to go into the gory details of those workloads, that's where I'll pull in my good friend Rajshree and have her give you all the detail around it. I won't pretend to know all of that.

But at the heart of your question: Yeah, there are significant enhancements to Thread Director to help the OS take full advantage of all the capabilities of these cores.
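
For a sense of what those hints look like from the software side, here is a minimal sketch (ours, not Intel's) of one documented mechanism on Windows 11: opting a process into EcoQoS via the power-throttling API, an application-level efficiency hint the OS scheduler can weigh alongside hardware feedback such as Thread Director when picking a class of core.

```python
# Minimal sketch: opt the current process into EcoQoS on Windows 11 via the
# documented power-throttling API. The scheduler may then favor efficient
# cores (such as the SoC-tile LP E-cores) for this process.
import ctypes
from ctypes import wintypes

PROCESS_POWER_THROTTLING_CURRENT_VERSION = 1
PROCESS_POWER_THROTTLING_EXECUTION_SPEED = 0x1
ProcessPowerThrottling = 4  # PROCESS_INFORMATION_CLASS value

class PROCESS_POWER_THROTTLING_STATE(ctypes.Structure):
    _fields_ = [("Version", wintypes.ULONG),
                ("ControlMask", wintypes.ULONG),
                ("StateMask", wintypes.ULONG)]

kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
kernel32.GetCurrentProcess.restype = wintypes.HANDLE
kernel32.SetProcessInformation.argtypes = [
    wintypes.HANDLE, ctypes.c_int, ctypes.c_void_p, wintypes.DWORD]
kernel32.SetProcessInformation.restype = wintypes.BOOL

def enable_eco_qos() -> None:
    """Tell the OS this process prefers power efficiency over speed."""
    state = PROCESS_POWER_THROTTLING_STATE(
        Version=PROCESS_POWER_THROTTLING_CURRENT_VERSION,
        ControlMask=PROCESS_POWER_THROTTLING_EXECUTION_SPEED,
        StateMask=PROCESS_POWER_THROTTLING_EXECUTION_SPEED)  # throttle on = EcoQoS
    if not kernel32.SetProcessInformation(
            kernel32.GetCurrentProcess(), ProcessPowerThrottling,
            ctypes.byref(state), ctypes.sizeof(state)):
        raise ctypes.WinError(ctypes.get_last_error())

if __name__ == "__main__":
    enable_eco_qos()  # background work here is now a candidate for E-cores
```

Thread Director itself operates below this level, feeding the OS per-core telemetry; EcoQoS is simply the knob an application can turn on the Windows side.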

PCMAG: What about the difference between Win 10 and Win 11 under this? Is there a significant difference, if you were to be running Meteor Lake under one or the other? At least with initial laptops, you're probably going to see it on Windows 11 only, right? But eventually, you're going to have some folks...

WILSON: You know what? That's a really good question, and I will have to plead ignorance on that, in the sense that we...obviously, Meteor Lake will run smoothly on Win 10, Win 11. We fully validated both of those. In terms of what are the nuances and differences you see between them? Actually, that's a good question. Let's get back to you on that one.

PCMAG: So the last question we had on the Compute and SoC Tiles: Could you speak briefly to any changes that were made, if any, to the memory controller on the SoC?

WILSON: Sure. So with the memory controller, the big change is basically migrating to the latest memory technologies. We support both DDR5 and LP5 [LPDDR5/X] on Meteor Lake. We'll have systems with both. We support up to 7467 [megatransfers per second] on LP5 and support next-generation speeds on DDR5, as well. Those are the primary changes. We made some power-efficiency improvements, as well. Some other improvements, but primarily it was support for the latest-generation memory technologies coming on Meteor Lake.
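
For context, here's the back-of-the-envelope math on what that LP5 speed grade implies. The 128-bit total bus width is our assumption for a typical client configuration, not a figure from the interview.

```python
# Peak theoretical bandwidth for LPDDR5X-7467. The 128-bit (16-byte) total
# bus width is an assumed typical client configuration, not an Intel spec
# quoted in this interview.
transfers_per_second = 7_467_000_000   # 7467 MT/s
bytes_per_transfer = 128 // 8          # 128-bit bus -> 16 bytes per transfer
peak_bytes = transfers_per_second * bytes_per_transfer
print(f"~{peak_bytes / 1e9:.0f} GB/s peak")  # ~119 GB/s
```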

PCMAG: So, let's move to the NPU. Quick question: Why was the NPU put on the SoC tile as opposed to one of the other tiles?

WILSON: Good question. That really just boiled down to a design decision, an architecture decision. The key point with all of them—if you follow through the fundamental architecture changes we made to what we call our "next-generation uncore," right, the SoC chassis—is that they all have their own independent attach points on the fabric and access to full system memory bandwidth. Beyond that, the question became just design decisions about which IP go on which tile. This being the first instantiation of the NPU in our client PCs, and the focus being power-efficient capabilities in that NPU, it made sense to put it in the SoC tile.

PCMAG: So nothing like bandwidth or being in proximity to other parts of the die or anything of that sort?

WILSON: No. Again, I would say, yeah, you now have a true AI PC, what I would call true XPU, right? I have three favorite children now: the CPU, the GPU, and the NPU. They all value the same things: low latency to memory, high bandwidth to memory, to differing degrees. But all of them are performance agents. We optimized all of them with that in mind.

PCMAG: So to consumers—speaking to most of our readers—what type of applications or workflows do you think would most benefit, near-term, from the presence of the NPU in their everyday computing?

WILSON: Good question. A couple of places in particular we focused on the experience with Meteor Lake. One is collaboration. The other is content creation. Collaboration, one of the examples that I've described to some folks in the past: Take [Microsoft] Teams, for instance. With the pandemic, it really did fundamentally usher in a new era of online collaboration. Like how we're talking here, right?

"I have three favorite children now: the CPU, the GPU, and the NPU. They all value the same things: low latency to memory, high bandwidth to memory, to differing degrees."

PCMAG: Sure.

WILSON: You guys probably go through like I do—pretty much every workday—I use Teams multiple times a day. And it's fantastic, but it's not ideal. In fact, we were just moving the laptop around here to show people things, and we have this fake background on here. If you've been on Zoom and you've ever had to take an object and try to get it to show up to people, we've all seen someone do the chicken dance to try to get it to show and not fade into the background, right? That's a perfect example. A perfect use case for AI: to take and enhance, to better detect objects, or better, crisper resolution of your fake background in Hawaii or wherever you want to pretend you're sitting when you're dialing into a meeting.

Another one is, you know, Michael, we're looking just at you here [in person], eye-contact correction. [One of our reporters was in the room with Tim Wilson, the others were on a Teams call.] The other common example is you've got is multiple screens, and you're presenting from this screen, but your camera's on this other screen. You're looking at your material. The ability to actually do that translation, follow you around in the [field of view]. There are all sorts of experiences like that in the collaboration space that the NPU, the AI capabilities, and the PC I think are very near-term going to translate from a capability perspective. And then give you that capability at a power-efficiency point where you still have more than five minutes of battery life when you're trying to do AI stuff, right?

"We've all seen someone do the chicken dance to try to get [an object] to show [in Teams] and not fade into the background, right?"

PCMAG: Right, gotcha.

WILSON: That's one key example. The other one is the content-creation space. Pat showed a fun video where creators can take an audio clip or a sound and say, "Hey, I want to translate this to a different sound!" Or, say I have an image I want to create. I want to create an image of Hawaii, and I want it to have palm trees, and I want it to have Ka'anapali beach, a picture of Ka'anapali beach, and I don't want it to have any ships on the horizon. The AI can create that, right? Whether it's audio, whether it's video, whether it's image. Creators are creative by nature. They're finding all kinds of stuff to do. Those, in my mind, are the two very near-term examples I'd give off the top of my head.

The other thing I would say: It is still a little bit of the Wild West. It seems like every week, every couple of weeks, there's a new example, a new use case, a new, "Hey, look what somebody just did with ChatGPT or with AI!" I think we're at the very beginning of this kind of adventure.

PCMAG: On the same note...what can end users expect regarding the quality and the accuracy or the speed of the AI tools enabled by the NPU, considering that most of the popular LLMs that folks are used to using, at least in these very early days of AI, were initially made possible by cloud computing and server hardware?

WILSON: You're getting into some of the architectural details in AI that are below my depth—we can always bring in experts—but I don't think there's a fundamental trade-off in the accuracy or capability, as much as the computing capability. Yes, a lot of stuff happens in the cloud. Personally, I'm seeing a push and a pull. A push from cloud providers that are now starting to say, "Holy cow! We have a billion people all trying to do this. There's not enough data centers in the world for us to handle this. We got to push some of this to the edge."


And then a little bit of—well, more than a little—pull from the edge to say, "Hey! I don't wanna have to access the cloud every time I want to be able to use my AI!" Maybe I'm not even connected to Wi-Fi, maybe I'm working offline. I don't want to actually have to be. Most of the time I'm online, but occasionally I'm on a plane, I'm somewhere where I don't have connectivity. I don't want to be bound by that to be able to use AI.

I don't think the push to the edge or to the cloud is going to be a function of accuracy. An LLM is a function of the algorithm regardless of where it's running. It's going to be more a function of what runs efficiently and well and better at the edge, versus where do I need to do computing in the cloud. That's why I think this is going to evolve.
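
One concrete path for that edge-side pull: Intel's OpenVINO toolkit exposes the NPU as an inference device. The sketch below is our illustration, not Intel sample code; it assumes an OpenVINO release with NPU support and a hypothetical model file, and it falls back to the CPU when no NPU is present.

```python
# Minimal sketch: run a model locally, preferring the NPU when available.
# "model.xml" is a hypothetical OpenVINO IR file; requires `pip install openvino`.
import numpy as np
import openvino as ov

core = ov.Core()
device = "NPU" if "NPU" in core.available_devices else "CPU"  # graceful fallback

model = core.read_model("model.xml")          # hypothetical model path
compiled = core.compile_model(model, device)

# Feed a dummy tensor shaped to the model's first input; a real app would
# pass camera frames, audio, or token IDs instead.
shape = list(compiled.inputs[0].shape)
x = np.random.rand(*shape).astype(np.float32)
y = compiled(x)[compiled.outputs[0]]
print(f"Inference ran on {device}; output shape: {y.shape}")
```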

"I'm seeing a push and a pull [on AI]. A push from cloud providers that are now starting to say, 'Holy cow! We have a billion people all trying to do this. There's not enough data centers in the world for us to handle this.'"

PCMAG: With the NPU, I think the NPU is separated. It has like two compute engines, is that correct?

WILSON: We've got two tiles. You got two tiles, yeah.

PCMAG: Will that be true for all versions of Meteor Lake, of the Meteor Lake family?

WILSON: So, we'll leave some suspense for when we launch, but I'll say AI will be enabled on—if you buy Meteor Lake, you can trust that you've got AI enabled on Meteor Lake.

PCMAG: The last question: How would you distinguish between what Intel's doing with its NPU and what, say, Apple is doing with its Neural Engine, or AMD with its Ryzen AI? Are they all sort of the same thing or is there a key distinction you'd like to highlight?

WILSON: For a lot of reasons, I'll avoid speculating on what I think our competitors are doing with their AI engines. I'll say simply that our focus with the NPU and AI and Meteor Lake is really around our observations working with the PC ecosystem, working with Microsoft and our software partners on making sure that the AI capabilities and experiences that they're building into their apps will run well on Meteor Lake on Intel PCs. And I would imagine the experiences across the entire ecosystem have similarities and then some differences, but speculating on exactly which direction other people are pushing—actually, you guys probably do that better than me.

PCMAG: Fair enough. So I think we are over time. Tim, thank you so much for your time. Very good conversation.

WILSON: And my pleasure. It's great to chat with you guys.
