Comment Nvidia is the uncontested champion of AI infrastructure — at least in the datacenter. In the emerging field of AI PCs, things aren’t so clear cut.
In early 2024, it became plain that, for better or worse, the future of Windows would be imbued with AI-augmented features and experiences. Headline features included live captions and translation, image generation in MS Paint, and, eventually, the somewhat dubious Recall feature that captures periodic screenshots and uses them to keep track of past activity.
For the moment, these new features are exclusive to so-called Copilot+ PCs, but in order to qualify for that designation, the computers have to meet Microsoft’s minimum performance targets.
According to the Windows titan’s documentation, Copilot+ PCs require a neural processing unit (NPU) capable of 40 or more TOPS, or 40-plus trillion INT8 AI operations per second, along with at least 16GB of RAM and 256GB of storage. When this all launched, only Qualcomm had a processor capable of meeting Redmond’s NPU requirement, so only PCs built around its silicon qualified as Copilot+ machines and could run the aforementioned AI-augmented features.
Since then, Qualcomm’s qualifying Arm-compatible X-series chips have been joined on the Copilot+ compliance list by Intel’s Lunar Lake and AMD’s Strix Point and Strix Halo processor families.
Yet, somehow, Nvidia’s $2,000 RTX 5090, announced at CES 2025 this month and boasting more than 3.3 petaFLOPS of AI compute (at FP4, mind you), still isn’t good enough for Redmond. No matter how many FLOPS or TOPS your GPU can muster, it only matters to Microsoft if it’s an NPU churning them out — for now anyway.
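For a rough sense of the gulf, here’s a back-of-the-envelope comparison in Python. Treating Nvidia’s FP4 FLOPS and Microsoft’s INT8 TOPS as interchangeable “AI ops” is a simplification (the precision and sparsity assumptions behind the two figures differ), so read it as an order-of-magnitude sketch rather than a fair benchmark.

```python
# Naive comparison: FP4 FLOPS and INT8 TOPS measure different things,
# so this is order-of-magnitude only.
rtx_5090_pflops_fp4 = 3.3    # Nvidia's CES 2025 figure, in petaFLOPS
copilot_floor_tops = 40      # Microsoft's Copilot+ NPU minimum, INT8

rtx_5090_tops = rtx_5090_pflops_fp4 * 1_000  # 1 petaFLOP/s = 1,000 TOPS
print(f"{rtx_5090_tops / copilot_floor_tops:.0f}x the Copilot+ floor")  # ~82x
```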
Nvidia hasn’t been slacking on the AI PC
Much of the marketing hype around AI PCs has revolved around Microsoft’s Copilot+ spec, and understandably so: nearly every PC sold today runs Windows. That dominance of the PC software ecosystem makes Microsoft’s obsession with NPUs hard to ignore, but that doesn’t mean Nvidia has been resting on its laurels, content with lording it over the datacenter, workstation graphics, and discrete gaming GPUs.
In fact, Nvidia has been working to bring AI features to the PC for years, Jesse Clayton, who leads product marketing for Windows AI at Nvidia, told The Register.
“We kind of started the movement with AI on the PC back in 2018 when we launched the first GeForce GPUs and Nvidia GPUs with dedicated AI hardware — our tensor cores,” Clayton said. “Along with that, we announced the first widely deployed PC AI, which was DLSS, which is used in games to accelerate frame rates by using AI to generate pixels and now generating frames for the games.”
Since then, the GPU giant has rolled out the RTX AI Toolkit, a suite of tools and software for optimizing and deploying genAI models on Windows PCs, brought Nvidia Inference Microservices (NIMs) to PCs, and published a number of blueprints for things like state-of-the-art image generation and converting PDFs to podcasts.
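NIMs are containerized inference servers that expose an OpenAI-compatible HTTP API, so calling one locally looks much like calling any hosted LLM endpoint. Here’s a minimal sketch; the port and model name are illustrative assumptions, not guaranteed defaults.

```python
import requests

# Hypothetical local NIM endpoint: NIM containers expose an
# OpenAI-compatible API, but the port and model name below are
# assumptions for illustration -- check your container's docs.
resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "meta/llama-3.1-8b-instruct",  # placeholder model id
        "messages": [{"role": "user", "content": "Summarize DLSS in one line."}],
        "max_tokens": 64,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```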
“Our strategy is where we can deliver interesting and differentiated experiences, either as a gamer because it enhances your gameplay, or as a creator because it saves you time and reduces the repetitive, tedious work,” Clayton explained.
And, while some of these experiences are aimed directly at end users — ChatRTX and RTX Voice, for example — many of Nvidia’s more recent software launches have been aimed at the developer community.
Competition or opportunity
Say what you will about Copilot+’s actual value: Microsoft has successfully forced chipset designers to offer some form of NPU that satisfies the Windows giant, while also setting a new minimum standard for machine-learning performance.
Considering Windows’ market share and Microsoft’s ongoing efforts to shoehorn AI into every corner of its software, it’s only a matter of time before NPUs trickle down to even the lowliest of budget builds.
What’s more, adoption of frameworks such as Microsoft’s DirectML and ONNX Runtime has helped to simplify application development and allow code to run across a diverse set of hardware with minimal retooling.
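In practice, that portability comes down to swapping execution providers. The sketch below uses ONNX Runtime’s DmlExecutionProvider, which requires the onnxruntime-directml package on Windows; the model file, input name, and shape are placeholders.

```python
import numpy as np
import onnxruntime as ort

# The same ONNX model can target different silicon just by reordering
# execution providers; ONNX Runtime falls back down the list.
# "model.onnx" and the input name/shape are placeholders.
session = ort.InferenceSession(
    "model.onnx",
    providers=["DmlExecutionProvider", "CPUExecutionProvider"],
)
dummy = np.zeros((1, 3, 224, 224), dtype=np.float32)
outputs = session.run(None, {"input": dummy})
print(outputs[0].shape)
```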
This poses a potential problem for Nvidia. The Silicon Valley goliath may dominate the discrete graphics processor market, surrounded by its CUDA moat, yet its GPUs are only found in about 18 percent of PCs sold, with the vast majority of systems using integrated graphics from Intel, AMD, or others.
The case can be made that, before too long, NPUs will become a much larger target for developers building AI apps. And while Nvidia won’t necessarily be left out of the conversation as its accelerators also support many of the more popular software frameworks, at least some of its competitive edge revolves around convincing developers to use its libraries and microservices, which promise easier integration and higher performance and efficiency.
Ultimately, Clayton says, developers will have to decide whether they want to bring their app to market quickly using something like NIMs, or support the largest possible install base.
But, while Nvidia may face competition from NPUs eventually — AI PCs are still a pretty niche market — it isn’t necessarily all bad news. Even if models don’t end up running on Nvidia’s PC hardware, it’s highly likely they were trained on its GPUs.
Even then, Clayton makes the case that NPUs won’t be appropriate for every workload. Forty TOPS is a decent amount of compute, but, as we mentioned earlier, it pales in comparison to the performance of high-end graphics silicon.
“NPUs are going to be where you can run your lightweight AI workloads, and they’re going to be really power efficient,” he said. “A GPU is where you run your more demanding AI use cases, and that’s where we’ve been pushing and focusing our efforts.”
“For stuff that simply doesn’t fit on a PC, you run those on GPUs in the cloud, where you have effectively unlimited performance,” Clayton added.
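To make that three-tier split concrete, here’s a toy routing heuristic in Python. It is not Nvidia’s or Microsoft’s actual scheduling logic; the parameter-count thresholds are invented purely to illustrate Clayton’s lightweight-NPU, heavy-GPU, doesn’t-fit-cloud framing.

```python
# Toy illustration of the tiering Clayton describes -- not any vendor's
# real scheduler. The thresholds below are invented for the sketch.
def pick_backend(model_params_billions: float, npu_tops: int = 40) -> str:
    if model_params_billions <= 3 and npu_tops >= 40:
        return "npu"    # lightweight, power-efficient local workloads
    if model_params_billions <= 70:
        return "gpu"    # demanding local inference on discrete graphics
    return "cloud"      # doesn't fit on the PC at all

for size in (1.5, 8, 400):
    print(size, "->", pick_backend(size))  # npu, gpu, cloud
```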
GPUs may get some Copilot+ love after all
There’s already some evidence to suggest that Microsoft may extend some Copilot+ functionality to GPUs to support more computationally challenging workloads in the future.
Microsoft didn’t address our questions regarding its plans to harness GPUs. However, in an announcement from June, Nvidia said it was working with Microsoft to add GPU acceleration for small language models (SLMs) via the Windows Copilot Runtime.
The tech was supposed to materialize by the end of 2024, though Microsoft’s own docs — last updated December 5 — make no mention of GPUs and specifically cite NPUs as a requirement for its as-yet unavailable Phi Silica project for SLMs.
Clayton declined to provide any updates on the collaboration, saying that “ultimately, it’s Microsoft’s decision for where they’re going to run which workloads.”
Whether and when Microsoft chooses to embrace GPUs for local AI may ultimately come down to hardware availability. At the time of writing, the number of NPU-toting Copilot+ PCs with dedicated graphics is rather small.
On the desktop, the situation is even trickier. Desktop chips with NPUs do exist, but none of them — at least to our knowledge — meet Microsoft’s 40 TOPS performance requirement. We don’t expect it’ll be long before beefier NPUs make their way into desktop silicon. All it would take is Intel or AMD finding a way to cram the NPUs from their mobile chips into a desktop form factor. ®