# GTC Taipei: NVIDIA Turned The GPU Keynote Into An AI Factory Rollout

**Plutonous** | June 5, 2026 | 14 min read



Tags: NVIDIA, GTC Taipei, Vera Rubin, RTX Spark, Nemotron, AI Factories, Physical AI, TSMC

---

**TL;DR:** NVIDIA's GTC Taipei at COMPUTEX 2026 was not just another chip event. It was the company's clearest attempt yet to turn AI into an industrial supply chain: **60+ sessions**; Vera Rubin production spanning **150 Taiwan partners**, **350+ factories**, and **30 countries**; RTX Spark Windows PCs with up to **1 petaflop** of AI compute and **128GB** of unified memory; and Nemotron 3 Ultra, a **550B-parameter** open model for long-running agents.<sup><a href="#source-1">[1]</a></sup><sup><a href="#source-2">[2]</a></sup><sup><a href="#source-3">[3]</a></sup><sup><a href="#source-13">[13]</a></sup> The real story isn't the keynote theatrics. It is NVIDIA moving from selling accelerators to owning the infrastructure grammar for agents, fabs, deskside systems, robots, cars, and hospitals.

NVIDIA used Taipei because Taipei was the point. The company did not merely announce products near the supply chain. It staged the intelligence-era thesis inside the geography that makes modern compute manufacturable. GTC Taipei ran through June 4 at the Taipei International Convention Center, attached to COMPUTEX, with sessions, workshops, demos, and a keynote built around AI factories, scaling infrastructure, agentic AI, and physical AI.<sup><a href="#source-1">[1]</a></sup>

The conventional read is simple: NVIDIA showed more hardware. That read is too small. Vera Rubin, RTX Spark, DGX Station for Windows, Nemotron 3 Ultra, TSMC fab AI, Cosmos 3, Isaac GR00T, Alpamayo, DRIVE Hyperion, and Foxconn healthcare robots are not isolated announcements. They are pieces of the same strategic move: make every major AI workload legible as an NVIDIA platform problem.

> **Why This Matters Now**
>
> AI demand is shifting from chat sessions to continuous agents, long-context inference, simulation loops, physical robots, and enterprise workflows. NVIDIA is arguing that the scarce asset is no longer only a GPU. It is the integrated factory that turns power, memory, networking, manufacturing, runtime security, and model tooling into tokens.


## The Real Story: NVIDIA Is Selling The Factory

Let's be clear: NVIDIA still sells GPUs. But GTC Taipei showed that the company does not want the market to think in GPU units anymore. It wants buyers to think in AI factories, personal AI computers, secure agent workspaces, synthetic data loops, and physical deployment pipelines.

That is a more defensible business than "we have the fastest accelerator this cycle." Accelerator advantages compress. Custom silicon improves. Cloud buyers negotiate. Hyperscalers test alternative stacks. But an AI factory architecture, if it becomes the default, moves the fight from chip price to system economics.

Here is the genius: NVIDIA is reframing the buyer's spreadsheet. The old question was, "How many GPUs can I buy for this budget?" The new question is, "How many profitable tokens can I produce per watt, per rack, per facility, per model workflow?" Once the unit of accounting becomes tokens, the product becomes the whole factory.
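To make the spreadsheet shift concrete, here is a minimal sketch of token-based accounting for a single rack. Every input below (throughput, power draw, amortized hardware cost, electricity price) is a hypothetical placeholder rather than an NVIDIA figure, and the model deliberately ignores cooling overhead, utilization, and staffing.

```python
from dataclasses import dataclass


@dataclass
class RackProfile:
    """Hypothetical inputs for one rack; none of these are vendor figures."""
    tokens_per_second: float   # sustained output throughput
    power_kw: float            # average electrical draw
    hourly_capex_usd: float    # amortized hardware cost per hour
    usd_per_kwh: float         # electricity price


def tokens_per_joule(rack: RackProfile) -> float:
    # 1 kW is 1,000 joules per second, so this is tokens per unit of energy.
    return rack.tokens_per_second / (rack.power_kw * 1_000)


def usd_per_million_tokens(rack: RackProfile) -> float:
    tokens_per_hour = rack.tokens_per_second * 3_600
    hourly_cost = rack.hourly_capex_usd + rack.power_kw * rack.usd_per_kwh
    return hourly_cost / tokens_per_hour * 1_000_000


# Illustrative values only.
rack = RackProfile(
    tokens_per_second=500_000,
    power_kw=100,
    hourly_capex_usd=400,
    usd_per_kwh=0.10,
)
print(f"{tokens_per_joule(rack):.1f} tokens per joule")
print(f"${usd_per_million_tokens(rack):.3f} per million tokens")
```

With these placeholder inputs, hardware amortization dominates the hourly cost and electricity is a small remainder. Real ratios will differ, but the structure shows why a buyer counting tokens ends up evaluating the whole system rather than the chip alone.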

> "GTC Taipei was not a victory lap for GPUs. It was a declaration that compute is now an industrial production system."


The uncomfortable truth for competitors is that NVIDIA's moat is no longer just CUDA. CUDA matters, but the bigger lock-in is operational. If the buyer has NVIDIA reference designs, NVIDIA networking, NVIDIA DPUs, NVIDIA agent runtimes, NVIDIA security primitives, NVIDIA simulation tooling, NVIDIA physical AI models, and NVIDIA supply-chain partners, switching the GPU becomes a much larger organizational problem.

That is why Taipei mattered. NVIDIA's keynote was as much about manufacturing credibility as compute ambition. It put TSMC, Foxconn, ASUS, Pegatron, Quanta, Wistron, Wiwynn, and the broader Taiwan ecosystem inside the story. The event's subtext was direct: intelligence is not just trained, it is manufactured.

## Rack Scale: Vera Rubin Makes Tokens The Unit Of Accounting

Vera Rubin was the cleanest example of the new NVIDIA message. The headline was that the platform is ramping into full production to power agentic AI factories worldwide.<sup><a href="#source-2">[2]</a></sup> The strategic detail is that NVIDIA described Vera Rubin as a POD-scale foundation, not as a single chip.

The platform ties together Vera Rubin NVL72 systems, Vera CPU, storage, networking, DPUs, security, and Spectrum-X Ethernet into a five-rack AI supercomputer for agentic workloads. NVIDIA claims **10x agent throughput at scale** compared with Grace Blackwell, which is exactly the kind of metric the company wants the market to use.<sup><a href="#source-2">[2]</a></sup>

Old AI infrastructure was sold as performance. The new pitch is throughput, uptime, deployment speed, power efficiency, and token cost. Spectrum-X Ethernet Photonics makes that explicit: NVIDIA says the co-packaged-optics switch uses **200Gb/s SerDes** and delivers **5x better power efficiency**, **5x longer AI uptime**, and **1.3x faster time to deployment** than networks using traditional transceivers.<sup><a href="#source-2">[2]</a></sup>


What's often overlooked is the supply-chain scale. NVIDIA said the Vera Rubin ramp involves hundreds of ecosystem partners, **150 in Taiwan alone**, across **350+ factories** and **30 countries**.<sup><a href="#source-2">[2]</a></sup> That is not a normal chip launch footnote. That is NVIDIA reminding customers that AI infrastructure is now a procurement, facilities, power, cooling, manufacturing, and operations problem.

The platform also makes security an infrastructure feature. Vera Rubin combines rack-scale confidential computing with BlueField-4 DPUs and DOCA enforcement across the stack, aiming to protect data, agents, context memory, and inference at the factory level.<sup><a href="#source-2">[2]</a></sup> If enterprise AI becomes a fleet of long-running agents touching sensitive systems, that matters more than a benchmark slide.

**350+** — Factories in the Vera Rubin ramp


## Local Agents: RTX Spark And DGX Station Move Onto Windows

The second story was local agents. NVIDIA and Microsoft announced RTX Spark, a Windows PC platform for personal AI agents, built around a Blackwell RTX GPU, **6,144 CUDA cores**, fifth-generation Tensor Cores with FP4, NVLink-C2C, and a **20-core Grace CPU**.<sup><a href="#source-3">[3]</a></sup>

The numbers are intentionally aggressive for a PC: up to **1 petaflop** of AI performance and **128GB** of unified memory. NVIDIA says RTX Spark can run **120B-parameter LLMs** with up to **1 million tokens** of context locally, render **90GB+** 3D scenes, edit **12K 4:2:2** video, generate **4K** AI videos, and play AAA games at **1440p** above **100 frames per second**.<sup><a href="#source-3">[3]</a></sup>
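A quick back-of-the-envelope check shows why low-precision formats matter for the 120B-in-128GB claim. The sketch below counts weight storage only, using standard bytes-per-parameter values. It ignores KV cache, activations, and operating system overhead, and the source does not say which precision NVIDIA assumes for the claim.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "fp4": 0.5}


def weight_footprint_gb(params_billions: float, precision: str) -> float:
    """Approximate weight storage in decimal GB (1 GB = 1e9 bytes)."""
    return params_billions * BYTES_PER_PARAM[precision]


def fits(params_billions: float, precision: str, memory_gb: float) -> bool:
    """True if the weights alone fit; real deployments need extra headroom."""
    return weight_footprint_gb(params_billions, precision) < memory_gb


for precision in BYTES_PER_PARAM:
    gb = weight_footprint_gb(120, precision)
    print(f"120B @ {precision}: {gb:.0f} GB of weights, "
          f"fits in 128 GB: {fits(120, precision, 128)}")
```

At FP16 the weights alone need 240 GB, at FP8 they need 120 GB with almost no headroom, and at FP4 they need 60 GB. Only the FP4 case leaves meaningful room for long-context state in 128GB of unified memory.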

The real story isn't that laptops got faster. It is that NVIDIA and Microsoft are trying to define a new endpoint class for agents. OpenShell and Windows security primitives are the important layer here. NVIDIA describes OpenShell as a runtime that gives users policy control, routes queries based on privacy settings, and disguises personal information before cloud calls when needed.<sup><a href="#source-3">[3]</a></sup>

That is not a gaming feature. That is a trust primitive for always-on agents.
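The pattern described here, routing by privacy policy and masking personal data before a cloud call, can be sketched in a few lines. This is not the OpenShell API, which the source does not detail. The `Policy` class, the `route` function, and the two regular expressions are hypothetical stand-ins, and real PII detection needs far more than two patterns.

```python
import re
from dataclasses import dataclass

# Simplistic illustrative patterns; production PII detection is much broader.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


@dataclass
class Policy:
    """Hypothetical user policy; not an OpenShell data structure."""
    allow_cloud: bool = True
    redact_before_cloud: bool = True


def redact(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


def route(query: str, sensitive: bool, policy: Policy) -> tuple[str, str]:
    """Return (destination, payload) for a query under the given policy."""
    if sensitive or not policy.allow_cloud:
        return "local", query  # sensitive queries never leave the device
    payload = redact(query) if policy.redact_before_cloud else query
    return "cloud", payload


print(route("Email ana@example.com about the launch", False, Policy()))
print(route("Summarize my lab results", True, Policy()))
```

The design point is that the policy decision sits in front of the model call, so the same agent code can run against a local or a cloud model without handling privacy logic itself.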

Nemotron 3 Ultra fills the missing model slot in that story. NVIDIA described the new model as a **550-billion-parameter** mixture-of-experts model for long-running agents, with up to **5x faster inference** and up to **30% lower cost** for complex agentic tasks compared with open frontier models in its class.<sup><a href="#source-13">[13]</a></sup> The model card makes the architecture more interesting: Nemotron 3 Ultra is a hybrid LatentMoE model with interleaved Mamba-2 and MoE layers, select attention layers, Multi-Token Prediction layers, **55B active parameters**, **550B total parameters**, roughly **20T** pretraining tokens, and support for up to **1M tokens** of context.<sup><a href="#source-14">[14]</a></sup>
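The gap between active and total parameters is what makes the inference claims plausible. The sketch below uses the two parameter counts from the model card plus a common rule of thumb, roughly 2 FLOPs per participating weight per token, which is an assumption here. It ignores attention and Mamba-2 state costs, routing overhead, and memory bandwidth.

```python
TOTAL_PARAMS = 550e9   # total parameters, from the model card
ACTIVE_PARAMS = 55e9   # active parameters per token, from the model card


def active_fraction(active: float, total: float) -> float:
    """Share of the model's weights that participate in each token."""
    return active / total


def approx_forward_flops_per_token(params: float) -> float:
    # Rule of thumb (assumption): ~2 FLOPs per participating weight per token.
    return 2.0 * params


sparse = approx_forward_flops_per_token(ACTIVE_PARAMS)
dense = approx_forward_flops_per_token(TOTAL_PARAMS)
print(f"active fraction: {active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS):.0%}")
print(f"approx. compute vs. a dense 550B model: {sparse / dense:.0%}")
```

Under these assumptions each token touches about a tenth of the weights. Sparsity cuts per-token compute, not storage: all 550B parameters still have to be held somewhere in memory.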

The significance is not another open model. The company is trying to make the agent stack vertically legible: Nemotron for the model, NemoClaw for blueprints and harness integration, OpenShell for policy and privacy controls, CUDA-X as callable skills, RTX Spark for the local endpoint, and AI factories for production scale. The telling detail is that Nemotron 3 Ultra was post-trained for agent platforms and harnesses including Hermes Agent, LangChain Deep Agents, OpenClaw, OpenHands, and OpenCode.<sup><a href="#source-13">[13]</a></sup> That makes it less like a leaderboard drop and more like a distribution strategy.

**55B** — Active parameters in Nemotron 3 Ultra


DGX Station for Windows pushes the same idea upmarket. It uses the GB300 Grace Blackwell Ultra Desktop Superchip, connecting a Blackwell Ultra GPU to a **72-core Grace CPU** via NVLink-C2C.<sup><a href="#source-4">[4]</a></sup> NVIDIA says it can run AI models of up to **1 trillion parameters** locally, with up to **748GB** coherent memory, up to **20 petaflops** of FP4 performance, and ConnectX-8 networking up to **800Gb/s**.<sup><a href="#source-4">[4]</a></sup>

This is the PC as a staging ground for the AI factory. Developers build and validate agents locally. Enterprises keep sensitive workflows under Windows governance. Heavy production scales to the factory. While competitors are still debating cloud versus edge, NVIDIA is trying to make local, deskside, and cloud feel like one product family.

## Taiwan: The Venue Was The Message

Taiwan was not backdrop. It was product strategy.

NVIDIA and TSMC announced that TSMC is using NVIDIA accelerated computing and AI across semiconductor design and manufacturing, including lithography, transistor and process simulation, advanced process control, fab operations optimization, defect inspection, and virtual fab planning.<sup><a href="#source-5">[5]</a></sup>

The numbers are revealing. TSMC is using NVIDIA cuLitho for computational lithography, which NVIDIA says delivers a **20-50%** improvement in cost effectiveness or cycle time compared with CPU-based computational lithography at the same cost of ownership.<sup><a href="#source-5">[5]</a></sup> TSMC is also using cuEST for **50x faster** chemistry simulations on average, cuML for large-scale process analytics, H200 GPUs for scheduling computation, Metropolis and TAO for nanometer-scale defect inspection, and Omniverse libraries to build FabTwin, a virtual fab environment.<sup><a href="#source-5">[5]</a></sup>

Here is the uncomfortable truth: the company supplying the GPUs is also trying to optimize the fabs that manufacture them. That loop is not accidental. NVIDIA is positioning itself inside semiconductor manufacturing, not only downstream of it.


The TSMC announcement also changes how to read NVIDIA's AI factory story. If the company can sell accelerated computing into fabs, then AI factories are not only for model companies. They are also for the industrial base that manufactures the future hardware cycle.

While competitors pitch cheaper accelerators, NVIDIA was showing a more expansive thesis: the same accelerated computing stack can optimize chip design and fabrication, run agent workloads, power local PCs, simulate robots, and deploy physical AI. That is a far more ambitious claim than "we have the next GPU."

## Physical AI: Robots Became A Developer Stack

The physical AI part of GTC Taipei was easy to misread because it involved robots, cars, and demos. The important part was the developer infrastructure behind them.

NVIDIA launched Cosmos 3 as an open world foundation model for physical AI, built on a mixture-of-transformers architecture that combines vision reasoning, world generation, and action prediction.<sup><a href="#source-6">[6]</a></sup> NVIDIA describes it as a fully open omnimodel that can understand and generate text, images, video, ambient sound, and actions, with the goal of reducing physical AI training and evaluation cycles from months to days.<sup><a href="#source-6">[6]</a></sup>

Then NVIDIA released open-source physical AI agent skills and tools spanning Omniverse, Cosmos, Alpamayo, Metropolis, Isaac, and Jetson, turning physical AI workflows into agent-executable tasks.<sup><a href="#source-7">[7]</a></sup> The examples are not hypothetical. NVIDIA says Pegatron reduced model training and deployment time by **67%** using synthetic data from a Defect Image Generation skill, Delta Electronics improved defect detection rate by **17%**, Inventec reduced laptop chassis defect-data collection effort by **30%**, and Foxconn boosted first-pass yield by about **3%**.<sup><a href="#source-7">[7]</a></sup>

That is the key transition. Physical AI is not just a robot running a model. It is a data flywheel: reconstruct scenes, generate synthetic examples, train policies, simulate behavior, validate safety, deploy to edge compute, and collect more data.


The Isaac GR00T reference humanoid design made that stack tangible. NVIDIA announced an open humanoid reference design combining a Unitree H2 Plus humanoid, Sharpa tactile five-finger hands, Jetson Thor onboard compute, and Isaac GR00T software.<sup><a href="#source-8">[8]</a></sup> The robot stands nearly **6 feet** tall, weighs **150 pounds**, reaches **75 degrees of freedom** across body and hands, and uses Jetson AGX Thor T5000 with **2,070 FP4 teraflops**, a **14-core Arm CPU**, **128GB** unified memory, and a **40-130 watt** configurable power range.<sup><a href="#source-8">[8]</a></sup>

On the vehicle side, Alpamayo 2 Super extends the story into autonomous driving. NVIDIA says the open model is a **32-billion-parameter** reasoning VLA model for level 4 robotaxi development, scaling from prior **10-billion-parameter** generations and designed as a teacher model that can be distilled into smaller models running on DRIVE AGX Thor inside vehicles.<sup><a href="#source-9">[9]</a></sup> The broader DRIVE Hyperion announcement tied Foxconn, VinFast, Uber, Autobrains, and HUMAIN into a level 4-ready robotaxi ecosystem.<sup><a href="#source-10">[10]</a></sup>

This is the same pattern again: model, simulator, reference stack, edge compute, partners, and deployment channel. The real story isn't one humanoid or one robotaxi program. It is NVIDIA industrializing the path from simulated world to physical machine.

## The Competitive Read: NVIDIA Is Building Around The CUDA Threat

Every year, the market asks the same question: can someone break NVIDIA's GPU dominance? It is a fair question, but GTC Taipei suggests NVIDIA is not waiting for the answer. It is making the question harder.

If the contest is only matrix multiplication, competitors have room. AMD can improve hardware. Hyperscalers can build internal accelerators. Startups can attack inference niches. Model labs can optimize around cheaper silicon. The stack can fragment.

NVIDIA's answer is to move the battlefield. It wants the buyer to evaluate rack architecture, networking, software, security, local-agent endpoint deployment, simulation pipelines, manufacturing capacity, and partner readiness together. That makes the benchmark table less decisive.


There is still risk. NVIDIA's ambition can become complexity. AI factories are expensive. Local agent PCs need compelling software. Physical AI needs safety, reliability, regulatory patience, and much better real-world data. The company can announce reference stacks faster than customers can operationalize them.

But that is the point of the Taipei strategy. NVIDIA is trying to compress the distance between announcement and deployment by carrying the ecosystem with it: OEMs, fab operators, hospitals, robot companies, AV networks, and cloud partners.

What happened at GTC Taipei was not a single announcement. It was a map.

Vera Rubin turned AI infrastructure into a factory. RTX Spark and DGX Station turned agents into a Windows endpoint problem. Nemotron 3 Ultra gave that agent story an open model anchor. TSMC turned fab optimization into an accelerated computing workload. Cosmos 3 and the physical AI skills turned robot development into an agent-executable pipeline. DRIVE Hyperion and Isaac GR00T turned physical AI into reference platforms. Foxconn healthcare turned the whole idea into hospital operations.


> **The Key Insight**
>
> The most important announcement at GTC Taipei was not a specific model, robot, workstation, or rack. It was NVIDIA making the case that intelligence has a supply chain, and that the company intends to control as many layers of that supply chain as possible.


Let's be clear: this does not make NVIDIA unbeatable. It makes the challenge larger. A competitor now needs more than a better chip. It needs a credible answer for the factory, the endpoint, the model runtime, the developer workflow, the simulator, the security layer, and the deployment ecosystem.

That is why GTC Taipei matters. NVIDIA did not just show what it is building. It showed what it wants the AI industry to become. The GPU era is not ending. It is being absorbed into something bigger, more physical, and much harder to copy.


*Last updated: June 5, 2026*

---

*Source: [LLM Rumors](https://www.llmrumors.com/news/nvidia-gtc-taipei-ai-factory-stack)*
