Nvidia pushes Jetson as edge hub for open AI models
Nvidia is positioning its Jetson range as a go-to platform for running open-source generative AI models on edge devices, as developers move more speech, vision and robotics workloads out of data centres and into machines operating in the physical world.
Jetson systems are increasingly being used for models such as Nvidia Nemotron, Cosmos and Isaac GR00T, alongside community models including Qwen, Gemma, Mistral and OpenAI's gpt-oss. The Jetson family spans modules from Orin to the newer Thor platform.
Edge shift
Open models have typically run in data centres, where compute and networking are easier to scale. That approach can add latency and recurring compute costs that rise with usage. Edge deployments prioritise low latency and predictable behaviour, since systems interact with people and environments and often operate under tight power limits.
Nvidia also pointed to supply and integration constraints. Memory shortages have pushed up costs across parts of the industry. Jetson packages compute and memory in a system-on-module format, reducing the need for discrete components during product design and validation.
Nvidia described Jetson Orin Nano 8GB as an entry point for smaller generative AI models. Higher-end systems, including Jetson Thor, target real-time inference in industrial and robotic settings.
Industrial demos
One example is a Caterpillar demonstration shown at CES earlier this year. Caterpillar used a compact Cat 306 CR mini-excavator that weighs just under eight tons and fits inside a standard shipping container. The cab is small and the controls require training, making operator guidance a key target for automation and assistance.
In the demo, the Cat AI Assistant ran on Jetson Thor. It used Nvidia Nemotron speech models for voice interaction, while Qwen3 4B handled interpretation and response generation. Nvidia said the setup ran locally through vLLM and did not require a cloud connection.
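Nvidia did not publish the demo's code, but a fully local pipeline of this shape is easy to sketch with vLLM's Python API. The fragment below is illustrative only: the model id points at the publicly released Qwen3 4B weights, and the prompt stands in for transcribed operator speech rather than anything from Caterpillar's actual stack.

    # Minimal sketch of a fully local vLLM inference loop. Weights load
    # from local storage, so no cloud connection is needed at runtime.
    from vllm import LLM, SamplingParams

    llm = LLM(model="Qwen/Qwen3-4B")  # public Qwen3 4B weights
    params = SamplingParams(temperature=0.7, max_tokens=256)

    # In the demo, transcribed operator speech would arrive here.
    prompt = "The left track is slipping on wet clay. What should I check?"
    outputs = llm.generate([prompt], params)
    print(outputs[0].outputs[0].text)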
The in-cab assistant remains in development. Nvidia said it runs speech and language models locally alongside machine context, supporting operator guidance and safety features.
Developer tooling
Nvidia also highlighted community projects that package open models into local assistants. One example is OpenClaw, which Nvidia said runs on Jetson developer kits with models from 2 billion to 30 billion parameters, giving developers a way to build private, always-on assistants on-device without paying API usage fees.
Nvidia said a local assistant can handle tasks such as briefings, automation, code review and smart home control. The pitch reflects a broader trend in consumer and industrial applications, where sensitive data and response times can favour on-device processing over remote calls.
Robotics focus
Nvidia cited several robotics demonstrations as evidence that larger, more complex models can run on embedded systems. Franka Robotics presented its FR3 Duo dual-arm system running the Nvidia GR00T N1.6 model end-to-end onboard, from perception through motion, without task scripting. Nvidia said the policy executes locally.
In research, Nvidia's GEAR Lab has been developing a humanoid controller under the SONIC project. Nvidia said the project trains a controller on more than 100 million frames of motion-capture data and deploys the policy on a physical robot. The kinematic planner runs on Jetson Orin at about 12 milliseconds per pass, which fits inside the 20-millisecond cycle budget of the 50 Hz policy loop, according to Nvidia.
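Those figures leave roughly 8 milliseconds of headroom per cycle. The timing arithmetic can be shown with a minimal sketch, in which the planner and policy call is a placeholder rather than the SONIC code:

    import time

    PERIOD = 1.0 / 50.0  # 50 Hz policy loop = 20 ms per cycle

    def plan_and_act():
        # Stand-in for the kinematic planner (~12 ms per pass on Jetson
        # Orin, per Nvidia) plus policy inference and actuation.
        time.sleep(0.012)

    next_tick = time.monotonic()
    for _ in range(500):  # ten seconds of simulated control
        plan_and_act()
        next_tick += PERIOD
        # Sleep off the remaining ~8 ms to hold a steady 50 Hz cadence.
        time.sleep(max(0.0, next_tick - time.monotonic()))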
In the developer community, a team from the University of Illinois Urbana-Champaign's SIGRobotics club built a dual-arm matcha-making robot on Jetson Thor running the GR00T N1.5 model. Nvidia said the project took first place at an embodied AI hackathon.
At New York University's Centre for Robotics and Embodied Intelligence, researchers recently ran the YOR robot on Jetson Thor. Nvidia said the group used Nvidia Blackwell compute for heavy processing related to AI-driven movement, and reported early pick-and-place results with improved generalisation to new objects and greater robustness to scene variation.
Model mix
Nvidia is also using Jetson to argue for running multiple open models and frameworks on a single edge platform. It pointed to Jetson AI Lab as a hub for benchmarks and tutorials, and said Jetson supports frameworks including Nvidia TensorRT, llama.cpp, Ollama, vLLM and SGLang.
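Most of those frameworks serve models behind a local endpoint. As a hedged illustration, the snippet below queries a model through Ollama's default HTTP API on-device; the model tag is an example and assumes the weights have already been pulled onto the Jetson:

    # Query a locally served model via Ollama's HTTP API (default port
    # 11434). Everything stays on the device.
    import json, urllib.request

    payload = json.dumps({
        "model": "gemma3:4b",  # illustrative tag, pulled beforehand
        "prompt": "Summarise today's sensor log in two sentences.",
        "stream": False,
    }).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])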
Models referenced include Gemma 3, which Nvidia said is multimodal and supports a long context window on Jetson Thor, and OpenAI's gpt-oss-20B, which it said can run locally on Thor and Orin. Nvidia also listed Mistral's open model family, citing performance figures from vLLM containers on Jetson Thor, alongside model families such as Qwen 3.5, Cosmos, Nemotron, Isaac GR00T and Physical Intelligence's PI 0.5.
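For server-style deployments, vLLM also exposes an OpenAI-compatible endpoint. Assuming gpt-oss-20B has been launched locally with vLLM's serve command ("vllm serve openai/gpt-oss-20b"), a client on the same machine could query it as follows; the endpoint and placeholder key are vLLM defaults, and this is not Nvidia's benchmark setup:

    # Talk to a locally running vLLM server over its OpenAI-compatible API.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
    reply = client.chat.completions.create(
        model="openai/gpt-oss-20b",
        messages=[{"role": "user",
                   "content": "List three checks before powering a robot arm."}],
    )
    print(reply.choices[0].message.content)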
Nvidia also highlighted independent work using Jetson for agent systems. It pointed to Andrés Marafioti, a multimodal research lead at Hugging Face, who built an agentic AI system on Jetson AGX Orin that routes tasks across models and schedules work. It also cited Collabnix community developer Ajeet Singh Raina, who ran OpenClaw on Jetson Thor for a continuously running personal assistant that manages emails and calendars through a local gateway.
"Go to sleep. Everything will be ready by morning," said Marafioti.
Nvidia said developers can fine-tune open models into specialised agents and deploy them in physical AI systems on Jetson hardware.