Red Hat unveils AI Enterprise platform with Nvidia tie-up
Red Hat has launched an integrated platform called Red Hat AI Enterprise and rolled out updates across its AI portfolio. It also expanded its collaboration with Nvidia through a jointly engineered offering branded Red Hat AI Factory with NVIDIA.
The moves position Red Hat more squarely in the fast-developing market for production AI systems spanning infrastructure, model serving and emerging "agentic" workflows. Red Hat framed the announcements for organisations that have tested AI in pilots but have not standardised tools and processes across the business.
Red Hat AI Enterprise unifies the company's existing AI products, bringing together Red Hat AI Inference Server, Red Hat OpenShift AI and Red Hat Enterprise Linux AI. It is designed to deploy and manage AI models, agents and applications across the hybrid cloud.
Red Hat described the platform as a "metal-to-agent" stack, linking Linux and Kubernetes layers with inference and agent tooling. Red Hat OpenShift provides the Kubernetes foundation, while Red Hat Enterprise Linux underpins the operating system layer.
The platform includes capabilities for inference, model tuning and customisation, and agent deployment and management. It supports multiple model and hardware options across different environments, and emphasises operational consistency and security controls.
AI 3.3 update
Alongside the new platform, Red Hat introduced Red Hat AI 3.3, a set of updates across its AI portfolio. The release adds more model options and extends support across processor and accelerator roadmaps from Intel, Nvidia and AMD.
In the OpenShift AI Catalog, Red Hat has validated compressed versions of Mistral-Large-3, Nemotron-Nano and Apertus-8B-Instruct. The release also supports deployment of Ministral 3 and DeepSeek-V3.2 with sparse attention. Other updates include multimodal enhancements, a 3x Whisper speedup, geospatial support, improved EAGLE speculative decoding and enhanced tool calling for agentic workflows.
Red Hat AI 3.3 also introduces a technology preview of Models-as-a-Service, allowing IT teams to offer self-service access to privately hosted models through an API gateway. Red Hat positioned this as a way to centralise model access for internal users.
On hardware, Red Hat is adding a technology preview for generative AI support on CPUs, starting with Intel, aimed at small language model inference. It has also expanded hardware certification for Nvidia Blackwell Ultra and added support for AMD MI325X accelerators.
The release introduces a new Red Hat AI Python Index, described as a trusted repository of hardened versions of tools including Docling, SDG Hub and Training Hub. Red Hat also highlighted greater AI observability, providing telemetry across AI workloads, llm-d deployments and Models-as-a-Service cluster and model usage.
Another technology preview integrates NeMo Guardrails, giving developers a way to enforce operational safety and alignment across AI interactions.
Nvidia partnership
Red Hat AI Factory with NVIDIA combines Red Hat AI Enterprise with NVIDIA AI Enterprise. The companies said the joint offering provides an end-to-end option for organisations deploying AI at scale. It is supported on infrastructure from Cisco, Dell Technologies, Lenovo and Supermicro, according to the companies.
The offering includes inference, model tuning and customisation, and agent deployment and management. Red Hat and Nvidia also highlighted security, and said the joint platform streamlines management of traditional infrastructure alongside AI-specific components of the stack.
For model delivery, the companies said customers will have access to pre-configured models, including the IBM Granite family, Nvidia Nemotron and Nvidia Cosmos open models, delivered as Nvidia NIM microservices. Customers can also align models to enterprise data using Nvidia NeMo, they said.
The companies said the serving stack combines components including vLLM, Nvidia TensorRT-LLM and Nvidia Dynamo. Red Hat also highlighted observability functions and said the joint stack is designed to support AI service level objectives.
Red Hat AI 3.3 also adds a capability it described as internal GPU-as-a-Service, with orchestration and pooled hardware access. It includes automatic checkpointing for long-running training jobs, according to Red Hat.
Joe Fernandes, Vice President and General Manager, AI Business Unit, Red Hat, said: "For AI to deliver true business value, it must be operationalised as a core component of the enterprise software stack, not as a standalone silo. Red Hat AI Enterprise is designed to bridge the gap between infrastructure and innovation by providing a unified metal to agent platform. By integrating advanced tuning and agentic capabilities with the industry-leading foundation of Red Hat Enterprise Linux and Red Hat OpenShift, we are providing the complete stack - from the GPU-accelerated hardware to the models and agents that drive business logic. Additionally, with Red Hat AI 3.3 organisations can move beyond fragmented pilots to governed, repeatable and high-performance AI operations across the hybrid cloud."
"The shift from AI experimentation to industrial-scale, enterprise-wide production requires a fundamental change in how we manage the AI computing stack. We're accelerating the path to deploy AI and move quickly to production using Red Hat AI Factory with NVIDIA. With a stable, high-performance foundation driven by our proven hybrid cloud offerings, we're enabling our customers to own their AI strategy and scale with the same rigor they apply to their core IT platforms," said Chris Wright, Chief Technology Officer and Senior Vice President, Global Engineering, Red Hat.
Justin Boitano, Vice President, Enterprise AI Platforms, NVIDIA, said: "Enterprises are building AI factories that turn data into intelligence at scale during inference, requiring production-grade infrastructure and software that span the hybrid cloud. Red Hat AI Factory with NVIDIA provides the software foundation that helps organisations keep pace with rapid infrastructure innovation while reliably building and deploying the next generation of agentic AI applications."