
Red Hat AI 3 aims to streamline enterprise AI at production scale

Wed, 22nd Oct 2025

Red Hat has announced Red Hat AI 3, an updated version of its enterprise artificial intelligence platform, aimed at streamlining the development and deployment of AI workloads at scale for organisations.

The new platform integrates Red Hat AI Inference Server, Red Hat Enterprise Linux AI (RHEL AI) and Red Hat OpenShift AI, and introduces features designed to help companies move AI projects from proof-of-concept to production more efficiently. Key focuses include agentic AI systems, support for a range of hardware, and improved collaboration around AI-enabled applications.

Addressing enterprise AI challenges

Red Hat AI 3 targets several hurdles that enterprises often encounter when scaling AI initiatives, such as managing costs, addressing data privacy concerns and handling diverse AI models. According to the "GenAI Divide: State of AI in Business" report by the Massachusetts Institute of Technology NANDA project, approximately 95% of organisations are not seeing measurable financial returns despite around USD 40 billion in enterprise spending on AI. Red Hat believes its common platform can help CIOs and IT leaders achieve more consistency and value from accelerated computing technologies in complex, hybrid environments.

"As enterprises scale AI from experimentation to production, they face a new wave of complexity, cost and control challenges. With Red Hat AI 3, we are providing an enterprise-grade, open source platform that minimises these hurdles. By bringing new capabilities like distributed inference with llm-d and a foundation for agentic AI, we are enabling IT teams to more confidently operationalise next-generation AI, on their own terms, across any infrastructure," said Joe Fernandes, Vice President and General Manager, AI Business Unit at Red Hat.

Emphasising inference and scalability

With enterprises increasingly moving AI into production settings, Red Hat is shifting its focus from model training alone to inference, aiming to deliver scalable and cost-effective performance. Red Hat AI 3 combines performance gains from community projects such as vLLM and llm-d with Red Hat's own model optimisation capabilities to enhance large language model (LLM) serving in production environments.
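For readers unfamiliar with the vLLM serving layer the platform builds on, the minimal sketch below uses vLLM's offline Python API to load a model and generate text; the model ID and sampling settings are illustrative placeholders, not values specified by Red Hat.

```python
# Minimal sketch: local generation with vLLM's offline Python API.
# The model ID and sampling settings are illustrative placeholders.
from vllm import LLM, SamplingParams

prompts = ["Summarise the benefits of distributed inference in one sentence."]
sampling_params = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=128)

llm = LLM(model="openai/gpt-oss-20b")  # any Hugging Face-style model ID
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    print(output.prompt)
    print(output.outputs[0].text)
```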

The Red Hat OpenShift AI 3.0 release makes llm-d generally available, enabling what the company describes as intelligent distributed inference on Kubernetes. With technologies including the Kubernetes Gateway API Inference Extension, NIXL (NVIDIA's low-latency data transfer library from the Dynamo project), and the DeepEP Mixture of Experts (MoE) communication library, the platform is designed to help customers lower costs, accelerate response times, streamline model deployment at scale, and provide flexible support across hardware platforms, including NVIDIA and AMD accelerators.

Ujval Kapasi, Vice President, Engineering AI Frameworks at NVIDIA, commented, "Scalable, high-performance inference is key to the next wave of generative and agentic AI. With built-in support for accelerated inference with open source NVIDIA Dynamo and NIXL technologies, Red Hat AI 3 provides a unified platform that empowers teams to move swiftly from experimentation to running advanced AI workloads and agents at scale."

Dan McNamara, Senior Vice President and General Manager, Server and Enterprise AI at AMD, stated, "As Red Hat brings distributed AI inference into production, AMD is proud to provide the high-performance foundation behind it. Together, we've integrated the efficiency of AMD EPYC processors, the scalability of AMD Instinct GPUs, and the openness of the AMD ROCm software stack to help enterprises move beyond experimentation and operationalise next-generation AI - turning performance and scalability into real business impact across on-prem, cloud, and edge environments."

Collaboration and unified experience

Red Hat AI 3 introduces capabilities meant to support collaborative AI initiatives and platform unification. Its Model as a Service (MaaS) component allows IT teams to centrally serve and manage models, and to give AI developers and applications on-demand access to them. The AI hub offers a curated model catalogue, lifecycle management features, and deployment configuration tools for OpenShift AI users. Gen AI studio supplies an environment for interactive prototyping, discovery, and prompt tuning; pre-validated and optimised open source models, such as OpenAI's gpt-oss, DeepSeek-R1, Whisper for speech-to-text, and Voxtral Mini for voice agents, are available for rapid deployment.
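To illustrate what on-demand access through a MaaS-style endpoint typically looks like, here is a minimal sketch using the OpenAI-compatible API that vLLM-based servers commonly expose; the base URL, token, and model name are assumptions for illustration, not Red Hat-documented values.

```python
# Minimal sketch: calling a centrally hosted model over an
# OpenAI-compatible API, as commonly exposed by vLLM-based servers.
# The base URL, API key, and model name are illustrative assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="https://models.example.internal/v1",  # hypothetical MaaS endpoint
    api_key="YOUR_TOKEN",
)

response = client.chat.completions.create(
    model="gpt-oss-20b",
    messages=[{"role": "user", "content": "Draft a one-line status update."}],
    max_tokens=64,
)
print(response.choices[0].message.content)
```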

The company positions these features as supporting both platform engineers and AI engineers to execute AI strategies on a consistent foundation. By including a toolkit for model customisation, based on InstructLab and open source libraries like Docling, Red Hat AI 3 aims to facilitate tasks such as ingesting unstructured documents, creating synthetic data, fine-tuning and evaluating models, and monitoring results for business accuracy.
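As a rough illustration of the document-ingestion step, the sketch below uses the open source Docling library's basic converter API to turn a PDF into Markdown that could feed synthetic data generation or fine-tuning; the input path is a placeholder, and this is not presented as Red Hat's specific pipeline.

```python
# Minimal sketch: converting an unstructured document with Docling so its
# text can feed downstream data-generation or fine-tuning steps.
# The input path is an illustrative placeholder.
from docling.document_converter import DocumentConverter

converter = DocumentConverter()
result = converter.convert("reports/quarterly_update.pdf")  # hypothetical file

markdown_text = result.document.export_to_markdown()
print(markdown_text[:500])  # preview the extracted content
```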

Operationalising agentic AI

The release highlights preparation for next-generation AI agents, with features supporting complex, autonomous application workflows. New additions include a Unified API layer based on Llama Stack, ensuring compatibility with industry protocols, and early support for the Model Context Protocol (MCP), designed to streamline model interactions with external tools.
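For context on what MCP standardises, the sketch below shows the shape of a JSON-RPC 2.0 tool-call request as described in the protocol's public specification; the tool name and arguments are invented for illustration, and real clients would normally go through an MCP SDK and a negotiated session rather than hand-built payloads.

```python
# Minimal sketch: the JSON-RPC 2.0 message shape MCP uses for tool calls.
# The tool name and arguments are illustrative; production code would use
# an MCP client SDK and a negotiated session rather than raw payloads.
import json

tool_call = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_tickets",  # hypothetical tool exposed by a server
        "arguments": {"query": "open incidents", "limit": 5},
    },
}
print(json.dumps(tool_call, indent=2))
```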

Mariano Greco, Chief Executive Officer at ARSAT, said, "As a provider of connectivity infrastructure for Argentina, ARSAT handles massive volumes of customer interactions and sensitive data. We needed a solution that would move us beyond simple automation to 'Augmented Intelligence' while delivering absolute data sovereignty for our customers. By building our agentic AI platform on Red Hat OpenShift AI, we went from identifying the need to live production in just 45 days. Red Hat OpenShift AI has not only helped us improve our service and reduce the time engineers spend on support issues, but also freed them up to focus on innovation and new developments."

Rick Villars, Group Vice President, Worldwide Research, IDC, observed, "2026 will mark an inflection point as enterprises shift from starting their AI pivot to demanding more measurable and repeatable business outcomes from investments. While initial projects focused on training and testing models, the real value - and the real challenge - is to operationalise model-derived insights with efficient, secure and cost-effective inference. This shift requires more modern infrastructure, data, and app deployment environments with ready to use production-grade inference capabilities that can handle real-world scale and complexity, especially as agentic AI supercharges inference loads. Companies that succeed in becoming AI-fueled businesses will be those who establish a unified platform to orchestrate these ever more sophisticated workloads in hybrid cloud environments, not just in silo domains."

Red Hat AI 3 is positioned to support organisations at various stages of their AI journey, prioritising inference scalability, cross-team collaboration, agentic workflows, and support for diverse infrastructure, including both AMD and NVIDIA hardware platforms.
