NVIDIA launches AI Foundry & NIM microservices for enterprises

Thu, 25th Jul 2024

NVIDIA has announced the launch of two offerings, the NVIDIA AI Foundry service and NVIDIA NIM inference microservices, designed to enhance generative AI applications for enterprises worldwide. The initiative features the newly introduced Llama 3.1 collection of models.

NVIDIA AI Foundry provides a platform for enterprises and national bodies to create custom "supermodels" tailored for industry-specific applications using the Llama 3.1 models in conjunction with NVIDIA's computational resources and expertise. These custom models can be trained with both proprietary data and synthetic data generated from Llama 3.1 405B and the NVIDIA Nemotron Reward model.

The service is powered by the NVIDIA DGX Cloud AI platform, co-engineered with major public cloud providers. This collaboration aims to equip enterprises with scalable computational resources to meet evolving AI demands. The new service is introduced at a time when enterprises and nations are exploring AI strategies to create domain-specific language models for generative AI tools reflective of their unique environments.

"Meta's openly available Llama 3.1 models mark a pivotal moment for the adoption of generative AI within the world's enterprises," said Jensen Huang, founder and CEO of NVIDIA. "Llama 3.1 opens the floodgates for every enterprise and industry to build state-of-the-art generative AI applications. NVIDIA AI Foundry has integrated Llama 3.1 throughout and is ready to help enterprises build and deploy custom Llama supermodels."

Echoing this sentiment, Mark Zuckerberg, founder and CEO of Meta, stated, "The new Llama 3.1 models are a super-important step for open source AI. With NVIDIA AI Foundry, companies can easily create and customise the state-of-the-art AI services people want and deploy them with NVIDIA NIM. I'm excited to get this in people's hands."

NVIDIA has also made its NIM inference microservices for the Llama 3.1 models available through its website. These microservices are designed to deliver up to 2.5 times higher throughput when running inference compared to alternatives. Enterprises can use these microservices alongside new NVIDIA NeMo Retriever NIM microservices to develop advanced retrieval pipelines for AI-driven applications such as digital human avatars and AI assistants.

Accenture has been announced as the first adopter of NVIDIA AI Foundry. Using its Accenture AI Refinery framework, the consulting firm will create custom Llama 3.1 models both for internal use and for clients. Julie Sweet, chair and CEO of Accenture, noted, "The world's leading enterprises see how generative AI is transforming every industry and are eager to deploy applications powered by custom models. Accenture has been working with NVIDIA NIM inference microservices for our internal AI applications, and now, using NVIDIA AI Foundry, we can help clients quickly create and deploy custom Llama 3.1 models to power transformative AI applications for their own business priorities."

NVIDIA AI Foundry provides an end-to-end service combining NVIDIA software, infrastructure, and expertise. This service also integrates open community models and technology from the NVIDIA AI ecosystem. Enterprises can develop custom models using the NVIDIA NeMo platform, including the NVIDIA Nemotron-4 340B Reward model, which holds the top ranking on Hugging Face RewardBench.

The process allows enterprises to generate synthetic data using Llama 3.1 405B and Nemotron-4 340B to enhance model accuracy. Additionally, companies with their own training data can use NVIDIA NeMo for domain-adaptive pretraining to further refine model effectiveness.

The collaboration between NVIDIA and Meta extends to offering distillation recipes for Llama 3.1, enabling developers to build smaller, customised models for a wider array of AI infrastructure deployments, including workstations and laptops powered by NVIDIA RTX and GeForce RTX GPUs.

Several industry leaders, including Aramco, AT&T, and Uber, are among the initial users of the new NIM microservices for Llama 3.1. The early adopters span diverse sectors, such as healthcare, energy, financial services, retail, transportation, and telecommunications.

NVIDIA has emphasised that production support for these microservices is available through NVIDIA AI Enterprise, with free access for research, development, and testing soon to follow for members of the NVIDIA Developer Program.
