Datadog launches domain-specific AI agents & LLM tools

Wed, 11th Jun 2025

Datadog has announced the addition of three domain-specific AI agents to its generative AI assistant, Bits AI, together with new tools for monitoring and managing large language model (LLM) and agentic AI deployments.

New AI agents

The company has introduced Bits AI SRE, Bits AI Dev Agent, and Bits AI Security Analyst, each configured for a specific engineering, operations, or security function. The agents are designed to support real-time incident response, DevOps tasks, and security workflows.

The AI agents operate on a shared system of core tasks, including data querying, anomaly analysis, and infrastructure scaling. This architecture allows Datadog to roll out new agents efficiently while maintaining consistency in the user experience. The system integrates a broad set of observability data, enabling precise insights and actions for managing risks within cloud-based applications.

Yanbing Li, Chief Product Officer at Datadog, commented on the company's approach:

Datadog is uniquely positioned to deliver value with AI as a platform that has a wealth of clean, rich data—we process trillions of data points and are embedded in our customers' critical engineering, developer and security workflows. With these advancements in AI reasoning and multi-modality, we've gone beyond helping organizations understand their availability, security, performance and reliability. We now enable human-in-the-middle workflows by guiding customers on what to look for and where to start looking, and augment their ability to take action.

Bits AI SRE, which is now in limited availability, acts as an on-call responder for incidents by performing early triage and providing investigation findings before human responders intervene. It allocates incidents, produces real-time summaries, and generates initial post-mortem drafts to save teams time.

Bits AI Dev Agent, currently in preview, identifies code issues, suggests fixes, and can open pull requests directly within the source control management systems organisations use. Bits AI Security Analyst, also in preview, automatically investigates cloud security signals, conducts in-depth threat investigations, and produces actionable resolution recommendations, aiming to reduce response times for security incidents.

Darren Trzynka, Senior Cloud Architect at Thomson Reuters, commented on Bits AI's impact:

At Thomson Reuters, we're focused on maximizing operational efficiency and accelerating innovation at scale through generative AI solutions. Bits AI allows operations and downstream platform teams to receive the full context of the investigation—from the initial monitor trigger to conclusion—driving down resolution time significantly freeing them up to do more.

Additional Applied AI features

The updates include two new features in preview. Proactive App Recommendations analyses telemetry collected by Datadog to suggest performance improvements or actions, such as optimising slow queries and addressing code issues, before users are impacted. The APM Investigator helps engineers troubleshoot latency spikes by automating bottleneck identification and recommending fixes.

LLM Observability suite announced

Datadog has also released a suite of tools designed to provide observability for agentic AI—software agents built with LLMs and similar technologies—in production environments. The new products include AI Agent Monitoring, LLM Experiments, and AI Agents Console.

Yrieix Garnier, Vice President of Product at Datadog, addressed the motivations behind these offerings:

A recent study found only 25 percent of AI initiatives are currently delivering on their promised ROI—a troubling stat given the sheer volume of AI projects companies are pursuing globally. Today's launches aim to help improve that number by providing accountability for companies pushing huge budgets toward AI projects. The addition of AI Agent Monitoring, LLM Experiments and AI Agents Console to our LLM Observability suite gives our customers the tools to understand, optimize and scale their AI investments.

AI Agent Monitoring, now generally available, provides a mapped overview of each agent's decision-making route, including inputs, tool calls, and outputs, displayed in an interactive graph. This enables engineers to diagnose latency spikes or unexpected behaviours and connect them to quality, security, and cost measures across distributed systems.

Mistral AI's Co-founder and CTO, Timothée Lacroix, provided further industry perspective:

Agents represent the evolution beyond chat assistants, unlocking the potential of generative AI. As we equip these agents with more tools, comprehensive observability is essential to confidently transition use cases into production. Our partnership with Datadog ensures teams have the visibility and insights needed to deploy agentic solutions at scale.

LLM Experiments, in preview, enables users to compare the effects of changes to prompts or models using datasets from live or uploaded sources. This aims to support quantifiable improvements in cost, response accuracy, and throughput, and prevent unintended regressions in AI application performance.

Michael Gerstenhaber, Vice President of Product at Anthropic, commented:

AI agents are quickly graduating from concept to production. Applications powered by Claude 4 are already helping teams handle real-world tasks in many domains, from customer support to software development and R&D. As these agents take on more responsibility, observability becomes key to ensuring they behave safely, deliver value, and stay aligned with user and business goals. We're very excited about Datadog's new LLM Observability capabilities that provide the visibility needed to scale these systems with confidence.

Datadog has also introduced AI Agents Console, currently in preview, to allow organisations to centrally oversee both in-house and third-party AI agents, track their usage and impact, and monitor for potential security or compliance issues as external agents are embedded into critical business workflows.

Armita Peymandoust, Senior Vice President, Software Engineering at Salesforce, said:

As enterprises scale digital labour, having clear visibility into how AI agents drive business impact has become mission critical. Customers are already seeing strong success with their AI deployments using Salesforce's Agentforce, which is built on a foundation of openness and trust. That foundation is further strengthened by our partner ecosystem that provides our customers even greater availability to tailored solutions that help them manage their AI agents confidently. Datadog's latest advances in deep observability will further support our vision and unlock another level of AI agent transparency and scale for organizations.
