Now that AI can “think” for itself, New Relic’s head of AI and ML innovation shares how the company is finding success with AI agents.
Engineering leaders currently face a double-edged sword: they are tasked with advocating for profound innovations while carefully framing AI rollouts in a positive light.
“In many ways, the problems with adopting AI have been more of a people problem than a technology problem,” says Camden Swita, Head of AI and ML Innovation at the observability software vendor New Relic. He says success with new AI initiatives hinges on changing hearts and minds rather than mandating usage.
The agentic AI software trend is emerging as the next shift beyond generative AI, enabling AI agents to take on more automated tasks. This approach involves integrating large language models (LLMs) with richer context and real-world tools to retrieve data, conduct multi-step tests, and automate workflows. As a result, new agentic workflows are spreading at New Relic. “I’m starting to see more engineers perk their ears up,” says Swita.
However, success with AI agents involves tracking the right metrics, bridging deterministic and probabilistic AI, getting input from various domain experts, and discovering new optimizations. New Relic is doing all the above and more. And the result? The proof is in the time saved.
Measure success by time saved
Measuring AI’s success in software development is a moving target – some metrics provide more meaningful insights than others. Early in its AI journey, New Relic tracked raw usage metrics: the number of developers using AI tools, the number of times they used AI instead of checking the documentation, total onboarding time, and so on.
However, they’ve since shifted toward metrics that better reflect actual business outcomes. “More and more, we’re really focused on time saved as the key metric,” says Swita. For instance, using an AI agent to generate a series of GraphQL queries might slim down a two-and-a-half-hour process to thirty minutes – a much more meaningful developer productivity metric than raw usage data, since it captures the actual efficiency gain.
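In practice, the metric is simple bookkeeping: for each task, record a manual baseline and the AI-assisted duration, then sum the differences. A minimal sketch (the `TaskRecord` structure and field names are hypothetical, not New Relic’s):

```python
from dataclasses import dataclass

@dataclass
class TaskRecord:
    """One completed task, timed with and without AI assistance."""
    name: str
    baseline_minutes: float   # typical manual duration
    assisted_minutes: float   # duration with the AI agent

def total_time_saved(records: list[TaskRecord]) -> float:
    """Sum the minutes saved across all recorded tasks."""
    return sum(r.baseline_minutes - r.assisted_minutes for r in records)

# The GraphQL example from the article: 2.5 hours down to 30 minutes.
records = [TaskRecord("generate GraphQL queries", 150, 30)]
print(total_time_saved(records))  # 120 minutes saved
```

Tracking deltas per task, rather than counting tool invocations, is what lets the metric roll up into a business outcome.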
Defining AI agent goals gets you better accuracy
The traditional pattern with AI was a single, transactional interaction: one answer supplied per prompt, with limited memory or continuity between prompts. However, recent advancements in retrieval-augmented generation (RAG), frameworks like LangChain and LangGraph, and Microsoft’s continued AI innovations are making AI agents far more sophisticated.
AI agents now have better context, memory, and the ability to chain actions. These advances hinge on recent breakthroughs in agentic AI frameworks, the emergence of vector databases for long-term memory, and granting LLMs easier access to data within particular domains. LLMs are now better equipped to store and recall data, and to perform multi-step actions more effectively.
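The core loop behind these capabilities can be sketched in a few lines: a planner (a real LLM in production, a stub here) maps a goal to a chain of tool calls, and each result is appended to memory so later steps can build on earlier ones. All names below are illustrative, not from any specific framework:

```python
# Minimal agent loop: short-term memory plus a tool registry the planner
# can call across multiple steps. The planner is a stub standing in for
# an LLM; tool names and behavior are hypothetical.

def lookup_docs(query: str) -> str:
    """Stub tool: pretend to retrieve a documentation snippet."""
    return f"docs for '{query}'"

def run_query(query: str) -> str:
    """Stub tool: pretend to execute a query and return a result."""
    return f"result of '{query}'"

TOOLS = {"lookup_docs": lookup_docs, "run_query": run_query}

def plan(goal: str) -> list[tuple[str, str]]:
    """Stand-in for an LLM planner: map a goal to a tool-call chain."""
    return [("lookup_docs", goal), ("run_query", goal)]

def run_agent(goal: str) -> list[str]:
    """Execute each planned step, recording results in memory so later
    steps (or later turns) can reference earlier ones."""
    memory: list[str] = []
    for tool_name, arg in plan(goal):
        memory.append(TOOLS[tool_name](arg))
    return memory

print(run_agent("deployment errors"))
```

Frameworks like LangGraph formalize exactly this shape: state (memory) threaded through a graph of tool-calling nodes.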
Part of the updated recipe at New Relic is embracing AI agents to better attain specific goals, like reducing mean-time-to-resolution, automating common developer workflows, and surfacing better observability insights.
New Relic has taken this further by building a compound AI agent to streamline common tasks. Their system wraps an LLM around specific documented workflows, like checking signals after a deployment in a dashboard. Combining deterministic workflows and LLMs in this way grants the best of both worlds – you get better accuracy and user-friendliness, says Swita. Simplified access to reliable power with easy queries has ultimately cut down time wasted on laborious, repeatable tasks.
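The compound pattern described above separates the two halves cleanly: a deterministic, documented workflow does the checking, and the LLM only interprets the results. A hedged sketch of that split (function names, metrics, and thresholds are invented for illustration, not New Relic’s actual system):

```python
# Deterministic step + LLM interpretation, kept separate. The LLM layer
# is stubbed; in a real system it would generate the natural-language
# summary from the structured check results.

def check_deploy_signals(metrics: dict[str, float]) -> dict[str, bool]:
    """Deterministic step: apply fixed health checks to dashboard metrics."""
    return {
        "error_rate_ok": metrics["error_rate"] < 0.01,
        "latency_ok": metrics["p95_latency_ms"] < 500,
    }

def summarize_with_llm(checks: dict[str, bool]) -> str:
    """Stub for the LLM layer: turn check results into a readable summary."""
    failing = [name for name, ok in checks.items() if not ok]
    return "deploy looks healthy" if not failing else f"attention: {failing}"

metrics = {"error_rate": 0.002, "p95_latency_ms": 310}
print(summarize_with_llm(check_deploy_signals(metrics)))
```

Because the checks themselves are deterministic, the LLM can’t hallucinate a pass/fail verdict; it only phrases one.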
Experiment with data types
As New Relic experiments with AI, they’re also reaching unexpected conclusions. One realization is AI’s growing capability to reason based on images. For some use cases, image-to-text models can be more cost-effective and faster than throwing raw text at an LLM, says Swita.
For instance, they’ve found better results using an image-to-text model to understand and navigate the schema of a massive database than solely using text-based models. This aligns with Google’s latest research, which shows how multimodal AI models interpret and combine visual and time-based data to extract insights or make predictions.
Empower AI with data graphs
Making AI agents actually useful requires empowering them with actionable data. Yet for most enterprises, internal data is typically unstructured and siloed. “A lot of the issues that we’ve struggled with in terms of getting the most value out of these generative AI assistants really link all the way back to enterprise search problems,” says Swita. Since the AI agents can’t quickly find and understand the right data, trials have been hit or miss.
Businesses are starting to turn a corner, though, indexing data and making it more accessible with built-in RAG to bring context across software systems. This is opening up tangible benefits. Compared with a year ago, it’s now realistic for a developer to say that an AI agent genuinely saved them time on a complicated task, says Swita.
Ideally, an orchestrator agent (an agent that delegates tasks to other specialized AI agents) would have a structured map of an organization’s internal systems, expressed as graph queries. This would allow it to more intelligently access data and services throughout an organization. “The concept of ‘knowledge graphs’ is really key here,” says Swita, whose long-term goal is to structure them in a way that allows AI to instantly access information.
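At its simplest, such a map is a graph the orchestrator queries before delegating: which specialist agent owns which kind of data for which service. A toy sketch (the graph, service names, and agent names are all invented for illustration):

```python
# A tiny knowledge graph an orchestrator could consult before routing a
# task to a specialist agent. Real systems would use a graph database
# and graph queries; a nested dict is enough to show the idea.

GRAPH = {
    "checkout-service": {"logs": "logs-agent", "metrics": "metrics-agent"},
    "billing-service": {"logs": "logs-agent", "incidents": "incident-agent"},
}

def route(service: str, data_kind: str) -> str:
    """Orchestrator step: find which specialist agent owns this data."""
    try:
        return GRAPH[service][data_kind]
    except KeyError:
        return "unknown: fall back to enterprise search"

print(route("billing-service", "incidents"))  # incident-agent
```

The fallback branch mirrors the point above: without a structured map, the orchestrator is back to generic enterprise search.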
Specialized > bloated
There’s a trend in AI where companies want you to use their agent for everything, from code generation to gathering data and performing back-end mutations. While New Relic’s intelligence platform has a part to play in the broader ecosystem of AI agents, such as making recommendations or issuing alerts, it doesn’t need to be the end-all-be-all AI agent, says Swita.
Instead, he views the multi-agent approach as a smarter direction for most cases. For instance, New Relic turns to partners for things like code changes or incident escalation. This can be thought of as a best-of-breed philosophy. “We don’t need to be the center of the universe. We just want to build specialist agents.”
Buddy up with domain experts
Use cases for AI will differ across stakeholders, many of whom are skeptical. “When it comes to generative AI, the cat’s out of the bag,” says Swita. To set initiatives up for success, he recommends connecting leaders with subject matter experts to get buy-in and make sure their needs are met. “That’s unlocked quite a bit of success.”
Whether it’s distributed logs and tracing, debugging, or API management, “there is no better source than the people designing the tools,” says Swita. He recommends aligning with domain experts to ensure mutual understanding of goals.
Guidance is still necessary, though. While getting technical subject matter experts in the loop is beneficial, Swita admits it takes effort to educate business stakeholders on the gaps in generative AI. (Most assume LLMs are a magic genie without limitations or faults.)
Lead the culture from day one
“Will some roles become redundant? Of course,” says Swita. But is AI here to automate your job away? No. “It’s here to take all the bullcrap tasks you have to do time and time again,” he adds. When AI is positioned this way, folks tend to warm up to it.
Still, skepticism about accuracy and hallucinations abounds. When asked what he’d do differently starting this AI journey from scratch, Swita doubles down on quality checks. “I would have insisted and become maniacal about a culture of validation always, testing always, and automated assessment of these responses always.”
While New Relic has since begun allocating resources for assessing AI, he admits there were times when they were “flying somewhat blind.” For greenfield AI projects, ensuring the proper monitoring of AI results from day one is critical. “That DevOps rigor, even when working with LLMs, would have served us well.”
Change is a-coming
Overall, agentic AI is still at an early stage, notes Swita, with adoption strongest at the junior level and few senior engineers on board. It’s taken some high-level engineers time to turn the corner, and some are still operating “business as usual.”
However, as Swita says: “That will change rapidly very soon.”