Introducing AURA: Building an Open Agentic harness for production AI

In this blog, Mezmo's Henry Andrews discusses why AI agents need production infrastructure, and introduces AURA, Mezmo's open-source agentic harness, offering declarative composition, multi-agent orchestration, and deep observability.

Over the past decade, platform engineering has reshaped how organizations build and operate software.

Instead of managing infrastructure directly, teams created platforms that coordinate complexity behind the scenes. Kubernetes standardized orchestration. CI/CD pipelines automated delivery. Observability platforms made distributed systems understandable.

Each of these innovations introduced a new layer in the stack, one that allowed engineers to move faster because the platform handled coordination underneath.

AI has now entered that stack. And with it comes an opportunity to build the next platform layer.

Large language models have unlocked new ways for software to analyze systems, synthesize insights across telemetry, and automate operational workflows. But moving AI out of the lab and into the cluster exposes a massive gap in maturity. Custom Python scripts wrapping LLM APIs are not production infrastructure. If AI agents are going to investigate incidents, correlate signals, and execute diagnostic queries, they need to run on a harness engineered for reliability, standard interfaces, and multi-step reasoning.

At Mezmo, we believe the next generation of platforms requires purpose-built infrastructure for composing and orchestrating AI. That belief led us to build and open-source AURA.

The System of Context

To support operational workflows, AI systems need more than prompts and models. They need a way to dynamically understand how data, tools, and workflows relate to each other as an investigation unfolds. We refer to this architectural pattern as a System of Context.

A System of Context provides the intelligence layer that allows AI to reason about operational environments, tying together telemetry signals, operational runbooks, and infrastructure APIs. When this contextual information is accessible, AI can move beyond answering questions and begin participating meaningfully in operational workflows.

But a concept needs an engine. AURA is the open-source runtime and orchestration harness built to make this concept a reality.

AURA: The engine behind production AI

AURA is an open-source, Rust-based agent harness. It tackles the immediate engineering hurdles that platform teams face when moving AI into real-world environments.

Rather than acting as a black-box service, AURA uses declarative TOML configuration to define complete agent workflows (model provider, system prompts, MCP tools, RAG pipelines, and orchestration topology) in files that can be version-controlled, reviewed, and deployed alongside the rest of your platform.

Here is what AURA actually implements under the hood:

1. Declarative agent composition

Platform engineers shouldn't be writing complex application logic just to wire a prompt to a tool. AURA allows you to define an entire agentic workflow in a single config.toml file. Your AI configurations can now be managed with the same version-control and review workflows as your Kubernetes manifests.
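
To make this concrete, a workflow definition might look something like the following. Every key and section name here is hypothetical, chosen to illustrate the shape of a declarative config rather than AURA's actual schema:

```toml
# Hypothetical config.toml -- key names are illustrative, not AURA's real schema.
[model]
provider = "anthropic"   # swap to "openai", "bedrock", "gemini", or "ollama"
name     = "claude-sonnet"

[agent]
system_prompt = "You are an SRE assistant. Investigate alerts methodically."

[[mcp_servers]]
name      = "metrics"
transport = "sse"
url       = "http://metrics-mcp.internal:8080/sse"

[orchestration]
topology       = "dag"
max_iterations = 3
```

Because the whole workflow lives in one file like this, a change to a prompt or a tool endpoint is just a reviewable diff.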

2. Multi-agent orchestration & execution persistence

Simple LLM calls fail at complex tasks. AURA provides a robust multi-agent orchestration module built around a DAG (Directed Acyclic Graph) executor. It supports dependency-aware parallel wave execution, quality evaluation loops, and iterative re-planning.
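
Dependency-aware wave execution is essentially repeated topological layering: at each step, every task whose dependencies are already satisfied runs in parallel as one "wave." A minimal sketch of that idea (illustrative only; AURA's actual executor is written in Rust):

```python
# Sketch of dependency-aware "wave" execution over a DAG (not AURA's code).
def execution_waves(deps: dict[str, set[str]]) -> list[set[str]]:
    """Group tasks into waves; each wave's tasks depend only on earlier waves."""
    remaining = {task: set(d) for task, d in deps.items()}
    done: set[str] = set()
    waves: list[set[str]] = []
    while remaining:
        # Every task whose dependencies are satisfied can run in parallel now.
        ready = {t for t, d in remaining.items() if d <= done}
        if not ready:
            raise ValueError("cycle detected: not a DAG")
        waves.append(ready)
        done |= ready
        for t in ready:
            del remaining[t]
    return waves

# Example: metrics and logs workers run in parallel, then synthesis waits on both.
plan = {"metrics": set(), "logs": set(), "synthesize": {"metrics", "logs"}}
waves = execution_waves(plan)  # two waves: {metrics, logs}, then {synthesize}
```

An evaluation loop fits naturally on top: after each wave, inspect the results and, if needed, extend the dependency map and re-plan.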

To ensure this complex reasoning isn't a black box, AURA features Execution Persistence. While workflows are scoped to the lifecycle of the request, AURA writes detailed execution artifacts (plans, prompts, responses, and tool call records) to disk per iteration. This provides deep, post-hoc observability into exactly how your orchestration solved a problem.
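
The persistence pattern itself is simple: serialize each iteration's artifacts into a per-request directory so the reasoning trail outlives the request. A rough sketch of the idea (the file layout and names here are invented, not AURA's actual on-disk format):

```python
import json
from pathlib import Path

def persist_iteration(run_dir: Path, iteration: int, artifacts: dict) -> Path:
    """Write one iteration's artifacts (plan, prompts, tool calls) as JSON."""
    run_dir.mkdir(parents=True, exist_ok=True)
    path = run_dir / f"iteration_{iteration:03d}.json"
    path.write_text(json.dumps(artifacts, indent=2))
    return path

# Example: record what iteration 1 planned and which tools it called.
out = persist_iteration(
    Path("/tmp/aura-run-demo"),
    1,
    {"plan": ["query metrics", "query logs"],
     "tool_calls": [{"name": "metrics.query"}]},
)
```

After an incident, the per-iteration files can be replayed in order to audit exactly what the orchestrator saw and decided.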

3. A Drop-In, OpenAI-Compatible API

Integrating AI into existing internal developer portals or ChatOps UIs has historically been painful. AURA solves this by exposing a standard /v1/chat/completions endpoint with SSE (Server-Sent Events) streaming. You can point your existing integrations (like LibreChat or OpenWebUI) directly at AURA.

Behind the scenes, AURA handles multi-provider routing, allowing you to seamlessly swap between OpenAI, Anthropic, AWS Bedrock, Gemini, or local Ollama models simply by updating your TOML configuration.

4. Open tool integration via MCP & seamless interoperability

AI becomes significantly more useful when it can interact with real systems. AURA features deep, first-class integration with the Model Context Protocol (MCP), supporting HTTP streamable, SSE, and STDIO transports.

More importantly, AURA acts as a universal translator between your operational tools and your foundational models. Providers like OpenAI enforce strict schema requirements and demand specific formatting for tool capabilities. AURA performs automatic schema sanitization, translating standard MCP tool definitions at discovery time into the exact formats these models require.
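
To illustrate what such sanitization involves (a simplified sketch, not AURA's actual translation logic): OpenAI's strict mode requires, among other things, that every object schema set `additionalProperties: false` and list all of its properties as `required`:

```python
# Simplified sketch of sanitizing an MCP-style JSON Schema for strict mode.
# Illustrative only; a real translation layer handles many more cases.
def sanitize_for_strict_mode(schema: dict) -> dict:
    schema = dict(schema)
    if schema.get("type") == "object":
        props = {k: sanitize_for_strict_mode(v)
                 for k, v in schema.get("properties", {}).items()}
        schema["properties"] = props
        schema["required"] = sorted(props)       # strict mode: every key required
        schema["additionalProperties"] = False   # strict mode: closed objects
    elif schema.get("type") == "array" and "items" in schema:
        schema["items"] = sanitize_for_strict_mode(schema["items"])
    return schema

tool_schema = {"type": "object",
               "properties": {"query": {"type": "string"},
                              "limit": {"type": "integer"}}}
clean = sanitize_for_strict_mode(tool_schema)
```

Doing this once, at tool discovery time, means every downstream request sees an already-compliant schema.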

This eliminates the need to write custom glue code or translation layers. You can plug MCP-compliant tools directly into these LLMs, and they simply work. AURA also manages the messy realities of production request lifecycles, safely handling graceful shutdowns, disconnect detection, and MCP cancellation propagation so runaway processes don't consume cluster resources.

5. Deep observability & custom streaming

Standard LLM streaming isn't enough for complex, multi-step workflows. AURA enriches its SSE stream with custom events like aura.tool_requested, aura.tool_complete, and aura.orchestrator.plan_created. This allows your front-end applications to build rich UIs that show users exactly what the agent is thinking and doing in real time.
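
Consuming such an enriched stream is straightforward: an SSE event carries an `event:` name followed by a `data:` payload. A minimal parser sketch (the event names come from the post; the payload shapes are invented for illustration):

```python
import json

def parse_sse(raw: str):
    """Yield (event_name, payload) pairs from a raw SSE stream."""
    for block in raw.strip().split("\n\n"):
        event, data = "message", None
        for line in block.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data = json.loads(line[len("data:"):].strip())
        yield event, data

# Example: a front end can switch on the event name to render progress.
stream = (
    "event: aura.tool_requested\n"
    'data: {"tool": "metrics.query"}\n\n'
    "event: aura.tool_complete\n"
    'data: {"tool": "metrics.query", "ok": true}\n\n'
)
events = list(parse_sse(stream))
```

A UI consuming this can show a spinner on `aura.tool_requested` and resolve it on `aura.tool_complete`, rather than leaving users staring at a silent token stream.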

Furthermore, AURA ships with native OpenTelemetry integration (via an otel feature flag) and an OpenInference exporter. It automatically generates rich spans for agent streams, LLM calls, and tool executions, allowing you to trace AURA’s decisions in tools like Arize Phoenix or your existing APM.

AURA in action: Incident response

To understand how this comes together, imagine a Kubernetes CPU spike alert triggering an investigation workflow in your developer portal.

Instead of routing to a basic LLM, the request hits AURA's OpenAI-compatible endpoint. AURA’s DAG orchestrator takes over. It delegates to a "Metrics Worker" (assuming an MCP server exists to connect it to Datadog) and a "Logs Worker" (using an MCP-connected Elastic instance).

The workers execute in parallel, querying the systems and returning their findings. An evaluation loop verifies the context, realizes it needs to check recent deployments, and dynamically re-plans to query GitHub. Finally, it synthesizes the root cause and streams the result to your defined destinations (PagerDuty, Slack, etc.), with every step, reasoning event, and tool call fully traced via OpenTelemetry.

Building the next platform layer

AI is quickly becoming part of the modern software stack. As that happens, platform teams have an opportunity to shape how these systems operate. By deploying a harness that explicitly manages standard APIs, declarative configuration, and advanced orchestration, you can transform AI from an experimental prototype into a dependable platform service.

AURA represents our first step toward that vision. By open-sourcing it, we hope to provide a reliable, well-engineered foundation that the platform engineering community can adopt, stress-test, and evolve.

It is time to move past isolated scripts and fragile API wrappers. If AI agents are going to become real participants in our systems, they need a harness designed with production realities in mind.
