The intelligence layer for production AI

Mezmo's Active Telemetry reduces millions of raw events into curated, context-rich signals. AURA, the open-source control plane on your infrastructure, orchestrates agents that get smarter with every incident. Together, they give your AI the right data and the framework to act on it.

Where do you want to start?

Pick your entry point

Pick the one that matches where you are. We have something for you at each step of the journey.
AURA

Single Agent

Pick a use case (incident triage, runbook RCA, or on-call assistant). Wire it up with a TOML config. Ship your first production agent in under an hour.

  • OpenAI-compatible with streaming SSE: Point LibreChat, OpenWebUI, or any existing frontend at it—zero adapter code.
  • LLM agnostic: OpenAI, Anthropic, Bedrock, Gemini, Ollama, etc.
  • MCP tool discovery at runtime: Datadog, PagerDuty, Slack, internal APIs—dynamic discovery, no code changes.
  • Pre-built agentic SRE workflows grounded in your runbooks: Triage agent fires first, passes curated context to RCA agent, remediation agent acts on confirmed root cause.

< 1 hr to running an agent

5 LLM providers

0 boilerplate

[llm]
provider = "anthropic"
api_key = "{{ env.ANTHROPIC_API_KEY }}"
model    = "claude-opus-4-6"

[agent]
name          = "Ops Assistant"
system_prompt = "You're an SRE assistant"
turn_depth    = 3

[mcp.servers.clickhouse]
transport = "http_streamable"
url       = "http://clickhouse-mcp:8000/mcp"
[mcp.servers.clickhouse.headers]
Authorization = "Bearer {{ env.MCP_TOKEN }}"

# Optional: Connect to Mezmo's MCP Server
[mcp.servers.mezmo]
transport = "http_streamable"
url       = "https://mcp.mezmo.com/mcp"
[mcp.servers.mezmo.headers]
Authorization = "Bearer {{ env.MEZMO_API_KEY }}"
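
Because the provider lives in a single [llm] table, swapping models is a config-only change. As a sketch, a local Ollama setup might look like the following — note the base_url key and the model name are illustrative assumptions, not documented AURA schema:

```toml
[llm]
provider = "ollama"
# Assumed key: where your local Ollama server listens (default port 11434)
base_url = "http://localhost:11434"
# Illustrative model name; use any model you have pulled locally
model    = "llama3.1:8b"
# No API key needed for a local Ollama instance
```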

AURA

Agent Team

One agent handled one job. Now coordinate a team of specialized agents to triage, investigate, and remediate with an orchestrator managing handoffs.

  • Multi-agent orchestration: Specialized workers coordinated by an orchestrator agent for complex, multi-step investigations.
  • Safety controls: turn_depth, streaming timeouts, graceful shutdown, backpressure. Human-in-the-loop approval gates before any remediation action.
  • OpenTelemetry + OpenInference tracing: Full audit trail across every agent—plans, prompts, tool calls, handoffs. Egresses to Arize Phoenix, Jaeger, Datadog, Mezmo.
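
As a rough sketch of how the tracing bullet above could translate to config — the [tracing] table and every key in it are hypothetical illustrations, not AURA's documented schema; check the project docs for the real option names:

```toml
# HYPOTHETICAL sketch: table and key names are illustrative only
[tracing]
enabled  = true
# OTLP export to any OpenTelemetry-compatible backend
# (Arize Phoenix, Jaeger, Datadog, or Mezmo, per the bullet above)
exporter = "otlp"
endpoint = "http://jaeger:4317"
```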

15 → 5 min MTTR

60-80% toil eliminated

4 hrs → auto post mortem

# Orchestrator routes to specialist agents
[llm]
provider = "openai"
api_key = "{{ env.OPENAI_API_KEY }}"
model = "gpt-5.2"


[[vector_stores]]
name = "runbooks"
type = "qdrant"
url = "http://{{ env.QDRANT_HOST | default: 'localhost' }}:6334"
collection_name = "sre_runbooks"
context_prefix = "Operational runbooks covering incident response procedures, known failure modes, and troubleshooting guides"
embedding_model = { provider = "openai", model = "text-embedding-3-small", api_key = "{{ env.OPENAI_API_KEY }}" }


[agent]
name = "SRE Orchestrator"
system_prompt = """
You are an SRE Orchestrator. Decompose incident response tasks and delegate:
- incident-responder: PagerDuty incident lookup, alert details, oncall schedules
- metrics-analyst: Prometheus queries to validate alerts and check trends
- log-analyst: Log search, error patterns, timeline correlation


Maximize parallel execution when tasks have no data dependency.
"""
turn_depth = 15
temperature = 0.3


[mcp]
sanitize_schemas = true


[mcp.servers.pagerduty]
transport = "http_streamable"
url = "https://mcp.pagerduty.com/mcp"
headers = { Authorization = "Token token={{ env.PAGERDUTY_API_KEY }}" }
description = "PagerDuty MCP for incident details, oncall schedules, and alert status"


[mcp.servers.prometheus]
transport = "http_streamable"
url = "http://{{ env.PROMETHEUS_MCP_HOST | default: 'localhost' }}:8080/mcp"
description = "Prometheus MCP for querying system metrics"


[mcp.servers.log_analysis]
transport = "http_streamable"
url = "https://mcp.mezmo.com/mcp"
description = "Log analysis MCP for searching and correlating log events"


[orchestration]
enabled = true


[orchestration.worker.incident-responder]
description = "PagerDuty incident triage: fetch incident details, parse alerts, check oncall schedules"
turn_depth = 8
mcp_filter = [
  "list_incidents",
  "get_incident",
  "list_alerts_from_incident",
  "get_alert_from_incident",
  "list_services",
  "get_service",
  "get_current_time",
]
preamble = """
You are an Incident Responder. Use PagerDuty tools to fetch and parse incidents.
Extract: environment, alert category, severity, timestamp, metric value, RunBook URL, and triggering query.
Always use tools — do not fabricate incident data.
"""


[orchestration.worker.metrics-analyst]
description = "Prometheus metrics analysis: validate alerts, check trends, identify anomalies"
turn_depth = 20
mcp_filter = [
  "execute_query",
  "execute_range_query",
  "list_metrics",
  "get_current_time",
]
preamble = """
You are a Metrics Analyst. Query Prometheus to validate alerts, check trends, and identify anomalies.
Always get current time before range queries. Do not fabricate metric values.
Report query results clearly with metric names, labels, and values.
"""


[orchestration.worker.log-analyst]
description = "Log analysis: search logs, analyze error patterns, correlate events across time"
turn_depth = 20
vector_stores = ["runbooks"]
mcp_filter = [
  "analyze_logs_*",
  "deduplicate_logs_*",
  "get_correlated_timeline_*",
  "get_current_time",
  "get_log_histogram",
  "list_log_fields",
]
preamble = """
You are a Log Analyst. Search and analyze logs for operational investigations.
Search runbooks for known failure patterns when errors match documented scenarios.
Report findings with timestamps, error messages, and relevant context.
"""

Mezmo

Engineered Context

Already using LangChain, CrewAI, or your own framework? The bottleneck is the data going in. Mezmo is the context layer that makes any agent smarter.

  • Active Telemetry Pipeline: Deduplicate, cluster, enrich before agents see data. Up to 99.98% compression—every removed token saves inference cost.
  • Agent-optimized MCP server: Returns curated, task-scoped data—not raw firehose.
  • Just-in-time context delivery: Each workflow step gets precisely scoped data. Dynamic assembly as investigations unfold—not a dump of everything.

~$1 per investigation

99.98% data reduction

50-70% more efficient

AURA + Mezmo MCP (curated context)

[mcp.servers.mezmo]
transport = "http_streamable"
url       = "https://mcp.mezmo.com/mcp"
[mcp.servers.mezmo.headers]
Authorization = "Bearer {{ env.MEZMO_TOKEN }}"

# No local MCP server to run.
# Mezmo returns pipeline-processed signals,
# not raw API firehose.

# WITHOUT Mezmo (raw vendor MCP)
# → 2.4M tokens per investigation
# → 88% noise in context window
# → $30-36 per investigation
# → 14+ min MTTR

# WITH Mezmo pipeline + MCP
# → <1K curated signals
# → noise removed before agent sees it
# → <$1 per investigation
# → <5 min MTTR

Mezmo

Control your data

Many teams start here. OTel migration, cost reduction, vendor consolidation. Get your data under control first, then layer agents on top when you're ready.

  • Flexible telemetry routing: Ingest with OTel and route to Mezmo, Datadog, Grafana, Elastic, or S3. Migrate between destinations slowly or all at once.
  • Cost profiling: Identify high-volume, low-value streams. Cut observability spend up to 70%.
  • Proactive anomaly detection: Continuous monitoring for degraded signals and drift. Surface issues before they become incidents.

Up to 70% cost reduction

0 vendor lock-in

Proactive, not reactive

Examples

DevOps assistant
GitHub

Reviews PRs, explores repos, and manages code workflows.

Incident response agent
PagerDuty + Datadog

Triages alerts, pulls metrics, and correlates monitoring data.

Kubernetes SRE agent
K8s cluster operations + monitoring

Inspects workloads, queries metrics, and assists with cluster troubleshooting.

The platform

Mezmo as the brain, AURA as the hands.

AI agents are only as good as the data they reason on. Mezmo makes that data clean, structured, and ready. AURA turns it into action.
Mezmo is the data intelligence layer

Ingests, profiles, and understands telemetry in real time, with pipelines that can modify and alert in stream.

  • Easy to get started with over 100 integrations
  • In-stream parsing and enrichment with intent-based direction
  • One-click OTel migration
AURA is the orchestration layer

Open-source agentic harness that orchestrates AI workflows across your stack. Forever open source & production ready.

  • MCP-native tool connectivity, LLM agnostic
  • Self-correcting through a plan → execute → synthesize → evaluate loop
  • Custom agentic runbooks

The right data for your agents. Faster resolution for your team.

From millions of signals to one root cause. Your agents are only as good as their data. Mezmo and AURA handle both.

Explore more

Press
Why we need an open source system of context in the AI era

Press
Mezmo joins Agentic AI Foundation as a new member

Blog
AURA in Practice: Real-world use cases for production AI Agent infrastructure

eBook
Context Engineering for Observability - O'Reilly Report