Semantic Conventions for Agent-Ready Active Telemetry
Why semantic conventions matter in the AI era
Semantic conventions matter more in the AI era because AI systems don’t just “visualize” telemetry: they reason over it, join it, and increasingly act on it. If your telemetry isn’t standardized, you don’t just get messy dashboards… you get unreliable AI.
From raw telemetry to agent ready context
In traditional observability, telemetry is consumed mostly by humans, who:
- search logs
- inspect traces
- review metrics
- interpret symptoms
In the AI era, telemetry becomes machine-consumable context. That changes everything.
What “agent-ready context” actually means
Agents need telemetry to be:
- Structured (consistent keys + value types)
- Predictable (same meaning across systems)
- Joinable (attributes align across logs/traces/metrics)
- Complete enough to act (who/what/where/impact/risk)
Semantic conventions are what turn telemetry into a stable contract — i.e., a shared language that agents can use to:
- detect anomalies
- correlate signals across tools
- explain root cause
- recommend actions (or execute safe automations)
Without conventions, an agent spends its tokens and time doing translation:
- “is svc, service, service_name, and appName the same thing?”
- “is env=prod equivalent to environment=production?”
- “is host, hostname, node, instance describing the same resource?”
That’s not intelligence. That’s data cleaning.
The hidden tax of inconsistent attribute names
Most teams underestimate this tax because it’s spread across engineering time, tool spend, and operational inefficiency.
The “inconsistency tax” shows up everywhere:
1) Correlation failure
- logs have requestId
- traces use trace_id
- metrics carry no request or trace identifier at all
Result: correlation breaks → humans do manual joining.
2) AI accuracy degradation
AI depends on patterns. If attributes are inconsistent, AI models see multiple partial truths rather than one coherent dataset. That yields:
- false correlations
- missed incidents
- hallucinated or shallow RCA
3) Pipeline waste & cost inflation
Inconsistent names create accidental high-cardinality explosions:
- userId, userid, user_id become separate fields
- k8s.cluster.name vs cluster duplicates dimensions
- dashboards and queries multiply
4) Query complexity and brittleness
Instead of:
service.name="checkout"
you get:
(service="checkout" OR svc="checkout" OR serviceName="checkout" OR app="checkout")
That creates fragile detection logic and alert rules that silently miss real failures.
5) Governance and compliance risk
If “PII-ish fields” aren’t standardized, you can’t reliably:
- detect sensitive data
- redact it consistently
- enforce access controls
So yes — attribute inconsistency becomes a hidden operational liability.
What standardization unlocks for correlation, RCA, and cost control
Semantic standardization isn’t “nice to have.” It’s the multiplier that makes modern observability and AI-native operations possible.
A) Correlation that actually works (cross-signal + cross-tool)
With shared conventions:
- logs, traces, and metrics share consistent resource identity
- correlation is deterministic, not probabilistic
This enables:
- trace ↔ log pivoting without heuristics
- service map accuracy
- real dependency analysis (not guesswork)
B) Faster, more reliable RCA
When you standardize, your telemetry supports “explainability”:
- every event can be grounded to service + deployment + infra + request context
- errors can be grouped correctly
- blast radius can be calculated quickly
Meaning:
- fewer war rooms
- less “grep archaeology”
- more automatic root cause narratives that hold up under scrutiny
C) Cost control that doesn’t degrade insight
Standardization is a cost lever because it enables policy-based routing and reduction, safely.
When your attributes are consistent, you can implement rules like:
- route only log.level>=warn to hot storage
- keep full fidelity traces for payment service
- sample aggressively for low-risk endpoints
- dedupe known noisy sources
- quarantine verbose debug from specific deployments
Without conventions, those rules become unreliable and dangerous.
D) More powerful AI + agent workflows
This is the biggest unlock: semantic conventions are the bridge from observability to autonomy.
Standardization enables:
- “incident context bundles” (a clean package of signals)
- agent tool use (querying the right systems)
- runbook selection based on consistent labels
- automated remediation with confidence boundaries
In other words, semantic conventions turn telemetry into a control system, not just a visibility system.
In the AI era:
- Telemetry is no longer just for humans.
- Telemetry is context for machines.
- Machines require consistency to reason correctly.
So semantic conventions matter because they convert:
raw telemetry → reliable context → correlation & RCA → controlled cost → safe automation
What OpenTelemetry semantic conventions are
OpenTelemetry semantic conventions are the shared naming rules that make telemetry understandable and usable everywhere, not just inside one tool.
They define what you should call things (attributes, span names, metric names), how you should format them, and what units to use—so that a trace/log/metric produced by one team or vendor can be correctly interpreted by another.
OpenTelemetry semantic conventions are standardized patterns for describing telemetry data consistently across:
- Traces (spans)
- Metrics
- Logs
- Resources (the “thing producing telemetry”: service, pod, host, cloud resource)
- Events attached to spans/logs
Think of them as:
A vendor-neutral “dictionary” for telemetry fields and measurements.
This matters because without conventions, every team invents its own naming:
- service, svc, service_name, appName
- latency_ms, duration, response_time, elapsed
Semantic conventions reduce that chaos by giving a recommended canonical format.
Attributes, span names, metric names, and units
A) Attributes
Attributes are key-value pairs that provide context.
Semantic conventions standardize attribute names so tools and teams agree what a field means.
Examples:
- service.name (resource attribute)
- deployment.environment.name
- http.request.method
- http.response.status_code
- db.system, db.operation.name
- exception.type, exception.message
Why it matters:
- Enables reliable filtering, grouping, and joining across signals
- Makes correlation consistent (trace ↔ logs ↔ metrics)
B) Span names
Span names describe what the operation is.
OpenTelemetry conventions recommend:
- stable, low-cardinality names
- use operation-style names, not dynamic values
Examples:
Good:
- GET /checkout
- POST /payments
- SELECT orders
Bad (high-cardinality):
- GET /checkout?user=1234
- fetch order 981273
Why it matters:
- Span names drive aggregation, dashboards, and anomaly detection
- High-cardinality names destroy signal quality and cost efficiency
C) Metric names
Metric naming conventions define consistent, descriptive, portable metric names.
You’ll typically see:
- dot-separated names
- clear domain prefixes
- consistent suffix patterns
Examples:
- http.server.request.duration
- rpc.client.duration
- system.cpu.utilization
- db.client.connections.usage
Why it matters:
- Lets tools auto-detect metric meaning
- Improves out-of-the-box dashboards and SLOs
- Makes metrics portable across platforms
D) Units
Units are critical because AI systems and humans can’t safely compare measurements without them.
Semantic conventions standardize:
- the unit
- the type of measurement
Examples:
- request duration in seconds (s)
- payload size in bytes (By)
- CPU utilization as a ratio (unit 1)
- memory in bytes
Why it matters:
- Prevents errors like mixing ms vs s, MB vs MiB
- Enables cross-team comparisons and consistent alerting
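To tie these four pieces together, here is a minimal sketch using the OpenTelemetry Python API. The attribute keys, span name, metric name, and unit follow the conventions above; the service name, route, and recorded values are illustrative:

```python
from opentelemetry import trace, metrics

tracer = trace.get_tracer("checkout-service")
meter = metrics.get_meter("checkout-service")

# Metric name and unit follow the conventions: dot-separated name, duration in seconds ("s").
request_duration = meter.create_histogram(
    "http.server.request.duration",
    unit="s",
    description="Duration of inbound HTTP requests",
)

# Span name is low-cardinality (method + route template), never the raw URL.
with tracer.start_as_current_span("GET /checkout") as span:
    # Standardized attribute keys instead of ad-hoc names like "svc" or "latency_ms".
    span.set_attribute("http.request.method", "GET")
    span.set_attribute("http.route", "/checkout")
    span.set_attribute("http.response.status_code", 200)

    # Record the measurement with the same bounded dimensions.
    request_duration.record(
        0.042,
        attributes={"http.request.method": "GET", "http.route": "/checkout"},
    )
```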
Vendor neutral interoperability across tools and teams
This is the core promise of semantic conventions.
Without semantic conventions
Even if your telemetry is OpenTelemetry formatted, it may still be inconsistent:
- Team A uses env
- Team B uses environment
- Vendor C expects deployment.environment
Result:
- dashboards don’t work universally
- correlation breaks
- tool migrations become expensive
- AI models struggle to generalize across data sources
With semantic conventions
You get:
- portable dashboards
- consistent correlation keys
- shared runbooks
- standardized SLO inputs
- smoother interoperability between:
- collectors
- pipelines
- storage
- analytics tools
- alerting platforms
In practice this means semantic conventions are what allow:
“instrument once, analyze anywhere”
Semantic conventions vs schemas and why both exist
This is a super important distinction.
Semantic conventions = “meaning & naming rules”
They define:
- canonical attribute names
- span naming guidance
- metric naming and units
- recommended dimensions and event formats
Goal: shared language + consistent meaning.
Schemas = “versioned change management”
In OpenTelemetry, a schema is used to track and manage how conventions evolve.
Why schemas exist:
Semantic conventions change over time:
- attribute renamed
- meaning refined
- metric definition updated
- semantic group reorganized
A schema provides a versioned mapping so systems can:
- transform old telemetry into newer conventions
- keep data interpretable across versions
- support compatibility without breaking analysis
So, as a simple analogy:
- Semantic conventions = the dictionary
- Schemas = the dictionary edition + translation guide
Why this matters in the AI era
Agents and LLM-based tooling depend on clean, consistent semantics.
Semantic conventions help AI:
- correctly join signals across traces/logs/metrics
- avoid misinterpretation of attributes
- generalize across teams and environments
- automate safely (because meaning is stable)
Without conventions, AI spends effort on guessing what fields mean.
Stability, versions, and migration strategy
In agent-ready active telemetry, “stability, versions, and migration” is basically: how you evolve semantic conventions without breaking correlation, automations, or the agents that depend on consistent context.
OpenTelemetry tackles this with stability levels, versioned semantic convention releases, and explicit migration patterns (often opt-in + duplication) so production systems can roll forward safely.
Stable vs experimental vs deprecated conventions
Stable conventions
- Promise: names/meanings won’t change in a breaking way (backward compatibility expectations).
- What it enables: you can build durable detections, dashboards, RCA automations, and agent tools on top of them with confidence.
- Example area: the stabilized HTTP & networking conventions (with a defined migration plan because changes were breaking).
Experimental conventions
- Promise: useful, but still evolving—breaking changes are possible.
- Operational impact: if agents learn or your runbooks depend on these fields, you need a plan for churn (mapping/translation, feature flags, version pinning).
- The OTel project has explicitly called out that dependence on experimental semconv can “trap” instrumentations on pre-release paths, which is one reason stability work matters.
Deprecated conventions
- Promise: still “works,” but you’re being told to move off it because it may be removed later.
- Best practice: keep emitting/accepting them temporarily while you migrate (and mark as deprecated in generated libraries).
Versions: what “semconv vX.Y” means in practice
Semantic conventions are published in versioned sets (e.g., “Semantic Conventions 1.39.0” on the spec site). That version indicates the state of the naming/meaning recommendations at that point in time.
Why this matters for agent-ready telemetry:
- Your agents and pipelines become consumers of those names.
- If different services / languages emit different semconv versions, you’ll get split-brain context (e.g., some emit http.url, others emit url.full, etc.). The HTTP stabilization is a famous example of that kind of breaking rename.
So treat semconv versions like an API dependency:
- pin them
- roll them forward intentionally
- keep translation/mapping capabilities ready
Migration strategy: avoid breaking changes with opt-in + duplication
OpenTelemetry’s most explicit pattern (used for HTTP and also recommended for other promotions like code.*) is:
- Default stays old (no surprise break)
- Add an opt-in switch to emit the new stable conventions
- Offer a duplication mode to emit both old and new for a period
- Eventually, next major versions can drop the old and emit only stable
For HTTP, the recommended env var is:
- OTEL_SEMCONV_STABILITY_OPT_IN=http (emit only new stable)
- OTEL_SEMCONV_STABILITY_OPT_IN=http/dup (emit both old + new for phased rollout)
For code.* attributes, the migration guide recommends the same pattern (code and code/dup).
Why duplication is gold for “agent-ready active telemetry”
Duplication lets you:
- keep existing queries/rules/agents working
- validate that new fields populate correctly
- migrate downstream content (correlation rules, RCA prompts, feature stores) gradually
- measure drift (“what % of traffic has new fields?”)
In an active telemetry pipeline, you can also do duplication at ingestion time: map/rename fields to the target convention while optionally preserving originals for compatibility.
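As a sketch of that ingestion-time duplication in Python: the OLD_TO_NEW rename table below is a hypothetical subset, and the real list depends on which semconv versions you are bridging.

```python
# Hypothetical old -> new attribute renames for the semconv versions being bridged.
OLD_TO_NEW = {
    "http.url": "url.full",
    "http.method": "http.request.method",
    "http.status_code": "http.response.status_code",
}

def duplicate_to_stable(attributes: dict) -> dict:
    """Emit both old and new keys during the migration window (the "/dup" idea)."""
    out = dict(attributes)  # keep originals so existing queries, rules, and agents still work
    for old_key, new_key in OLD_TO_NEW.items():
        if old_key in out and new_key not in out:
            out[new_key] = out[old_key]
    return out

# Example: legacy span attributes gain the stable names without losing the old ones.
print(duplicate_to_stable({"http.url": "https://shop/checkout", "http.status_code": 500}))
```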
Where schemas fit: semantic conventions vs schema-based upgrades
When conventions evolve, you need a way to translate between old and new.
OpenTelemetry “Telemetry Schemas” exist to define versioned transformations so telemetry produced under older conventions can be upgraded to newer conventions (e.g., attribute renames) without changing every producer immediately.
Practical takeaway:
- Semantic conventions define what “correct” looks like
- Schemas define how to move from older → newer safely (a migration/translation layer)
For agent-ready context, schemas are your “don’t break the agent” safety net when the real world is messy.
How to keep multi language SDKs aligned
This is the part that quietly makes or breaks interoperability.
1) Generate constants from a single source of truth
OpenTelemetry has guidance for generating semantic convention libraries from the spec/registry, including how to handle deprecated items (so every language ships the same keys/metadata).
This reduces drift like:
- language A exports ATTR_URL_FULL
- language B still prefers http.url
- language C uses custom names
2) Standardize on a “target semconv version” org-wide
Pick a semconv version as your org baseline, and enforce it in:
- instrumentation dependencies
- collector/pipeline processors
- content (dashboards, alerts, agent tools)
3) Add contract tests in CI
Make it automatic:
- validate required attributes exist (service.name, deployment.environment.name, HTTP fields, etc.)
- validate units (seconds vs ms) and cardinality rules
- validate no “mystery aliases” creep in
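Here is a hedged sketch of such a contract test in Python, assuming pytest and the SDK's in-memory span exporter; the required-attribute set and the naming check are illustrative:

```python
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry.sdk.trace.export.in_memory_span_exporter import InMemorySpanExporter

REQUIRED_RESOURCE_KEYS = {"service.name", "deployment.environment.name"}

def test_spans_carry_required_semconv_fields():
    exporter = InMemorySpanExporter()
    provider = TracerProvider(
        resource=Resource.create(
            {"service.name": "checkout", "deployment.environment.name": "prod"}
        )
    )
    provider.add_span_processor(SimpleSpanProcessor(exporter))

    tracer = provider.get_tracer("contract-tests")
    with tracer.start_as_current_span("GET /checkout") as span:
        span.set_attribute("http.request.method", "GET")
        span.set_attribute("http.route", "/checkout")

    finished = exporter.get_finished_spans()[0]
    assert REQUIRED_RESOURCE_KEYS <= set(finished.resource.attributes)
    assert "http.request.method" in finished.attributes
    assert "?" not in finished.name  # no raw URLs / query strings in span names
```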
4) Use policy-driven pipelines for normalization
Even with perfect SDK alignment, you’ll have:
- legacy services
- third-party libraries
- random custom instrumentation
Active telemetry pipelines can normalize/rename/enrich to keep the agent-facing contract stable (this is where schemas + transforms shine).
5) Use Weaver (if you want “observability by design”)
OpenTelemetry Weaver is explicitly positioned to help teams define/validate/evolve conventions and keep them consistent and type-safe.
A simple, safe rollout playbook for agent-ready active telemetry
- Choose your target: “We standardize on semconv version X for agent context.”
- Turn on opt-in duplication (*/dup) where supported (HTTP, code.*), or duplicate via pipeline mapping.
- Update consumers first: dashboards, alert rules, RCA automation, and agents should accept new fields (and prefer them).
- Measure adoption: % of spans/logs with new stable fields.
- Flip to new-only once safe.
- Remove deprecated fields later (after retention window + consumer cleanup).
Resource semantic conventions
OpenTelemetry Resource semantic conventions define the standard attributes that describe what is producing telemetry (the entity), as opposed to what happened in a single request/span/log line.
In an AI / agent-ready world, resource conventions matter even more because they provide the stable identity layer that agents use to:
- group signals correctly
- correlate across tools
- reason about blast radius
- apply routing / sampling / cost policies safely
OpenTelemetry describes a Resource as an immutable representation of the entity producing telemetry as attributes.
A Resource is your telemetry’s identity envelope.
It answers:
- Which service is this?
- Where is it running?
- What environment / region?
- What host/container/process?
- Which SDK produced it?
The OpenTelemetry spec provides a dedicated set of resource semantic conventions for consistent naming across teams and vendors.
service.name as the foundation
service.name is the most important Resource attribute because it is the primary key for “who emitted this telemetry.”
It’s the anchor for:
- correlation (trace↔logs↔metrics)
- service maps
- SLOs and error budgets
- agent routing (“which runbook applies?”)
- cost allocation (“who generated volume?”)
OpenTelemetry docs reinforce using semantic conventions for resource attributes, and service.name is the key “service identity” component teams standardize first.
Best practice
- Keep service.name stable (do not include pod IDs, versions, random build hashes, etc.)
- Use other attributes for version / instance identity (see below)
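A minimal sketch of declaring that identity once, at SDK initialization (Python; the values are illustrative):

```python
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider

resource = Resource.create(
    {
        # Stable identity: no pod IDs, build hashes, or other volatile values here.
        "service.name": "checkout",
        "service.namespace": "shop",
        "service.version": "2.14.0",
        "deployment.environment.name": "prod",
        "cloud.region": "us-east-1",
    }
)

# Every span (and, with matching providers, every metric and log) inherits this identity.
trace.set_tracer_provider(TracerProvider(resource=resource))
```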
Key service, host, process, cloud, and telemetry attributes
Here are the most important categories of Resource attributes that make telemetry agent-ready (and portable).
A) Service identity
Core:
- service.name (the service)
Common supporting fields:
- service.namespace (grouping: org/team/domain)
- service.version (release version)
- service.instance.id (unique instance; used for per-instance differentiation)
B) Deployment / environment
- deployment.environment.name (e.g., prod, staging)
Notably, OpenTelemetry clarifies that deployment.environment.name does not affect the uniqueness constraints defined by service.namespace / service.name / service.instance.id, which matters for cross-env comparisons and portability.
C) Host & runtime placement
Used to tie telemetry back to infrastructure:
- host.name
- host.id
- (often alongside OS/runtime attributes depending on stack)
These are crucial for:
- infra↔service correlation
- node-level incident detection
- noisy neighbor / placement reasoning by agents
D) Process identity
For “what executable produced this?”
- process.pid
- process.executable.name
- process.command
- process.runtime.name / process.runtime.version (language/runtime)
Useful for:
- crash loops / restarts
- host-level attribution
- suspicious runtime drift
E) Cloud identity
This is how you make cloud correlation portable:
- cloud.provider
- cloud.account.id
- cloud.region
- cloud.availability_zone
These unlock:
- region-based incident correlation (“all errors in us-east-1”)
- cost attribution by account/region
- multi-cloud normalization
F) Telemetry SDK identity
These attributes help explain “why telemetry looks like it does”:
- telemetry.sdk.name
- telemetry.sdk.language
- telemetry.sdk.version
Extremely useful in practice for:
- debugging instrumentation gaps
- catching mixed semconv versions
- identifying agents/services emitting “nonstandard” fields
(These are part of the resource conventions set.)
Enrichment patterns that keep cardinality in check
Enrichment is where teams often accidentally create cardinality explosions that:
- increase cost
- slow queries
- reduce metric usefulness
- confuse AI/agents (too many distinct dimensions)
OpenTelemetry explicitly considers high-cardinality risk by using attribute requirement levels, including Opt-In for potentially high-cardinality attributes (especially in metrics).
Here are practical enrichment patterns that keep things tight:
Pattern 1: Put stable identity in Resources, volatile data in spans/logs
Resources should be mostly stable during a process lifetime (service, env, region, cluster).
Good Resource fields:
- service.name
- deployment.environment.name
- cloud.region
- k8s.cluster.name
Avoid as Resource fields:
- request IDs
- user IDs
- session IDs
- full URLs
- stack traces
Those belong in span/log attributes, not resource identity.
Pattern 2: Normalize values upstream (canonicalization)
Before storage:
- map synonyms → canonical attributes (env → deployment.environment.name)
- normalize casing (Prod → prod)
- normalize region names / cluster names
- enforce allowed value sets
This is huge for agents: it prevents “same thing, different spelling” syndrome.
Pattern 3: Controlled duplication for transition periods
When adopting new conventions, duplicate temporarily:
- emit new canonical attribute
- preserve old/custom attribute during migration window
- later drop the old
This avoids breaking dashboards, correlations, and agent tools while you move forward.
Pattern 4: Guardrails for metrics dimensionality
Metrics are the most sensitive to cardinality.
Rules of thumb:
- Metrics dimensions should be bounded and predictable
- If an attribute can take “infinite” values, don’t put it on metrics
- Keep high-cardinality detail for traces/logs only
This aligns with OTel’s guidance that high-cardinality attributes should be opt-in for metrics.
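One way to enforce this guardrail in code is an allow-list of metric dimensions; below is a sketch assuming the OpenTelemetry Python SDK's View support, with illustrative attribute keys:

```python
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import ConsoleMetricExporter, PeriodicExportingMetricReader
from opentelemetry.sdk.metrics.view import View

# Only these bounded attributes survive aggregation; anything else recorded on the
# measurement (user IDs, full URLs, session IDs) is dropped from the metric's dimensions.
duration_view = View(
    instrument_name="http.server.request.duration",
    attribute_keys={"http.request.method", "http.route", "http.response.status_code"},
)

provider = MeterProvider(
    metric_readers=[PeriodicExportingMetricReader(ConsoleMetricExporter())],
    views=[duration_view],
)
```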
Pattern 5: Tiered enrichment (progressive disclosure)
For agent-ready context, don’t attach everything everywhere.
Instead:
- Always include core identity on every signal (resource)
- Add richer context only where needed:
- error traces
- slow traces
- security-relevant logs
- sampled exemplars
This keeps cost controlled while still preserving full-fidelity context when it matters.
Why this is “agent-ready”
Agents need a consistent, low-noise identity layer to reason safely.
Resource semconv provides that layer by making sure telemetry always answers:
what service, where, what runtime, what cloud, what instrumentation — consistently across teams and vendors.
Trace semantic conventions
Trace semantic conventions are the OpenTelemetry “rules of the road” that make traces portable, comparable, and correlation-ready across services, languages, and tools.
They define:
- how to name spans (so they aggregate meaningfully)
- which attributes to attach (so tools/agents can interpret intent)
- how to represent common operations (HTTP, DB, messaging, etc.)
- what must be set early so sampling and routing decisions don’t discard critical context
What trace semantic conventions are (in plain terms)
Trace semantic conventions standardize the shape of a trace so that:
- a “GET /checkout” span looks like a “GET /checkout” span everywhere
- DB spans expose consistent fields (system, operation, statement, etc.)
- messaging spans expose consistent producer/consumer context
- AI spans are searchable, governable, and comparable (tokens, model, provider, status)
Without these conventions, traces become highly bespoke, and correlation/RCA devolves into custom parsing and heuristics.
Span naming and span kinds that enable comparability
A) Span naming conventions
Span names should be:
- low-cardinality
- operation-centric
- stable across requests
HTTP naming
Good:
- GET /orders
- POST /checkout
Bad:
- GET /orders?userId=123
- checkout for customer 555
Why this matters:
- Span names drive aggregation, dashboards, and anomaly detection
- High-cardinality span names explode storage + destroy comparability
- Agents can’t learn stable patterns if each request name is unique
B) Span kinds
Span kind describes the role of the span in a distributed interaction. Getting this right is huge for accurate service maps and latency attribution.
Common kinds:
- SERVER: the service received a request (e.g., inbound HTTP/RPC)
- CLIENT: the service sent a request (e.g., outbound HTTP/RPC)
- PRODUCER: the service published a message to a broker
- CONSUMER: the service processed a message from a broker
- INTERNAL: in-process work (functions, jobs, business logic)
Why span kind enables comparability:
- It tells tools/agents where latency “belongs”
- It enables correct dependency graphs
- It standardizes causality (who called whom)
HTTP, database, messaging, and AI workload attributes
The conventions define attribute sets per “domain.” Here are the big ones.
A) HTTP attributes
Use HTTP conventions to describe request/response consistently (across frameworks).
Commonly used attributes:
- http.request.method
- url.scheme, url.path, url.full (tooling varies; url.full is the modern convention)
- server.address, server.port
- http.response.status_code
- user_agent.original
- network.protocol.name / network.protocol.version
Why it matters:
- Comparable latency/error across services
- Consistent RED metrics extraction (Rate, Errors, Duration)
- Strong correlation between span + access logs
Cardinality warning
Avoid placing full query strings or user identifiers into attributes that become dimensions for metrics.
B) Database attributes
DB conventions standardize how DB calls are represented so a query span is consistent whether it’s Postgres, MySQL, MongoDB, etc.
Common attributes:
- db.system (postgresql, mysql, mongodb…)
- db.operation.name (SELECT/INSERT or equivalent operation)
- db.collection.name (for NoSQL)
- db.namespace (database/schema)
- server.address, server.port
Optional/high-risk attributes (use carefully):
- db.query.text (can be high-cardinality + may contain sensitive data)
Why it matters:
- Agents can identify N+1 patterns, slow queries, lock contention
- Helps separate “DB is slow” vs “service is slow”
- Enables portable DB dashboards
C) Messaging attributes
Messaging spans are often the difference between good and terrible distributed tracing in event-driven systems.
Key attributes:
- messaging.system (kafka, rabbitmq, sqs…)
- messaging.destination.name (topic/queue)
- messaging.operation (send/receive/process)
- messaging.message.id (careful: can be high-cardinality)
- messaging.message.conversation_id (if you have it)
Span kinds matter a lot here:
- PRODUCER for publish
- CONSUMER for process
- CLIENT/SERVER for request-reply messaging patterns
Why it matters:
- Lets you trace async workflows end-to-end
- Enables backlog/lag reasoning when combined with metrics
- Helps agents identify systemic broker vs consumer issues
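A small sketch of a producer-side publish span carrying those attributes and the PRODUCER kind (Python; the broker, topic, and span name are illustrative):

```python
from opentelemetry import trace
from opentelemetry.trace import SpanKind

tracer = trace.get_tracer("order-service")

# PRODUCER kind tells tools and agents that this latency belongs to the publish side.
# The span name stays low-cardinality: destination + operation, never a message ID.
with tracer.start_as_current_span("orders send", kind=SpanKind.PRODUCER) as span:
    span.set_attribute("messaging.system", "kafka")
    span.set_attribute("messaging.destination.name", "orders")
    span.set_attribute("messaging.operation", "send")
```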
D) AI workload attributes (GenAI / LLM tracing)
This is the newest and fastest-evolving category.
In “agent-ready telemetry,” AI spans should include:
Model + provider identity
- model name/version
- provider (OpenAI, Anthropic, AWS Bedrock, etc.)
Request intent
- operation type (completion, chat, embeddings, tool call)
- endpoint or capability
Usage + cost signals
- tokens in/out
- latency
- retries
- cost estimate (if you compute it)
Safety / governance
- policy decisions
- redaction applied
- error categories (rate limit, content filter, tool failure)
Why it matters:
- Makes AI workloads observable like any other dependency
- Enables cost-aware sampling/routing decisions
- Supports governance (“what data went to which model?”)
- Lets agents troubleshoot agents (tool loops, hallucination patterns, failure modes)
(These AI semantic conventions are still evolving quickly—many teams implement a consistent internal contract aligned to OTel patterns even if the official semconv are still stabilizing.)
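For example, here is a hedged sketch of such an internal contract on a span; the gen_ai.*-style keys mirror the emerging conventions but should be treated as illustrative until the official semconv stabilizes:

```python
from opentelemetry import trace
from opentelemetry.trace import SpanKind

tracer = trace.get_tracer("support-assistant")

# Illustrative internal contract for one LLM call; align keys with the official
# GenAI conventions as they stabilize.
with tracer.start_as_current_span("chat gpt-4o", kind=SpanKind.CLIENT) as span:
    span.set_attribute("gen_ai.operation.name", "chat")
    span.set_attribute("gen_ai.system", "openai")
    span.set_attribute("gen_ai.request.model", "gpt-4o")
    span.set_attribute("gen_ai.usage.input_tokens", 1200)
    span.set_attribute("gen_ai.usage.output_tokens", 350)
    # On failure, record a bounded error category rather than raw response text:
    span.set_attribute("error.type", "rate_limited")
```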
Sampling constraints and which attributes must be set early
This is critical and frequently missed.
Sampling (head-based) often happens:
- in SDKs
- at trace start
- before all attributes are known
So the attributes needed for:
- sampling decisions
- routing decisions
- PII handling
- policy enforcement
must be available early—ideally at span start, or even as resource attributes.
Attributes that must be set early (best practice)
Always early: identity
- service.name (resource)
- deployment.environment.name (resource)
- service.version (resource)
- cloud/cluster identity (cloud.region, k8s.cluster.name) if used for routing
Early for inbound request spans
- span kind = SERVER
- operation name (GET /route, POST /route)
- http.request.method
- http.response.status_code (available later, but add as soon as known)
- route template (low-cardinality) rather than raw URL
Early for governance
- tenant / customer tier (bounded values)
- data sensitivity classification (e.g., data.classification=restricted)
- auth principal type (service/user; not actual user id)
Why?
Because sampling often needs to keep:
- all errors
- high-value endpoints
- premium customers
- security events
If those fields arrive late, the trace may already be dropped.
Practical sampling rules that depend on early attributes
- Keep if http.response.status_code >= 500
- Keep if route is “checkout/payments”
- Keep if deployment.environment.name = prod
- Keep if ai.operation = tool_call and error occurred
- Sample 1% of success but 100% of failure
These require that:
- route + kind is correct
- environment is consistent
- operation is consistent
- error status is captured reliably
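A sketch of a head sampler that relies only on attributes available at span start, built on the Python SDK's sampling interface; the critical-route set, environment handling, and 1% ratio are illustrative:

```python
from opentelemetry.sdk.trace.sampling import (
    Decision,
    Sampler,
    SamplingResult,
    TraceIdRatioBased,
)

CRITICAL_ROUTES = {"/checkout", "/payments"}

class KeepCriticalRoutesSampler(Sampler):
    """Keep 100% of critical routes in prod, sample the rest at 1%."""

    def __init__(self, environment: str):
        self._environment = environment           # known at process start (resource-level)
        self._fallback = TraceIdRatioBased(0.01)  # default 1% sampling

    def should_sample(self, parent_context, trace_id, name, kind=None,
                      attributes=None, links=None, trace_state=None):
        attributes = attributes or {}
        # Only works if http.route is provided at span start.
        if self._environment == "prod" and attributes.get("http.route") in CRITICAL_ROUTES:
            return SamplingResult(Decision.RECORD_AND_SAMPLE, attributes)
        return self._fallback.should_sample(
            parent_context, trace_id, name, kind, attributes, links, trace_state
        )

    def get_description(self):
        return "KeepCriticalRoutesSampler"
```

You would pass an instance to TracerProvider(sampler=...); the decision runs before most attributes exist, which is exactly why route and environment must be known early.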
Trace semantic conventions turn traces from “custom debugging artifacts” into standardized operational data.
They make tracing:
- comparable across teams and services
- correlatable across logs/metrics
- machine-actionable for agents
- cost controllable (through predictable naming + bounded attributes)
Metric semantic conventions
Metric semantic conventions in OpenTelemetry are the standards that make metrics portable, comparable, and safe to aggregate across teams, SDKs, and vendors.
They define:
- metric names (what to call the measurement)
- required vs optional attributes (what dimensions should exist)
- units (so numbers mean the same thing everywhere)
- recommended instrument types (Counter, Histogram, Gauge, etc.)
In the AI era, metric semconv is what keeps your SLOs, dashboards, and agent decisions from becoming “looks right but wrong.”
What Metric semantic conventions are
Metric semantic conventions are documented recommendations in the OpenTelemetry spec for common domains, like:
- HTTP client/server
- RPC
- database
- messaging
- system/runtime
They ensure that “request latency” means the same thing in every service, not:
- latency_ms in one app
- duration in another
- http_time in a third
Naming rules and requirement levels
A) Naming rules
OTel metric names are designed to be:
- descriptive
- domain-scoped
- consistent across languages
- stable across time
Typical pattern:
<domain>.<area>.<measurement>
Examples:
- http.server.request.duration
- rpc.client.duration
- db.client.connections.usage
Naming matters because:
- vendors can ship out-of-the-box dashboards
- teams can write reusable alert rules
- agents can reason across services without custom mapping
B) Requirement levels
Metric semantic conventions include requirement levels for attributes and sometimes metrics themselves (i.e., what you should provide).
Common requirement levels:
- Required: must be present to claim compliance
- Recommended: should be present in most cases
- Opt-In: valuable but potentially costly/risky (often high-cardinality)
Why this exists:
Metrics are aggregation-first. A single bad attribute can:
- blow up cardinality
- increase cost
- make dashboards unusable
So OTel explicitly separates “safe default dimensions” vs “high-cardinality extras.”
Units and instrument types that prevent mismatched dashboards
This is one of the biggest practical wins of metric semantic conventions.
A) Units
Units prevent the classic dashboard trap:
- one service reports seconds
- another reports milliseconds
- charts look consistent but are totally wrong
Semantic conventions specify units like:
- duration: seconds (s)
- size: bytes (By)
- ratios: 1
- counts: {count}
- throughput: By/s, {count}/s
This makes dashboards portable and safe.
B) Instrument types
Metric semconv also aligns the measurement with the right instrument type:
- Counter: strictly increasing count
- examples: request count, error count
- UpDownCounter: value can increase/decrease
- examples: active requests, queue depth
- Histogram: distribution of values
- examples: request durations, payload sizes
- Gauge (via Observable instruments): sampled current value
- examples: CPU utilization, memory usage
Why it matters if you use the wrong instrument type:
- rates become nonsense
- percentiles can’t be computed
- dashboards become misleading
- agents make bad decisions
Example mistake:
- tracking latency with a Counter (wrong)
- tracking request counts with a Histogram (wrong)
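A brief sketch of matching each measurement to its instrument type and unit (Python; the duration, active-request, and CPU names follow the conventions above, while the request counter name and all values are illustrative):

```python
from opentelemetry import metrics
from opentelemetry.metrics import CallbackOptions, Observation

meter = metrics.get_meter("checkout-service")

# Counter: strictly increasing request count.
request_count = meter.create_counter("checkout.requests", unit="{request}")

# UpDownCounter: value that rises and falls, e.g. in-flight requests.
active_requests = meter.create_up_down_counter("http.server.active_requests", unit="{request}")

# Histogram: distribution of request durations, in seconds.
request_duration = meter.create_histogram("http.server.request.duration", unit="s")

# Gauge (observable): sampled current value; utilization is a ratio, so the unit is "1".
def cpu_utilization(options: CallbackOptions):
    yield Observation(0.37)

meter.create_observable_gauge("system.cpu.utilization", callbacks=[cpu_utilization], unit="1")

request_count.add(1, attributes={"http.request.method": "GET"})
active_requests.add(1)
request_duration.record(0.042, attributes={"http.request.method": "GET"})
active_requests.add(-1)
```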
Attribute design for low noise, high value metrics
Metrics live or die based on attribute choices. The goal is:
- low noise (bounded dimensions)
- high value (segments that matter for decisions)
A) What makes a “good” metric attribute
A good metric attribute is:
- low-cardinality (bounded values)
- stable over time
- meaningful for breakdowns and SLOs
Examples of high-value low-cardinality metric attributes:
- service.name (resource attribute — don’t duplicate on metric point)
- deployment.environment.name
- http.request.method (GET/POST/etc.)
- route template (e.g., /checkout/{id} — not full URL)
- http.response.status_code (or class: 2xx/4xx/5xx)
- rpc.system
- db.system
- messaging.system
These enable:
- RED metrics (Rate, Errors, Duration)
- SLO slices (“POST /checkout in prod”)
- fast anomaly detection
- meaningful cost/perf tradeoff decisions
B) What not to put on metrics (high cardinality traps)
Avoid dimensions like:
- user.id
- session.id
- request IDs
- full URL (query strings)
- DB query text
- exception stack traces
These belong in traces/logs, not metrics.
C) Resource vs metric attributes: keep metrics lean
A common anti-pattern is repeating identity fields as metric attributes.
Instead:
- Put identity in Resource attributes
- service.name, cloud.region, k8s.cluster.name
- Keep metric attributes for behavioral dimensions
- method, route, status, system, operation type
This keeps metrics queryable without exploding dimensionality.
D) “Agent-ready” metric design pattern
To make metrics agent-friendly:
- Ensure names + units are standard
- Include only bounded attributes
- Add opt-in attributes only when needed
- Keep trace/log enrichment richer than metrics
Then agents can do things like:
- detect “5xx increase in prod for POST /checkout”
- compare error rates across regions
- choose safe remediation actions
- control sampling/collection policies based on metric signals
Metric semantic conventions exist to make metrics:
- portable across vendors
- mathematically consistent
- dashboard-safe
- low-cost and low-noise
- high-signal for SLOs + automation + agents
Log semantic conventions
OpenTelemetry log semantic conventions are the standard attribute names and patterns that make logs searchable, correlatable, and machine-actionable across teams and tools, without forcing everyone to use the same log format.
They help you turn logs from “strings humans read” into structured events agents can reason over, while still preserving the original message.
What Log semantic conventions are
In OpenTelemetry, a log record typically has:
- Timestamp
- Severity (text + number)
- Body (the human-readable message or structured payload)
- Attributes (key-value pairs)
- Trace context (trace/span IDs)
- Resource attributes (service identity like service.name)
Log semantic conventions standardize which attribute keys to use for common fields so different teams don’t invent dozens of incompatible variations.
Correlating logs to traces and spans
The #1 superpower of OTel logs is native correlation.
How log↔trace correlation works
When logs include trace context, every log record can be linked to:
- the trace it belongs to
- the span that was active when the log was written
That enables workflows like:
- from a trace → instantly see all logs for the failing span
- from a log error → jump to the full request trace
What needs to be present
To correlate consistently, log records should include:
- trace_id
- span_id
- trace_flags (optional but helpful)
…and the Resource attributes that identify where the log came from:
- service.name
- service.instance.id
- deployment.environment.name
Best practice: automatically inject trace context into logs via SDK/logging instrumentation so engineers don’t do it manually.
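A minimal sketch of that injection using a standard-library logging filter and the OpenTelemetry trace API (OpenTelemetry's logging instrumentation can do this automatically; the filter just makes visible what gets attached):

```python
import logging
from opentelemetry import trace

class TraceContextFilter(logging.Filter):
    """Attach the active trace/span IDs to every log record."""

    def filter(self, record):
        ctx = trace.get_current_span().get_span_context()
        record.trace_id = format(ctx.trace_id, "032x") if ctx.is_valid else ""
        record.span_id = format(ctx.span_id, "016x") if ctx.is_valid else ""
        return True

handler = logging.StreamHandler()
handler.addFilter(TraceContextFilter())
handler.setFormatter(logging.Formatter(
    "%(asctime)s %(levelname)s trace_id=%(trace_id)s span_id=%(span_id)s %(message)s"
))
logging.getLogger().addHandler(handler)
```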
Why it matters for triage + agents
This unlocks:
- faster root cause analysis (no guessing which request caused the log)
- deterministic correlation (not “string matching” request IDs)
- agents can reconstruct event timelines with high confidence
Preserving original content while adding structure
A common fear is: “If we standardize logs, we’ll lose what developers wrote.”
OTel avoids that by letting you keep raw content while adding structured context.
The pattern: Body + Attributes
- body = original log message (string or structured object)
- attributes = normalized fields for search, correlation, and analytics
So you can preserve:
- the exact original message text
- stack traces / payload snippets (where appropriate)
- developer-friendly phrasing
While still adding structure like:
- service.name
- log.level / severity fields
- http.request.method
- http.response.status_code
- exception.type
Why this is the best of both worlds
- humans still get readable logs
- machines/agents get consistent dimensions
- you can evolve structure without rewriting every log line
Normalization (active telemetry friendly)
In pipelines, you can safely:
- parse JSON where available
- extract fields into canonical semconv attributes
- retain original under something like:
- log.original (or keep it in body)
- redact sensitive content while keeping structured hints
This lets you standardize after the fact.
Exception and feature flag fields for consistent triage
Two areas where conventions dramatically improve triage:
A) Exception fields
Without conventions, exceptions are messy:
- error, err, exception, stack, traceback, msg
OTel semantic conventions standardize exception representation so tools can group errors and power consistent workflows.
Key fields you want consistently:
- exception.type (e.g., NullPointerException)
- exception.message
- exception.stacktrace
Optional but useful:
- exception.escaped (whether exception escaped the scope / crash likelihood)
Why it helps triage
- consistent grouping by exception type
- better error dashboards
- better agent reasoning (“same failure mode across services”)
- easier routing to owning team
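These exception fields are shared with tracing: the span API's record_exception emits the same keys on an "exception" span event, and structured logs can carry them as log attributes. A minimal sketch on the span side:

```python
from opentelemetry import trace

tracer = trace.get_tracer("checkout-service")

with tracer.start_as_current_span("POST /checkout") as span:
    try:
        raise ValueError("card declined")
    except ValueError as exc:
        # Adds an "exception" span event with exception.type, exception.message,
        # and exception.stacktrace populated per the semantic conventions.
        span.record_exception(exc, escaped=False)
        span.set_attribute("http.response.status_code", 402)
```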
B) Feature flag fields
Feature flags are one of the most overlooked causes of “mystery incidents.”
Without conventions, flags show up as:
- random log text
- bespoke keys
- inconsistent naming
OTel includes conventions around feature flags so you can record:
- which flag/provider
- which variant
- the evaluation context (when safe)
Common patterns include:
- flag key/name
- provider name
- variant value (on/off/A/B)
Why this helps
- correlate incidents to deployments and flag rollouts
- identify “only users on variant B are failing”
- enables flag-aware RCA and automated rollback suggestions
Cardinality warning: keep flag attributes bounded (flag name + variant), don’t include user IDs or raw targeting payloads in metric dimensions.
Putting it together: what “good OTel logs” look like
A well-instrumented log record should have:
Resource identity
- service.name
- deployment.environment.name
Correlation
- trace_id, span_id
Severity
- structured level (not just embedded in text)
Body preserved
- original message remains intact
Structured triage attributes
- exception fields when relevant
- http/db/messaging context when relevant
- feature flag name + variant when applicable
This enables:
- fast human triage
- reliable dashboards
- agent-ready context
- lower cost (less brute-force indexing of unstructured text)
Event semantic conventions
OpenTelemetry event semantic conventions are the patterns for representing discrete occurrences inside spans (and sometimes logs) in a consistent way—so they’re searchable, comparable, and usable for automation.
In tracing, an event is a timestamped annotation attached to a span (e.g., “exception thrown”, “message received”, “tool invoked”), with its own name and attributes.
If spans are the “movie,” events are the key frames.
What an event is in OpenTelemetry
A span event includes:
- name (string)
- timestamp
- attributes (structured context)
Common examples:
- exceptions
- retries
- cache invalidations
- feature flag evaluations
- AI tool calls / guardrail decisions
- state transitions inside an operation
Events matter in the AI / agent-ready era because they capture the decision trail inside requests - what changed, what was evaluated, what tool was called, what failed - without exploding span count.
When to use events vs attributes vs bodies
This is the most important design choice.
Use attributes when…
You’re describing stable context about the span/log record:
- things you want available for filtering/aggregation
- values that don’t occur multiple times in the span
- core dimensions that define “what this operation is”
Examples:
- http.request.method
- db.system
- messaging.destination.name
- ai.model
- feature_flag.key (if it’s stable and single)
Rule of thumb:
Attributes = “the tags of this operation.”
Use events when…
You need to record one or more timestamped occurrences during the operation:
- the value can happen multiple times
- order matters
- you want an audit trail of internal steps
- you want to capture why something happened
Examples:
- retry attempt #2
- tool call started / completed
- circuit breaker opened
- cache miss
- guardrail blocked output
- token budget exceeded
- feature flag evaluated → variant chosen
Rule of thumb:
Events = “the timeline of what happened inside the span.”
Use body (log body / event body patterns) when…
You need to preserve raw detail, often human-readable:
- unstructured message
- blob payload (capped)
- a textual stack trace
- model response excerpt (redacted)
Rule of thumb:
Body = “the original record.”
Best practice in agent-ready telemetry
- keep raw content (body) for debugging/forensics
- but extract standardized fields into attributes/events so automation can work
A simple decision matrix
- Stable, single-valued context you filter or aggregate on → attribute
- Timestamped occurrence inside the operation (can repeat, order matters) → event
- Raw, human-readable or bulky detail → body (capped, optionally redacted)
Event naming that supports search and automation
Event naming is often overlooked, but it determines whether events become useful or just noise.
Good event names are:
- stable
- low-cardinality
- verb/action oriented
- domain scoped
- not dynamically generated
Good:
- exception
- retry
- cache.miss
- circuit_breaker.open
- feature_flag.evaluation
- tool.call
- guardrail.blocked
Bad:
- failed to fetch customer 12712
- tool call to getWeather()
- LLM said: ...
Why stable naming matters:
- search works (“show all tool.call events”)
- automation works (“if guardrail.blocked occurs, mark span as risky”)
- agents learn patterns consistently
Pattern suggestion
Use a dot-namespaced format:
<domain>.<action>[.<result>]
Examples:
- ai.tool.call
- ai.tool.result
- ai.guardrail.blocked
- messaging.redelivery
- db.query.retry
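A small sketch of emitting events with those stable names and bounded attributes (Python; the ai.* event names follow the pattern above and are illustrative):

```python
from opentelemetry import trace

tracer = trace.get_tracer("support-assistant")

with tracer.start_as_current_span("handle_support_request") as span:
    # Stable, dot-namespaced event names; details go into bounded attributes.
    span.add_event("ai.tool.call", attributes={"tool.name": "lookup_customer"})
    span.add_event(
        "ai.tool.result",
        attributes={"tool.name": "lookup_customer", "tool.status": "error",
                    "retry.reason": "timeout"},
    )
    span.add_event("ai.guardrail.blocked", attributes={"guardrail.action": "redact"})
```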
Designing event payloads for future standardization
You want event payloads that are:
- useful today
- compatible tomorrow
- easy to map to OpenTelemetry semconv as it evolves
Principle A: keep payloads structured and small
Use attributes, not giant blobs:
- bounded strings
- booleans
- numeric counters/latency
✅ Better:
- retry.count=2
- retry.reason=timeout
- tool.name=lookup_customer
- tool.status=error
- ai.tokens.input=123
- ai.tokens.output=456
Avoid:
- full prompt text
- full tool payloads
- full model responses (unless redacted + capped)
If you must store raw content:
- put it in log body / span attribute with size caps
- or store externally and link via an ID
Principle B: separate identity from details
Think “header vs payload.”
Event identity (stable keys):
- event.name
- event.domain
- event.outcome (success / failure)
- event.severity
Event details (domain attributes):
- tool.name
- http.response.status_code
- exception.type
- feature_flag.key, feature_flag.variant
This makes it easier to standardize later because the “shape” is predictable.
Principle C: version your custom event payloads
If you create custom event conventions (common in AI workloads), add:
- event.schema.version = "1.0"
Why:
- your pipelines can translate versions
- agents can interpret payloads reliably
- you can migrate safely without breaking queries
Principle D: design for mapping to future OTel semconv
If official conventions might arrive later (AI is a great example), design your custom fields in a way that’s easy to translate:
Use OTel-like naming patterns
- dot notation (ai.*, tool.*, guardrail.*)
- avoid camelCase drift
- be consistent across languages
Be explicit about meaning. Don’t use vague keys like:
- status
- result
- value
Prefer:
- tool.status
- tool.result.type
- guardrail.action
Principle E: prevent cardinality explosions
Events can quietly create cost explosions.
Avoid attributes like:
- user IDs
- request IDs
- full URLs
- arbitrary payloads
- free-form error strings as grouping keys
Instead:
- store stable categories (timeout, rate_limited, validation_failed)
- keep IDs only in spans/log body if needed for debugging
Putting it together: best practice pattern
For agent-ready active telemetry, a clean approach is:
- Use attributes for stable operation context
- Use events for internal steps and decisions
- Preserve raw detail in body (and optionally link to external storage)
- Keep event names stable + payload structured
- Version custom event payloads for migration
This yields events that work for:
- search
- correlation
- automation triggers
- RCA timelines
- future standardization
Enforcing semantic conventions with a telemetry pipeline
Enforcing semantic conventions with a telemetry pipeline is how you turn “best-effort instrumentation” into a reliable, organization-wide telemetry contract.
Instead of hoping every team and SDK emits perfect OpenTelemetry semantic conventions, you enforce them centrally - at ingest - so everything downstream (dashboards, alerts, RCA workflows, agents) sees consistent, agent-ready context.
Why a telemetry pipeline is the right enforcement point
Instrumentation is messy:
- multiple languages + SDK versions
- homegrown logging styles
- third-party libraries with inconsistent keys
- legacy naming (appName, env, requestId)
- partially adopted OTel semconv versions
A pipeline gives you a single control plane to:
- normalize names and types
- enrich with consistent resource context
- reduce cardinality and noise
- route the right data to the right destinations
Normalization at ingest to reduce downstream rework
Normalization at ingest means you fix it once and every consumer benefits.
What normalization does
At the pipeline boundary, you standardize:
- attribute names
- value formats
- units
- field location (resource vs span vs log attributes)
- severity levels
- timestamps
- IDs for correlation
Examples of normalization rules
Common attribute mapping
- service / svc / app → service.name
- env / environment → deployment.environment.name
- cluster → k8s.cluster.name
- region → cloud.region
Type normalization
- "200" → 200 for http.response.status_code
- "true" → true for boolean fields
- duration ms → duration s (metrics)
Casing + allowed values
- Prod, production → prod
- Us-East-1, use1 → us-east-1
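As a sketch, the mapping rules above can be expressed as a single normalization step (Python here for illustration; in practice this usually lives in a collector or pipeline processor, and the mapping tables are hypothetical subsets):

```python
KEY_MAP = {"service": "service.name", "svc": "service.name", "app": "service.name",
           "env": "deployment.environment.name", "environment": "deployment.environment.name",
           "cluster": "k8s.cluster.name", "region": "cloud.region"}
ENV_VALUES = {"prod": "prod", "production": "prod", "staging": "staging", "stage": "staging"}

def normalize(attributes: dict) -> dict:
    out = {}
    for key, value in attributes.items():
        key = KEY_MAP.get(key.lower(), key)  # canonical attribute names
        if key == "deployment.environment.name":
            # Casing + allowed values: "Production" -> "prod"
            value = ENV_VALUES.get(str(value).lower(), str(value).lower())
        if key == "http.response.status_code":
            value = int(value)  # type normalization: "200" -> 200
        out[key] = value
    return out

print(normalize({"svc": "checkout", "Environment": "Production",
                 "http.response.status_code": "500"}))
```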
Why this matters
If you don’t normalize early, every downstream layer ends up re-solving the same problem:
- every dashboard contains OR clauses
- every alert rule duplicates mapping logic
- AI systems hallucinate mappings
- correlation breaks across teams
So normalization is basically “Create one shared language at the edge.”
Transforming legacy attributes into the current schema
This is where pipelines shine: you can run schema translation without rewriting every producer immediately.
The real-world problem
Telemetry in flight will include:
- legacy fields (requestId, hostname, appVersion)
- deprecated semantic conventions
- experimental fields
- pre-stabilization names (common in HTTP semconv evolution)
Migration strategy (safe + practical)
Use a two-phase strategy:
Phase 1 — Translate + duplicate
- map legacy → canonical
- keep the original temporarily
Example:
- keep env
- add deployment.environment.name
Or:
- keep http.url (legacy)
- add url.full (current)
This protects existing content while enabling new standards.
Phase 2 — Cutover + remove
After dashboards/alerts/agents adopt the canonical fields:
- stop emitting / forwarding legacy fields
- reduce storage + indexing waste
Where to apply transformations
You can enforce “schema alignment” in multiple places:
A) Logs
- parse JSON logs into attributes
- extract trace context
- map legacy keys
- standardize exception fields
B) Spans
- normalize span name patterns
- set/repair missing span.kind
- map HTTP/db/messaging attributes to canonical names
C) Metrics
- fix units (ms → s)
- rename metric series to semconv names
- drop or cap high-cardinality dimensions
Why this matters for “agent-ready” telemetry
Agents depend on stable keys. Pipelines let you guarantee:
- service.name always exists
- deployment.environment.name always exists
- HTTP spans always have method/status/route
- exceptions always have type/message/stacktrace
- AI workload spans always include model/provider/tokens
Without a translation layer, your AI systems end up brittle and tool-specific.
Routing clean telemetry to your observability stack and AI systems
Once telemetry is normalized, you can route by policy.
This is the second big advantage of pipelines: semantic conventions make routing rules reliable.
Routing patterns enabled by clean semantics
A) Route by signal type and value
Examples:
- send all error traces (status=ERROR) to premium storage
- send info logs to cheap storage
- send security logs to SIEM
- keep full-fidelity traces for payment/checkout
Because your fields are standardized, routing logic is simple and durable:
- based on service.name
- based on deployment.environment.name
- based on http.response.status_code
- based on exception.type
- based on ai.operation / tool events
B) Route by environment / tenant / compliance
Examples:
- prod logs → retention 30 days
- dev logs → retention 3 days
- restricted data → redaction + limited destinations
Clean resource attributes make this easy:
- deployment.environment.name
- cloud.account.id
- service.namespace
C) Route “agent-ready context” to AI systems
You typically don’t want to send all telemetry into AI/RAG systems.
Instead, create an agent-ready stream:
- low-noise
- normalized
- enriched with stable identity
- minimal sensitive payload content
- event-driven (errors, regressions, anomalies)
For example:
- error traces + key logs + deployment events → incident copilot
- slow spans + top attributes → performance agent
- tool-call spans + guardrail events → AI agent debugging
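A sketch of such policy routing keyed on the canonical fields (Python; the destination names and thresholds are illustrative):

```python
def route(record: dict) -> list[str]:
    """Decide destinations for one normalized telemetry record."""
    res, attrs = record.get("resource", {}), record.get("attributes", {})
    destinations = []

    # Observability stream: errors and payment traffic keep full fidelity.
    if attrs.get("http.response.status_code", 0) >= 500 \
            or res.get("service.name") in {"payments", "checkout"}:
        destinations.append("hot-storage")
    else:
        destinations.append("cheap-storage")

    # Agent-ready context stream: curated, low-noise events only.
    if record.get("status") == "ERROR" and res.get("deployment.environment.name") == "prod":
        destinations.append("incident-copilot")

    return destinations
```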
The key idea: two products from one stream
A modern pipeline produces:
- Observability streams (high fidelity, queryable, retained)
- AI context streams (curated, governed, cost-controlled)
Semantic conventions make those streams consistent and interoperable.
Enforcing semantic conventions with a telemetry pipeline gives you:
- Normalization at ingest → one shared language, less rework downstream
- Schema translation → modern semconv without rewriting everything
- Policy routing → clean telemetry to the right observability + AI systems
In other words: semantic conventions become an enforceable contract, not a suggestion.
Measuring impact with semantic telemetry
Measuring impact with semantic telemetry means going beyond “we collect signals” to proving that your telemetry is consistent enough to drive outcomes—especially in the AI era, where telemetry becomes agent-ready context.
When telemetry is semantic (standardized names + meanings + consistent structure), you can:
- classify and learn from interactions reliably
- link telemetry quality to business + operational outcomes
- quantify readiness for AI-assisted RCA and automation
Classifying intent, topic, and cognitive complexity from interactions
Semantic telemetry enables reliable classification because events share consistent fields across services and channels. This applies to:
- customer support interactions
- product workflows
- AI assistant sessions
- internal SRE/DevOps workflows
A) Classifying intent
Intent = what the user/agent is trying to achieve.
Examples:
- purchase_attempt
- login_recovery
- change_plan
- refund_request
- incident_triage
- deploy_service
- tool_call:lookup_customer
How semantic telemetry helps
If you standardize attributes like:
- service.name
- event.name
- http.route
- ai.operation
- feature_flag.*
…then intent detection becomes deterministic:
- “all sessions that hit /checkout + payment calls” → purchase intent
- “tool.call events involving billing system” → billing intent
- “spans involving auth reset endpoints” → account recovery intent
B) Classifying topic
Topic = what domain the interaction concerns.
Examples:
- billing
- identity/auth
- shipping
- performance
- fraud
- recommendations
- AI safety/guardrails
How to do it
Build topic inference from stable keys:
- service namespaces (service.namespace=billing)
- route groupings (http.route=/invoice/*)
- DB namespaces / messaging destinations
- log events (event.name=feature_flag.evaluation)
- AI tool chain metadata
Topic classification works best when you avoid free-form text reliance and use structured event keys.
C) Classifying cognitive complexity
Cognitive complexity = how hard the interaction is to complete.
This is extremely valuable for product and AI ops.
A practical model based on telemetry:
- # of steps (span count, workflow stages)
- tool-use depth (retrieval calls, external APIs, retries)
- rework loops (repeated actions, repeated errors)
- handoffs (service boundaries crossed)
- time-to-complete
- error friction (# of 4xx/5xx, validation failures)
- policy friction (guardrail blocks, MFA steps)
You can compute a per-session index like:
Complexity Index = normalized(steps + retries + hops + time + errors)
Semantic telemetry makes this comparable
Without standardized span names, kinds, and attributes, step counts and hop counts become meaningless across teams.
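A toy sketch of computing such a per-session index; the ceilings used for normalization are illustrative and should be calibrated to your own workloads:

```python
def complexity_index(steps, retries, hops, seconds, errors):
    """0..1 score: higher means a harder interaction to complete."""
    # Normalize each ingredient against an illustrative "painful" ceiling, then average.
    parts = [
        min(steps / 50, 1.0),
        min(retries / 5, 1.0),
        min(hops / 10, 1.0),
        min(seconds / 600, 1.0),
        min(errors / 5, 1.0),
    ]
    return sum(parts) / len(parts)

print(complexity_index(steps=18, retries=2, hops=4, seconds=240, errors=1))
```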
Linking telemetry quality to engagement, retention, and MTTR
This is the “prove it” section: show that better semantic telemetry correlates with better outcomes.
A) Telemetry quality → engagement & retention
For customer/product experiences, semantic telemetry improves:
- funnel accuracy (where users drop)
- segmentation (which cohorts struggle)
- feature adoption measurement
- experiment/feature-flag clarity
Example linkage
- If feature_flag.key + variant are consistent, you can attribute retention changes to rollout variants confidently.
- If checkout spans are comparable (POST /checkout standardized), you can see friction patterns.
Telemetry quality improves decision quality, which improves product iterations, which improves engagement.
B) Telemetry quality → MTTR
Operationally, semantic telemetry reduces MTTR through:
- faster correlation (log↔trace↔metric)
- less manual translation (“what does svc mean here?”)
- fewer false leads (consistent service/resource identity)
- quicker root cause narrative (agents + humans)
You can model this linkage explicitly:
- Higher correlation coverage → faster triage
- Lower field ambiguity → fewer query retries
- Higher trace completeness → fewer “unknown unknowns”
- Consistent ownership tags → faster routing to the right team
C) The key KPI bridge: “time-to-truth”
To connect telemetry quality to outcomes, measure:
- Time to first correlated view
- “how long until responder sees trace+logs+metrics aligned”
- Query iterations per incident
- fewer = better semantic consistency
- % incidents with complete context
- includes service name, env, deployment version, error type, route
These correlate strongly with MTTR and responder efficiency.
A simple scorecard for telemetry readiness
Here’s a lightweight, executive-friendly Semantic Telemetry Readiness Scorecard you can run monthly/quarterly.
Semantic Telemetry Readiness Scorecard (0–100)
A) Identity & Resource Quality (0–20)
- 100% signals include service.name (5)
- deployment.environment.name standardized (5)
- cloud/cluster identity standardized (cloud.region, k8s.cluster.name) (5)
- SDK metadata present (telemetry.sdk.*) (5)
B) Trace Semantic Coverage (0–25)
- ≥90% inbound spans have correct span.kind (5)
- HTTP spans include method + route template + status code (10)
- DB spans include db.system + operation name (5)
- Messaging spans include system + destination + producer/consumer kinds (5)
C) Log Correlation & Triage Structure (0–25)
- ≥80% error logs include trace_id + span_id (10)
- Exceptions use consistent fields (exception.type/message/stacktrace) (10)
- Feature flag evaluation is captured consistently (key + variant) (5)
D) Metrics Consistency (0–15)
- Key metrics use semantic names + correct units (10)
- Metric attributes bounded and low-cardinality (5)
E) Pipeline Enforcement (0–15)
- Normalization mapping is enforced centrally (5)
- Legacy → canonical transformation active (5)
- Routing policies use semantic keys (5)
Interpretation
- 80–100: agent-ready foundation (safe for automation pilots)
- 60–79: usable but expect drift (needs normalization hardening)
- <60: high effort / low trust (agents will struggle; humans will suffer)
To show business impact, pair readiness with outcome metrics:
Product / engagement metrics
- completion rate by intent
- time-to-complete by topic
- drop-offs linked to error/latency spans
- variant-level retention (feature flags)
Ops metrics
- MTTR / MTTD
- time-to-first-correlated-view
- incidents with full context bundle (%)
- “manual correlation required” (% incidents)
Then you can tell a clean story:
As semantic telemetry readiness increases, MTTR decreases and engagement improves because decisions become faster and more correct.
