What Is OpenTelemetry?

Learning Objectives

• Define OpenTelemetry and explain its core components.

• Define telemetry and list the “three pillars of observability.”

• Explain the benefits of OpenTelemetry to IT and DevOps personnel.

Developers, DevOps engineers, and site reliability engineers (SREs) are responsible for the health and performance of the applications they develop or oversee, as well as the infrastructure those applications run on. That’s a challenge in today’s cloud-based, distributed environments, where capturing and exporting telemetry data involves multiple, disparate tools that integrate poorly, if they integrate at all. The result is data silos and the poor visibility that comes with them.

Enter OpenTelemetry, an open-source, vendor-agnostic set of APIs, software development kits (SDKs), and other tools for collecting and exporting telemetry data from cloud-native applications and infrastructure. It greatly simplifies that process for IT and DevOps personnel while creating a single, unified standard for service instrumentation.

Understanding Observability, Instrumentation & Telemetry Data

Understanding what OpenTelemetry does and how it benefits organizations requires understanding telemetry data, observability, and instrumentation.

Instrumentation is the code added to an application so that it can be monitored: it makes it possible to measure performance, detect errors, and obtain trace information representing the application’s state. Observability is the ability to understand a system’s internal state from its outputs. Both depend on telemetry data, which is simply the output of automatically recording and transmitting data about a running system.

IT and DevOps teams are most concerned with three primary classes of telemetry data, often called “the three pillars of observability”:

  1. Logs. A log is a text record of an event that happened at a particular time. When code executes, the system produces a log entry recording what happened and when. Log data can be unstructured plain text or structured. Most logs are plain text, but structured logs are becoming more popular. They include additional metadata and are more easily queried than plain text, a significant benefit when troubleshooting applications.
  2. Metrics. In contrast with log data, metrics are numeric values that measure events over periods of time. Metrics are structured by default. In addition to a numeric value, a metric records attributes such as a timestamp and other information relevant to the measured event.
  3. Traces. Traces map the end-to-end journeys of requests, such as API calls, as they move through distributed systems. As a request moves through a system, many operations are performed on it. Each operation is recorded with its own data as a span. A typical span includes unique identifiers, an operation name, events, and other relevant information. Traces give IT and DevOps personnel better insight into how systems connect and provide context for log and metrics data.
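
To make the three data classes more concrete, here is a minimal Python sketch of what a structured log, a metric, and a two-span trace might look like as data. It uses only the standard library and is not the OpenTelemetry data model or API. The class names (LogRecord, MetricPoint, Span), the field names, and the sample values are all assumptions chosen for illustration.

```python
import json
import time
import uuid
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class LogRecord:
    """A structured log: a timestamped event plus queryable metadata."""
    timestamp: float
    severity: str
    message: str
    attributes: dict = field(default_factory=dict)


@dataclass
class MetricPoint:
    """A numeric measurement with a timestamp and descriptive attributes."""
    name: str
    value: float
    timestamp: float
    attributes: dict = field(default_factory=dict)


@dataclass
class Span:
    """One operation within a trace; all spans in a trace share a trace_id."""
    name: str
    trace_id: str
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16])
    parent_span_id: Optional[str] = None
    events: list = field(default_factory=list)


now = time.time()
trace_id = uuid.uuid4().hex

# A request enters the system (root span) and triggers a database call (child span).
root = Span(name="GET /checkout", trace_id=trace_id)
child = Span(name="SELECT orders", trace_id=trace_id, parent_span_id=root.span_id)

# The log carries the trace and span IDs, so it can be tied back to the request.
log = LogRecord(now, "ERROR", "payment declined",
                {"trace_id": trace_id, "span_id": child.span_id})

# The metric is a number plus attributes describing what was measured.
metric = MetricPoint("checkout.duration_ms", 182.0, now, {"route": "/checkout"})

# Structured records serialize cleanly and can be filtered by field.
print(json.dumps(asdict(log)))
print(json.dumps(asdict(metric)))
print(json.dumps([asdict(root), asdict(child)]))
```

Because the log record in this sketch carries the same trace_id as the spans, a query for that ID would return the error message together with the operations that led to it, which is the kind of context the traces item describes.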

OpenTelemetry Under the Hood

OpenTelemetry consists of several components:

  • The core of OpenTelemetry is its API set. These language-specific APIs (Java, .NET, Python, and more) are used to instrument code for data collection.
  • Language-specific SDKs provide a bridge between APIs and exporters and enable additional configuration, such as request filtering and transaction sampling.
  • Exporters decouple instrumentation from the backend configuration, which enables telemetry data transmission to any backend solution. This decoupling means that organizations can switch backends without having to re-instrument their code.
  • The Collector is an optional but helpful component that provides a vendor-agnostic way to receive, process, and send telemetry data. It can be deployed either as a standalone process completely separate from the observed application or as an agent residing on the same host as the application.
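
The separation between instrumentation and exporters described above can be sketched with a toy model. The following Python code is not the OpenTelemetry SDK. Tracer, ConsoleExporter, InMemoryExporter, and handle_request are hypothetical names invented for this illustration, and the real libraries are considerably richer. The point is only that the application code never refers to a backend.

```python
from typing import Dict, List, Protocol


class Exporter(Protocol):
    """Anything that can ship finished spans to a backend."""

    def export(self, spans: List[Dict]) -> None: ...


class ConsoleExporter:
    """Hypothetical backend A: prints spans to stdout."""

    def export(self, spans: List[Dict]) -> None:
        for span in spans:
            print("console:", span)


class InMemoryExporter:
    """Hypothetical backend B: keeps spans in a list."""

    def __init__(self) -> None:
        self.received: List[Dict] = []

    def export(self, spans: List[Dict]) -> None:
        self.received.extend(spans)


class Tracer:
    """Stand-in for the API and SDK: records a span, then hands it to exporters."""

    def __init__(self, exporters: List[Exporter]) -> None:
        self._exporters = exporters

    def record(self, name: str, **attributes: object) -> None:
        span = {"name": name, "attributes": attributes}
        for exporter in self._exporters:
            exporter.export([span])


def handle_request(tracer: Tracer) -> None:
    """Application code: it knows only the tracer, never the backend."""
    tracer.record("GET /checkout", status=200)


# The same instrumented function runs against different backend configurations.
handle_request(Tracer([ConsoleExporter()]))

memory = InMemoryExporter()
handle_request(Tracer([ConsoleExporter(), memory]))  # two backends at once
print(memory.received)
```

In this sketch, switching or adding a backend changes only the list passed to Tracer, while handle_request stays untouched. That mirrors, in a very simplified form, why exporters let organizations change backends without re-instrumenting their code.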

Organizations can choose to export data simultaneously to multiple observability platforms. OpenTelemetry can even feed data into an AI engine for analysis, subsequently automating the observability process and getting actionable insights to decision-makers faster.

Why Use OpenTelemetry?

Service instrumentation is not a new concept, but a shared standard for it is. While tools to collect telemetry data exist, data formats vary by software provider, leaving organizations locked in to a vendor while still lacking complete visibility into their applications.

By providing an open standard for adding observable instrumentation to cloud-native applications, OpenTelemetry eliminates vendor lock-in and dramatically simplifies the instrumentation process while providing complete visibility into application telemetry. SREs, developers, and DevOps teams can spend their time developing new products and enhancing existing applications instead of configuring service instrumentation. This open standard also encourages contributions from the community, accelerating innovation and adding new capabilities that enhance the value of the OpenTelemetry standard.

OpenTelemetry is a Cloud Native Computing Foundation (CNCF) incubating project. It’s among the most active CNCF projects, and many OpenTelemetry enthusiasts compare it to another well-known CNCF project: Kubernetes. Just as Kubernetes standardizes and simplifies container orchestration, OpenTelemetry aims to standardize and streamline application instrumentation.
