What is Structured Logging?
- Understand why you need structured logging
- Learn examples of data necessary for log analysis
- Explore common log structures
As humans, we understand sentences and words based on their grammar and spelling. Machines likewise need formatted input to read and understand instructions; errors in the way input is structured can cause bugs or unpredictable behavior, just as humans can't always parse an incorrectly formatted sentence. Structured logging produces formatted messages (e.g., application errors or server messages) that analytics applications can read and parse to provide visualized output to the reader. This reader could be a server administrator, security analyst, engineer, or any other individual responsible for making decisions based on the parsed data.
As an example, the following image is a structured log record:
Why use structured logging?
Without structured logging, organizations cannot use standard analytics tools to view their data. They would need to build their own solution or be limited to applications that can read non-standard structures, and building a custom solution is costly and can fail. Formatted data in log files gives organizations the freedom to choose any security information and event management (SIEM) and/or log aggregation tool that supports the stored structure. Many of these tools support standard structures out of the box, so no additional work is needed to ingest the logs.
There are three major issues with unstructured event logs:
- Non-standard formatting requires a customized solution to read and parse the logged events.
- Should administrators need to read the raw data files, the files can be difficult to read if the structure is unknown.
- Should any other applications need to consume the logged data, they would not work without additional support.
Structured logs also give developers a way to write customized applications using open-source third-party libraries that already support a standard structure. This saves on development time without building a tool from scratch to consume data.
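As a sketch of that idea, even Python's standard `logging` module can emit structured JSON records with a small custom formatter. The field names below (`EventTime`, `Level`, `Logger`, `Message`) are illustrative, not a required schema:

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record):
        # Field names here are example choices, not a mandated standard.
        event = {
            "EventTime": self.formatTime(record),
            "Level": record.levelname,
            "Logger": record.name,
            "Message": record.getMessage(),
        }
        return json.dumps(event)


# Attach the formatter to a handler as usual.
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("payments")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.error("credit card authorization failed")
```

Because each line is a complete JSON object, any downstream tool that understands JSON can consume the output without custom parsing code.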
What data is included in structured logs?
Logs are only as useful as the information they contain. The information included in an event is used to create dashboards, graphs, charts, and algorithmic analyses that help determine the health of the environment. Structured logs also make searching for specific events more efficient. With parsing applications such as log analysis tools, structured logs can have any number of data points in a single event.
Some examples of information that can be included in an event:
- The date and time the event happened
- The type of action triggered (e.g., informational, critical, warning, error, etc.)
- The location of the triggered event (e.g., an API endpoint or running application)
- A description of the event (e.g., a credit card failure could be logged to detect potential fraud)
- A unique event ID
- The customer ID or username
- Protocol used to access the application
- The port used to execute a function
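The fields above might be assembled into a single JSON event along these lines; all names and values below are illustrative placeholders, not a required schema:

```python
import json
from datetime import datetime, timezone

# Illustrative event record using the kinds of fields listed above;
# the field names and values are example placeholders.
event = {
    "EventTime": datetime(2021, 2, 20, 11, 20, 30, tzinfo=timezone.utc).isoformat(),
    "EventType": "error",
    "Source": "/api/v1/login",     # endpoint that triggered the event (hypothetical)
    "Description": "failed authentication attempt",
    "EventID": "a1b2c3d4",         # unique event ID (placeholder)
    "CustomerID": "12345",         # placeholder customer identifier
    "Protocol": "HTTP",
    "Port": 80,
}

# One JSON object per log line keeps each event independently parseable.
line = json.dumps(event)
```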
Logged events should contain enough information that any critical errors can be investigated and remediated, but not all information should be stored in logs. For example, never store passwords, secrets, keys, or any other sensitive information in logs. Logs should be locked down so that only authorized users can access them, but in a cybersecurity incident where an attacker escalates privileges, logs could be used in a data breach, which is why sensitive data should never be stored in them.
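One safeguard against leaking secrets is a redaction pass that masks known sensitive keys before an event is written. This is a minimal sketch; the key list below is an example, not exhaustive:

```python
# Example list of field names considered sensitive; real deployments
# would maintain a much larger, policy-driven list.
SENSITIVE_KEYS = {"password", "secret", "api_key", "token"}


def redact(event):
    """Return a copy of the event with sensitive fields masked."""
    return {
        key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else value
        for key, value in event.items()
    }


safe = redact({"username": "alice", "password": "hunter2"})
# safe == {"username": "alice", "password": "[REDACTED]"}
```

Redacting at write time, rather than relying on access controls alone, limits the blast radius if the log store is ever compromised.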
What are examples of structured log events?
The formatted information in structured logs is typically JSON. JSON is a standard format across operating systems and environments, so it's easy to share data across multiple platforms (e.g., Linux and Windows). Older structures might use XML formatting. Most logging applications, whether in the cloud or on-premises, use standard structures so that administrators can aggregate logs in a single location and use an application such as a SIEM to parse and read the data. For example, an organization could have five servers collecting log data. These five servers log events that are then aggregated into one location, where a SIEM consumes the events and displays the information in a visualized format for analysts to review.
Most SIEMs and other applications will automatically parse standard structures, but it is still beneficial for organizations and developers to know the structure should they decide to create their own applications that will consume the data.
The following is an example of a JSON log entry:
{
  "EventTime": "2021-2-20 11:20:30",
  "EventType": "error",
  "Description": "failed authentication attempt",
  "Reason": "incorrect password",
  "CustomerID": "12345",
  "Protocol": "HTTP",
  "Port": 80
}
From the above JSON entry, you could infer a failed authentication attempt (for example, an HTTP 401 response) from a specific customer accessing a web application over HTTP on port 80. From this entry alone, you cannot tell whether it was a malicious authentication request or a legitimate user who simply mistyped their password. However, numerous events showing the same information could indicate malicious intent and warrant review by a security analyst.
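Detecting repeated failures like this can be sketched in a few lines of Python. The raw log lines and the three-failure threshold below are hypothetical:

```python
import json
from collections import Counter

# Hypothetical raw log lines in a JSON structure like the example above.
raw_lines = [
    '{"EventTime": "2021-2-20 11:20:30", "Reason": "incorrect password", "CustomerID": "12345"}',
    '{"EventTime": "2021-2-20 11:20:41", "Reason": "incorrect password", "CustomerID": "12345"}',
    '{"EventTime": "2021-2-20 11:20:55", "Reason": "incorrect password", "CustomerID": "12345"}',
]

# Count failed attempts per customer.
failures = Counter(json.loads(line)["CustomerID"] for line in raw_lines)

# Flag customers with repeated failures for an analyst to review
# (the threshold of 3 is an arbitrary example).
suspicious = [cust for cust, count in failures.items() if count >= 3]
```

This kind of aggregation is exactly what a SIEM automates at scale; the example only works because every line shares the same parseable structure.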
The above format is JSON, but here is the same event in XML:
<Event>
  <TimeCreated SystemTime="2021-2-20 11:20:30" />
  <EventData>
    <Data Name="EventType">error</Data>
    <Data Name="Reason">incorrect password</Data>
    <Data Name="CustomerID">12345</Data>
    <Data Name="Protocol">HTTP</Data>
    <Data Name="Port">80</Data>
  </EventData>
</Event>
As you can see above, the XML log entry contains the same information, but the structure and syntax are different. Data that an application cannot parse is useless to it.
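With Python's standard `xml.etree.ElementTree` module, an XML event can be parsed back into fields. The element and attribute names below follow the example above and are assumptions, not a fixed standard:

```python
import xml.etree.ElementTree as ET

# An XML event in the structure shown above (element names assumed
# from the example, not from any fixed schema).
xml_event = """
<Event>
  <TimeCreated SystemTime="2021-2-20 11:20:30" />
  <EventData>
    <Data Name="Reason">incorrect password</Data>
  </EventData>
</Event>
"""

root = ET.fromstring(xml_event)
time_created = root.find("TimeCreated").get("SystemTime")
reason = root.find("./EventData/Data[@Name='Reason']").text
```

Note how the consuming code is tied to the structure: reading the same two fields from the JSON example requires entirely different code, which is why a SIEM must support each format it ingests.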
Any logging solution you choose should support the structure you need. Before choosing a SIEM, ensure that your applications write logs in a format the tool supports and that it can parse the data for efficient analytics and reporting.