Logging Best Practices, Part 5: Structured Logging
8.6.20
Isn't all logging pretty much the same? Logs appear by default, like magic, without any further intervention by teams other than simply starting a system… right?
While logging may seem like simple magic, there's a lot to consider. Logs don't just automatically appear for all levels of your architecture, and any logs that do automatically appear probably don't have all of the details that you need to successfully understand what a system is doing. We've talked about actionable logs, log levels, logs for all components and needs, different methodologies for generating logs, and best practices specifically for text-based logging. Now, let's examine best practices for structured logging.
Structured logs, as noted in a prior article in this series, are intended to make filtering and automation by machines easier. Perhaps you need to be alerted when your disk usage goes over a certain threshold. Maybe you need to find the log line for an error that occurred during the first few moments of a production outage. No matter your need, there are some best practices to keep in mind when generating structured logs with your systems.
Use JSON
While there are a lot of options for building a structured log, JSON is by far the most widely used. As a result, most logging and automation tools accept the JSON format. Use JSON when you can: you will deal with fewer headaches when integrating with automation systems, including centralized logging providers, metrics and monitoring tools, and alerting systems.
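As a rough illustration of why JSON suits automation, the sketch below parses log lines written as one JSON object per line and picks out the records whose disk usage exceeds a threshold. The field names (`host`, `disk_used_pct`), the sample values, and the threshold of 90 are assumptions made for this example, not a standard schema.

```python
import json

# Hypothetical log lines, one JSON object per line; field names are illustrative.
raw_lines = [
    json.dumps({"timestamp": "2020-08-06T12:00:00Z", "level": "INFO",
                "host": "web-1", "disk_used_pct": 71.5}),
    json.dumps({"timestamp": "2020-08-06T12:05:00Z", "level": "WARNING",
                "host": "web-2", "disk_used_pct": 93.2}),
]


def over_threshold(lines, field, threshold):
    """Return parsed records whose numeric `field` exceeds `threshold`."""
    matches = []
    for line in lines:
        record = json.loads(line)
        value = record.get(field)
        # Skip records where the field is missing or not numeric.
        if isinstance(value, (int, float)) and value > threshold:
            matches.append(record)
    return matches


alerts = over_threshold(raw_lines, "disk_used_pct", 90)
for record in alerts:
    print(record["host"], record["disk_used_pct"])
```

Because every line parses into the same kind of object, the filter needs no regular expressions or knowledge of the message wording, which is the main practical benefit of a structured format.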
Use a Logging Library
Most programming languages offer a logging library that can generate structured logs. As a best practice, use that library rather than trying to build a custom solution. A good library will typically supply a timestamp and the relevant fields from your specific language and application framework, and it helps you log only the data that you need.
Some libraries don't have a specific structured logging system set up. That's OK. In general, the more official or more popular libraries will at least have a cookbook or other documentation that explains how to generate structured logs from the library's built-in classes, methods, or other structures. As an example, Python's standard logging library has a cookbook, which includes a section on building a structured logging class. There are also multiple Python libraries that implement structured logging formats, including some that can be significantly faster than the standard library, as we explored on our Engineering blog when examining serverless logging performance. Regardless of which language you use, you'll almost certainly find a logging library that uses the internals of the language to generate structured logs efficiently.
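To make the standard-library route concrete, here is a minimal sketch that uses Python's built-in `logging` module with a custom formatter to emit JSON. It is one possible approach rather than the cookbook's exact recipe: the `JsonFormatter` class, the `build_logger` helper, and the convention of passing extra data under a `fields` key are all assumptions made for this example.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter that renders each record as one JSON object."""

    def format(self, record):
        payload = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Extra data arrives via `extra={"fields": {...}}`; the "fields"
        # key is this sketch's own convention, not part of the library.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


def build_logger(stream):
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("structured_example")
    logger.setLevel(logging.INFO)
    logger.propagate = False   # keep output out of the root logger
    logger.handlers.clear()    # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


stream = io.StringIO()
logger = build_logger(stream)
logger.info("disk usage high",
            extra={"fields": {"host": "web-2", "disk_used_pct": 93.2}})
print(stream.getvalue().strip())
```

The timestamp, level, and logger name come from the library's own record, so the application code only has to supply the message and any event-specific fields.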
Include a Human-Readable Message
Even though these logs are intended for machines, it's still worthwhile to include a single human-readable field with tokenized strings written in plain, simple, and straightforward language. These messages can be surfaced in your log management system of choice, ensuring that you still get a quick overview of a line without needing to dig into the object. For these fields, follow the same best practices as for text-based logging, but there's no need to pack much detail into the message, since that detail is available in the other fields of the structured log.
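One way to keep the message and the fields in sync is to fill a tokenized template from the same values that are logged as separate fields. The sketch below assumes that approach; the `make_record` helper, the template wording, and the field names are hypothetical.

```python
import json


def make_record(template, **fields):
    """Build a structured record whose `message` is the template filled
    in from the same values that are stored as separate fields."""
    record = dict(fields)
    # Placeholders in the template are resolved from the fields; fields
    # that the template does not mention are simply left out of the text.
    record["message"] = template.format(**fields)
    return record


record = make_record(
    "Payment {payment_id} failed for user {user_id}",
    payment_id="p-123",
    user_id="u-42",
    error_code="card_declined",
)
print(json.dumps(record))
```

The message stays short and readable, while details such as `error_code` remain available as their own fields for filtering and alerting.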
No matter whether you choose text-based logging or structured logging, you must keep any compliance needs in mind. Most of the time, the compliance requirements are really just enforced best practices for the most secure systems you can build, so we'll explore those needs when it comes to logging in the next, final article in this series.