How to Evolve Your Existing Logging Strategy for Kubernetes
8.20.20
It's one thing to build a Kubernetes log management strategy that only needs to support Kubernetes. But most organizations don't have that luxury. They have log management practices already in place for other types of platforms or infrastructure, and they need to extend them to support Kubernetes.
How can you do that in an efficient way? Keep reading for tips on integrating Kubernetes logging data into your existing log management workflow without rebuilding from the ground up.
How Kubernetes Logging Is Different
The first step in devising a strategy for supporting Kubernetes through your existing logging workflow is to understand how Kubernetes log data is similar to and different from conventional logs.
Like any other type of modern software platform, Kubernetes creates logs. Specifically, it creates two main types of logs:
- Logs of stdout and stderr messages for each running container. These can help you monitor applications hosted on Kubernetes.
- Logs for the main Kubernetes services (like the API server and Kubelet), which are useful for monitoring Kubernetes itself and the nodes that host it.
These logs are not fundamentally different in type or scope from the logs you already manage for other parts of your infrastructure, such as logs for non-containerized applications and conventional operating systems.
That said, the way Kubernetes approaches logs is different in several key respects:
- Lack of centralization: Kubernetes doesn’t attempt to centralize logs for you or even to keep all log data in the same log file. Log data is spread across multiple files on the control plane and worker nodes.
- Log access: Kubernetes expects you to access application logs using the kubectl utility. Unlike on a conventional operating system, you can’t simply point conventional text-manipulation tools (like grep and awk) at a well-known log file. You have to pipe kubectl output into them or read the log files from outside the Kubernetes interface.
- Log rotation: By default, most Kubernetes distributions rotate a container’s log file once it exceeds 10 megabytes and discard the oldest data, which is not much if you run many applications that generate a lot of log data. You can’t count on Kubernetes itself to keep your log data around as long as you need it; you must devise your own strategy for exporting the logs and rotating them in a way that aligns with your needs.
- Log structure: Unlike, say, a Linux server, Kubernetes makes no attempt to keep your log data formatted or structured in a neat and consistent way, at least when it comes to application logs. It just records whatever your containers write to stdout or stderr. Whether that data is standardized and easy to work with depends on the way your containers are configured, not the way Kubernetes collects data from them.
Each of these differences introduces challenges for integrating Kubernetes into existing logging workflows.
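To get a feel for how little history the 10-megabyte limit described above can hold, here is a rough back-of-the-envelope sketch. The log rate (200-byte lines at 50 lines per second) is a hypothetical figure chosen for illustration, and the calculation assumes a single log file with a steady write rate.

```python
def retention_seconds(max_bytes: int, bytes_per_second: float) -> float:
    """Seconds of log history that fit before the size cap is reached."""
    if bytes_per_second <= 0:
        raise ValueError("bytes_per_second must be positive")
    return max_bytes / bytes_per_second


TEN_MB = 10 * 1024 * 1024   # the 10-megabyte cap discussed above
rate = 200 * 50             # hypothetical: 200-byte lines, 50 lines per second

minutes = retention_seconds(TEN_MB, rate) / 60
print(f"{minutes:.1f} minutes of history")  # prints "17.5 minutes of history"
```

Under these assumptions, a moderately busy container would fill the file in well under half an hour, which is why exporting logs off the node matters.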
Managing Kubernetes Logs Efficiently
Fortunately, those challenges can be solved. The following are some tips for managing Kubernetes log data effectively using your existing log management strategy without the need to run a separate log manager just for Kubernetes.
Don’t Use Kubectl to Manage Logs
Although kubectl lets you view log data, you shouldn’t treat it as your main log management tool. Think of it instead as a quick way to grab recent monitoring data for an individual application, just as you would with a command like head or tail on Linux.
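As an illustration of that "quick look" use, the sketch below wraps `kubectl logs` with its `--tail` and `--since` flags to fetch only the most recent lines for one pod. The helper names and the pod name `checkout-7d9f` are hypothetical, and actually running the command assumes kubectl is installed and configured for your cluster.

```python
import subprocess
from typing import List, Optional


def build_logs_command(pod: str, namespace: str = "default",
                       tail: int = 20, since: Optional[str] = None) -> List[str]:
    """Build a `kubectl logs` argument list that fetches only recent lines."""
    cmd = ["kubectl", "logs", pod, "--namespace", namespace, f"--tail={tail}"]
    if since:
        cmd.append(f"--since={since}")  # e.g. "1h" limits output to the last hour
    return cmd


def recent_logs(pod: str, **kwargs) -> str:
    """Run the command and return its output (needs kubectl and cluster access)."""
    result = subprocess.run(build_logs_command(pod, **kwargs),
                            capture_output=True, text=True, check=True)
    return result.stdout


# Hypothetical pod name, shown only to illustrate the resulting command line.
print(" ".join(build_logs_command("checkout-7d9f", tail=50, since="1h")))
```

This is handy for spot checks on a single pod, but it does nothing to aggregate, retain, or search logs across applications.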
When you need a more insightful and comprehensive way to analyze log data from all of the applications that you have running in Kubernetes, you’ll need to connect log data to a third-party analytics and visualization tool.
Don’t Settle for Stdout and Stderr
Another reason not to depend on kubectl as the foundation of Kubernetes log analytics is that the log data you can access through kubectl is limited, as noted above, to stdout and stderr. Sometimes, you may run an application that writes nothing useful to stdout or stderr. Maybe it was designed to expose monitoring data in another way (such as a log file inside the container), in which case Kubernetes won’t capture it. Or maybe the application is not configured to be verbose enough to generate meaningful messages to stdout or stderr.
One common approach that lets you avoid these limitations is to run a logging agent. You can deploy the agent at the node level (typically with a DaemonSet, so that one copy runs on every node) or as a sidecar container (or containers) alongside the application. The agent collects log data from the application in whichever form the application exposes it. This strategy not only lets you collect more log data but also makes it easy to run the same logging agent inside your Kubernetes cluster that you use for the rest of your environment.
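To give a sense of what a node-level agent deals with, here is a minimal sketch of parsing one line of a container log file. It assumes the node uses a CRI-compatible runtime (such as containerd) that writes each line as `<timestamp> <stream> <tag> <message>`, where the tag marks a full (`F`) or partial (`P`) line. The `CriLine` type and `parse_cri_line` function are hypothetical names for illustration; a real agent also handles file discovery, rotation, and shipping.

```python
from typing import NamedTuple, Optional


class CriLine(NamedTuple):
    timestamp: str
    stream: str      # "stdout" or "stderr"
    partial: bool    # True when the runtime split a long line
    message: str


def parse_cri_line(line: str) -> Optional[CriLine]:
    """Parse one CRI-format log line; return None if it does not match."""
    parts = line.rstrip("\n").split(" ", 3)
    if len(parts) < 3 or parts[1] not in ("stdout", "stderr"):
        return None
    timestamp, stream, tag = parts[:3]
    message = parts[3] if len(parts) == 4 else ""
    return CriLine(timestamp, stream, tag.startswith("P"), message)


sample = "2020-08-20T12:00:00.000000000Z stderr F connection refused\n"
print(parse_cri_line(sample))
```

Splitting at most three times keeps spaces inside the message intact, so the application's own text passes through unchanged.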
Standardize Logs
Because Kubernetes itself doesn’t attempt to structure log data in a consistent way, you can end up with a mess if you attempt to analyze logs from a Kubernetes environment alongside logs from other systems without tools that can interpret the Kubernetes logs effectively.
That’s why standardizing your Kubernetes logs is so critical. You can do this by exporting log data to a log manager that supports all common logging formats and then querying logs from the log manager. This is much more efficient than attempting to use kubectl to interact with log data that may not be structured consistently.
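Log managers normally handle this parsing for you, but the idea can be sketched in a few lines. The example below maps both a JSON-formatted line and a plain-text line onto one common record shape. The field names (`level`, `msg`, `message`) and the output schema are assumptions chosen for illustration, not a standard.

```python
import json


def normalize(raw: str, source: str) -> dict:
    """Map a raw log line onto a common record, whatever its original format."""
    record = {"source": source, "level": "unknown", "message": raw.strip()}
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return record  # plain text: keep the whole line as the message
    if isinstance(parsed, dict):
        # Assumed field names; real applications vary.
        record["level"] = str(parsed.get("level", "unknown")).lower()
        record["message"] = str(parsed.get("msg", parsed.get("message", raw.strip())))
    return record


print(normalize('{"level": "WARN", "msg": "disk almost full"}', "k8s"))
print(normalize("plain text line from a legacy service\n", "vm"))
```

Once every line has the same shape, logs from Kubernetes and from other systems can be queried together by source, level, or message.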
Simplify Kubernetes Log Agent Deployment
Reading the above, you might be thinking that an enormous amount of manual effort is required to set up a log management solution for Kubernetes that also works for other types of systems. And it is, if you attempt to build your own sidecar containers to host a logging agent or to set up a node-level agent by hand.
It’s much easier if you use a solution like Mezmo, formerly known as LogDNA, that offers prebuilt Kubernetes agents that you can deploy with a few simple commands instead of sidecars. This way, you can easily use the same log management tooling and workflow for Kubernetes that you use for other parts of your stack.
Conclusion
Kubernetes is complicated enough without having to develop a bespoke log management strategy for it. To help reduce your Kubernetes management overhead, adopt log management tools and strategies that are based on those you already have in place, instead of ones that require you to develop a separate logging operation just for Kubernetes.