In a fast-paced environment, log management empowers engineers to quickly and effectively detect bugs and security vulnerabilities. While it’s sometimes overlooked or considered the sole responsibility of the operations team, a centralized log management solution that is used by developers, operations, and security engineers will help you build and maintain more resilient and secure applications. It can also help you meet compliance requirements and identify and resolve security issues before they become customer- or business-impacting events.
In today’s world, most businesses have applications deployed in the cloud, across multiple cloud environments, or distributed across cloud and on-premises environments. These applications, and the systems they run on, produce a large volume of logs, which can quickly become a burden if they aren’t centralized in a single log management solution.
Log management is the process of collecting logs, parsing them, storing them, and making them actionable via search and data visualizations. When properly managed, logs can help teams identify issues, detect threats, and remediate them faster.
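As a rough illustration of the collect, parse, store, and search stages described above, the following Python sketch ingests raw lines into a small in-memory store and queries them. The line format, the `LogStore` class, and all field names are assumptions made for this example; a real log management solution would persist data and handle many formats.

```python
import re

# Assumed line format for this sketch: "<timestamp> <LEVEL> <service>: <message>"
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+) (?P<level>[A-Z]+) (?P<service>[\w-]+): (?P<message>.*)$"
)


def parse_line(line):
    """Parse one raw log line into a dict, or return None if it does not match."""
    match = LINE_PATTERN.match(line.strip())
    return match.groupdict() if match else None


class LogStore:
    """Hypothetical in-memory store; stands in for a real storage backend."""

    def __init__(self):
        self.records = []

    def ingest(self, lines):
        """Collect raw lines, parse them, and store the ones that parse."""
        for line in lines:
            record = parse_line(line)
            if record is not None:
                self.records.append(record)

    def search(self, level=None, text=None):
        """Return stored records matching an optional level and substring."""
        results = []
        for record in self.records:
            if level is not None and record["level"] != level:
                continue
            if text is not None and text not in record["message"]:
                continue
            results.append(record)
        return results
```

After `store.ingest(lines)`, a call such as `store.search(level="ERROR")` returns only the parsed error records. Lines that do not match the assumed format are silently skipped here; a real pipeline would probably want to count or report them.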
The first step in implementing a log management solution is identifying what you’ll use your logs for. Here are some initial questions to ask yourself:
After determining what to log, the next step is determining what type of solution to use. Generally, you have two options: store logs in-house, or use the cloud and third-party SaaS solutions. On-premises resources have their own advantages, such as giving administrators complete control over the system. They also keep logs internally owned, so a cloud provider’s downtime or data breach would not affect internal logs.
There are also downsides to on-premises solutions: storing logs in-house is expensive and requires a large amount of storage capacity. Assuming a fairly small self-hosted stack, a company would need at least one dedicated full-time equivalent (FTE) to stand up a log management solution over a week, with two or three dedicated FTEs more likely, depending on the team’s familiarity with the tooling and the size of the stack. Once the stack is running, the company would ideally have one to two FTEs for management and operations, depending on the size of the system. The average log volume for such a stack runs in the tens of gigabytes per day at a minimum. Overall, that means an in-house log management solution could cost on the order of a quarter of a million US dollars or more. In-house storage can also be much more difficult to manage and secure.
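The staffing and storage arithmetic behind an estimate like this can be sketched in a few lines of Python. The function names, the per-FTE cost, and the retention period are placeholders, not figures from this article; substitute your own numbers.

```python
WEEKS_PER_YEAR = 52


def first_year_staffing_cost(setup_ftes, setup_weeks, ops_ftes, annual_cost_per_fte):
    """Estimate first-year staffing cost: one-off setup effort plus ongoing operations.

    annual_cost_per_fte is a hypothetical fully loaded cost supplied by the caller.
    """
    setup_cost = setup_ftes * annual_cost_per_fte * setup_weeks / WEEKS_PER_YEAR
    operations_cost = ops_ftes * annual_cost_per_fte
    return setup_cost + operations_cost


def retained_volume_gb(daily_gb, retention_days):
    """Estimate how much log data is held at steady state for a retention window."""
    return daily_gb * retention_days
```

For example, with a purely hypothetical cost of 156,000 USD per FTE per year, two FTEs for one week of setup, and 1.5 FTEs for operations, `first_year_staffing_cost(2, 1, 1.5, 156_000)` comes to 240,000 USD, before any hardware or storage costs. At an assumed 30 GB per day and 90 days of retention, `retained_volume_gb(30, 90)` gives 2,700 GB.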
Here are some reasons to use cloud-based log management solutions:
For any development team, catching bugs before they go to production is critical for software reliability. Log management during the development and testing phases benefits teams with projects of any size, because logged events can reveal errors that were missed during unit testing.
It’s not uncommon for developers to implement a basic logging solution using standard automation tools. Homegrown tools can be useful, but building and maintaining them takes long hours from developers who must also create and maintain business applications. In enterprise environments, several out-of-the-box tools are available to developers, but using multiple tools for logging can fragment log entries across several locations.
To monitor and analyze applications effectively, developers need a centralized solution that brings events from these locations together so that they can evaluate the overall picture.
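One way to picture what centralization provides is a merge of events from separate sources into a single time-ordered stream. The sketch below assumes each source already yields events sorted by timestamp, and that events are dictionaries with hypothetical `timestamp`, `source`, and `message` keys.

```python
import heapq
from datetime import datetime


def merge_event_streams(*streams):
    """Merge time-sorted event streams into one time-ordered list.

    Each stream must already be sorted by its events' "timestamp" value.
    """
    return list(heapq.merge(*streams, key=lambda event: event["timestamp"]))


if __name__ == "__main__":
    # Illustrative data only; not taken from any real system.
    app_events = [
        {"timestamp": datetime(2024, 1, 1, 12, 0, 0), "source": "app", "message": "request received"},
        {"timestamp": datetime(2024, 1, 1, 12, 0, 5), "source": "app", "message": "request failed"},
    ]
    db_events = [
        {"timestamp": datetime(2024, 1, 1, 12, 0, 2), "source": "db", "message": "connection refused"},
    ]
    for event in merge_event_streams(app_events, db_events):
        print(event["timestamp"].isoformat(), event["source"], event["message"])
```

Read in one timeline, the database error sits between the application's request and its failure, which is the kind of relationship that is hard to see when each log lives in a different place.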
Software bugs aren’t the only reason that effective logging is necessary. Configurations are another part of promoting a release, and the wrong ones can result in numerous bugs and vulnerabilities. In some cases, the application will crash, requiring an emergency response in which developers and operations must quickly determine the root cause before it becomes a critical revenue-impacting event. Monitoring with log management tools will often find these misconfigurations before they affect user experiences.
The increase in remote work has also increased demand for effective logging. For example, the at-home workforce expansion due to the Coronavirus pandemic increased log entries by 10,000% in some corporate environments (source: devops.com). Monitoring access requests, authorization failures, successful authorizations, and user behavior patterns is necessary for the security of infrastructure and applications.
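As a minimal sketch of this kind of monitoring, the function below counts authorization failures per user and flags anyone at or above a threshold. The event shape (`user` and `event` keys), the `auth_failure` label, and the default threshold are all assumptions for illustration.

```python
from collections import Counter


def flag_repeated_auth_failures(events, threshold=5):
    """Return {user: failure_count} for users with at least `threshold` failures."""
    failures = Counter(
        event["user"] for event in events if event["event"] == "auth_failure"
    )
    return {user: count for user, count in failures.items() if count >= threshold}
```

A fixed threshold is the simplest possible rule. It is shown here only to make the idea concrete, and a real deployment would tune it or replace it with something that accounts for time windows and normal user behavior.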
With more employees working from home and using their own devices to connect to corporate infrastructure in the cloud, shadow IT is a primary concern. Log management is essential in these environments so that suspicious activity can be detected. A Cisco study showed that, on average, enterprises use 1,200 cloud services and over 98% of them are shadow IT.
With the sudden Coronavirus lockdowns, IT teams were responsible for rapidly providing developers and other employees with remote network connectivity. The deployment pipeline is no longer a process that happens on-premises. It’s now a process in which developers and operations staff must rapidly deploy from remote locations. Properly implemented log management helps to ensure that anyone with access to these scripts, tools, and infrastructure is authorized, and it allows anomalies to be detected more quickly.
Log management is an essential tool for any technology organization. Centralizing your logs can help you improve your mean time to detection (MTTD) and mean time to resolution (MTTR) for application bugs, security breaches, and more. When choosing a solution, consider what you’ll use your logs for, how much you’ll log, whether you have compliance requirements to meet, and whether an on-premises or cloud-based tool is best for you.