The Future of Log Management: Cloud Logging, Containers, and Machine Learning
"I skate to where the puck is going to be, not where it has been."
- Wayne Gretzky
While computer logs have been around for decades, their implementation has changed over time. From a dedicated server with a single log file, to microservices distributed across multiple virtualized machines with dozens of log files, the way we generate and consume logs changes as we adopt new infrastructure and programming paradigms. Even our market space of cloud-based logging was not heavily adopted until recently, yet the way we use logs will continue to evolve. The million dollar question is, how? What is the future of log management?
The alluring cloud
Cloud-based logging is relatively new and is a result of the general industry trend of moving away from on-premise servers to cloud-based services. In fact, Gartner estimates that between 2016 and 2020, IT will spend more than $1 trillion responding to the shift to the cloud. This means that more and more businesses are moving their operations to the cloud, including their logs. This has some interesting implications.Part of the beauty of moving to the cloud is the ability to easily deploy and scale your infrastructure without having to undertake a large internal infrastructure project. This is a significant reduction in both time and cost, and is central to the value proposition of moving to the cloud. Taking this reasoning further, since there are only a relatively small number of large-scale hosting providers, businesses can be built entirely on making cloud infrastructure management simpler, easier, and more flexible. Enter platform as a service (PaaS).In addition to hosting providers like Amazon Web Services, DigitalOcean, and Microsoft Azure, all sorts of PaaS businesses have popped up, such as Heroku, Elastic Beanstalk, Docker Cloud, Flynn, Cloud Foundry and a whole host of others. These PaaS offerings have become more and more ubiquitous, and are becoming increasingly lucrative. In 2010, Heroku was bought by Salesforce for $212 million, and last year Microsoft attempted to buy Docker for a rumored $4 billion. This demonstrates a significant shift from raw hosting providers, to simplified, managed services that automate the manual grunt work of directly managing cloud infrastructure. All for similar reasons to migrating to the cloud in the first place.So what does this have to do with logging? It means that providing ingestion integrations with PaaS as well as hosting providers becomes increasingly important. Do you integrate with Heroku? Can I send you my Docker logs? I use Flynn, how do I get my logs to you? If you're a cloud logging provider, future of log management the answer to all of these questions should be yes. And don't forget to create integrations as new PaaS offerings appear.
The rise of containers
With the adoption of cloud-based infrastructure, things like distributed microservices architecture have become more popular. One of the primary benefits of using a microservices architecture is its highly modular nature. This means parts can be swapped out quickly and efficiently, without as much risk of disrupting your customers. However, this also means higher risk of development environment fragmentation. This is what containers were built to solve.Containers are essentially wrappers that isolate individual apps within a consistent environment. With containers, it shouldn't matter what hosting provider or development infrastructure you use, your software applications should run exactly the same as they do on your development machines. Matching development and production environments means more reliable testing, and less time spent chasing down environmental issues. It also has the added benefit of being able to reliably run multiple apps inside containers on the same machine, as well as to quickly respond to fluctuating load. According to Gartner, containers are even considered more secure than running apps on a bare OS.
While containers in themselves solve the problem of development environment fragmentation, managing lots of individual containers can be a pain. Hence the rise of container orchestration tools, namely Kubernetes. With Kubernetes, you can deploy and manage hundreds of containers, and easily automate nearly everything, including networking and DNS. This is particularly appealing, since managing these at scale using a traditional hosting provider takes significant effort.Between creating a consistent environment and automating networking, adoption of Kubernetes has been steadily increasing, and is poised to become a popular tool of choice. This also means that acquiring and organizing log data from Kubernetes is paramount to understanding the state of your infrastructure. Mezmo, formerly known as LogDNA, provides integrations for this and we strongly believe that containers and container orchestration will become highly prevalent within the next few years.
Machine learning and big data
So far we've primarily focused on infrastructure evolution, but it is also important to consider the trajectory and impact of software trends on logging. We've all heard "big data" and "machine learning" toted as popular buzzword answers to solving difficult software problems, but what do they actually mean?Before we dive deeper into logging applications (no pun intended), let's consider the general benefits of machine learning and big data. For example, Netflix uses machine learning to provide movie recommendations based on the context-dependent aggregate preferences of its users. Google Now uses machine learning to provide you with on-demand pertinent information based on multiple contexts, such as location, time of day, and your browsing habits. In both cases, these services look for patterns in large datasets to predict what information will be useful to you.
Predicting useful patterns is the key value proposition of big data and machine learning, so how does this apply to logs? Since logs enable quicker debugging of infrastructure and code issues, machine learning could be used to notify us of useful patterns that we otherwise may have missed. For example, if I have a web server log and the number of requests spikes or there is an increase in the number of 400 errors, machine learning could be used to notify you of these events before they have a serious impact on your infrastructure. Taken further, machine learning could be used to find relationships between logs, such as tracing a request through multiple servers without explicitly knowing the path of the request beforehand. Given additional contextual inputs, like git commits, deployments, continuous integration, and bug tracking, machine learning could even be used to find relationships between code changes and log statements. Imagine being automatically notified that a particular set of commits is likely responsible for the errors you're seeing in production. Pretty mind-blowing.So why hasn't this been done already? Even though finding general relationships between entities isn't hugely difficult, discerning useful and meaningful insights from unstructured data is actually pretty challenging. How do you find useful relationships in large unstructured datasets consisting primarily of background noise? The answer, however unsexy, is classification and training. As evidenced by both the Netflix and Google Now examples, human recommendation is key to making machine learning insights actually worthwhile, hence the 'learning' part of the name. While it may seem like this initial recommendation effort detracts from the promised convenience of machine learning, it is actually what makes machine learning work so well. Instead of hunting down the data and finding patterns yourself, you are prompted to verify that the insights generated are helpful and correct. As more and more choices are made by us humans, the more useful these machine learning insights become, and the fewer verification prompts we receive. This is the point at which machine learning has fulfilled its highest potential.
From PaaS and containers, to machine learning and big data, we keep all these things in mind as we improve Mezmo to help further the future of log management. Like machine learning, we also rely on recommendation and feedback from our customers. Without it, our product would not be where it is today.