The Future of Log Management: Cloud Logging, Containers, and Machine Learning

4 MIN READ

"I skate to where the puck is going to be, not where it has been."

- Wayne Gretzky

While computer logs have been around for decades, their implementation has changed over time. From a dedicated server with a single log file to microservices distributed across multiple virtualized machines, each with dozens of log files, the way we generate and consume logs changes as we adopt new infrastructure and programming paradigms. Even our own market, cloud-based logging, saw little adoption until recently, and the way we use logs will continue to evolve. The million-dollar question is: how? What is the future of log management?

The alluring cloud

Cloud-based logging is relatively new, a result of the broader industry shift away from on-premises servers toward cloud-based services. In fact, Gartner estimates that between 2016 and 2020, more than $1 trillion in IT spending will be directly or indirectly affected by the shift to the cloud. More and more businesses are moving their operations to the cloud, including their logs, and this has some interesting implications.

Part of the beauty of moving to the cloud is the ability to easily deploy and scale your infrastructure without undertaking a large internal infrastructure project. This is a significant reduction in both time and cost, and it is central to the value proposition of moving to the cloud. Taking this reasoning further: since there are only a relatively small number of large-scale hosting providers, entire businesses can be built on making cloud infrastructure management simpler, easier, and more flexible. Enter platform as a service (PaaS).

In addition to hosting providers like Amazon Web Services, DigitalOcean, and Microsoft Azure, all sorts of PaaS businesses have popped up, such as Heroku, Elastic Beanstalk, Docker Cloud, Flynn, Cloud Foundry, and a whole host of others. These PaaS offerings have become increasingly ubiquitous and increasingly lucrative. In 2010, Heroku was bought by Salesforce for $212 million, and last year Microsoft reportedly attempted to buy Docker for a rumored $4 billion. This demonstrates a significant shift from raw hosting providers to simplified, managed services that automate the grunt work of directly managing cloud infrastructure, for much the same reasons businesses migrated to the cloud in the first place.

So what does this have to do with logging? It means that providing ingestion integrations with PaaS offerings as well as hosting providers becomes increasingly important. Do you integrate with Heroku? Can I send you my Docker logs? I use Flynn, so how do I get my logs to you? If you're a cloud logging provider, the answer to all of these questions should be yes. And don't forget to create integrations as new PaaS offerings appear.
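To make the integration point concrete, here's a minimal sketch of what log shipping boils down to under the hood, assuming a hypothetical HTTPS ingestion endpoint and API key (the URL, field names, and auth scheme below are placeholders; real providers document their own):

    import json
    import time
    import urllib.request

    # Hypothetical ingestion endpoint and API key -- substitute your
    # provider's real values.
    INGEST_URL = "https://logs.example.com/ingest"
    API_KEY = "your-api-key"

    def ship_log(line, app="web"):
        """Send a single log line to the ingestion endpoint as JSON."""
        payload = json.dumps({
            "timestamp": int(time.time() * 1000),  # ms since epoch
            "app": app,
            "line": line,
        }).encode("utf-8")
        req = urllib.request.Request(
            INGEST_URL,
            data=payload,
            headers={
                "Content-Type": "application/json",
                "Authorization": "Bearer " + API_KEY,
            },
        )
        urllib.request.urlopen(req, timeout=5)

    ship_log("GET /checkout 200 35ms")

Every integration, whether it's a Heroku drain, a Docker logging driver, or a Kubernetes agent, is ultimately some variation on this: collect log lines where they're emitted and forward them, with metadata, to an ingestion API.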

The rise of containers

With the adoption of cloud-based infrastructure, distributed microservices architectures have become more popular. One of the primary benefits of a microservices architecture is its highly modular nature: parts can be swapped out quickly and efficiently, with less risk of disrupting your customers. However, it also brings a higher risk of development environment fragmentation. This is the problem containers were built to solve.

Containers are essentially wrappers that isolate individual apps within a consistent environment. With containers, it shouldn't matter what hosting provider or development infrastructure you use; your applications should run exactly as they do on your development machines. Matching development and production environments means more reliable testing and less time spent chasing down environmental issues. Containers also make it possible to reliably run multiple apps on the same machine and to respond quickly to fluctuating load. According to Gartner, containerized apps are even considered more secure than apps running on a bare OS.

[Image: Kubernetes containerized applications]

While containers themselves solve the problem of development environment fragmentation, managing lots of individual containers can be a pain. Hence the rise of container orchestration tools, most notably Kubernetes. With Kubernetes, you can deploy and manage hundreds of containers and automate nearly everything, including networking and DNS. This is particularly appealing, since managing these concerns at scale on a traditional hosting provider takes significant effort.

Between creating a consistent environment and automating networking, Kubernetes adoption has been steadily increasing, and it is poised to become the tool of choice. That also means acquiring and organizing log data from Kubernetes is paramount to understanding the state of your infrastructure. Mezmo, formerly known as LogDNA, provides integrations for this, and we strongly believe that containers and container orchestration will become highly prevalent within the next few years.
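As a minimal sketch of what acquiring Kubernetes log data can look like, the snippet below streams a pod's logs using the official kubernetes Python client; the pod and namespace names are placeholders, and a production agent would typically run as a DaemonSet rather than a one-off script:

    from kubernetes import client, config

    # Load credentials from ~/.kube/config; use load_incluster_config()
    # instead when running inside the cluster.
    config.load_kube_config()
    v1 = client.CoreV1Api()

    # Placeholder pod and namespace -- substitute your own workload.
    stream = v1.read_namespaced_pod_log(
        name="web-7d4f9c",
        namespace="default",
        follow=True,
        _preload_content=False,  # return a raw stream, not one big string
    )

    # Iterate the stream line by line, much like `kubectl logs -f`.
    for raw_line in stream:
        print(raw_line.decode("utf-8").rstrip())  # forward to your provider here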

Machine learning and big data

So far we've focused primarily on infrastructure evolution, but it is also important to consider the trajectory and impact of software trends on logging. We've all heard "big data" and "machine learning" touted as buzzword answers to difficult software problems, but what do they actually mean?

Before we dive deeper into logging applications (no pun intended), let's consider the general benefits of machine learning and big data. Netflix uses machine learning to provide movie recommendations based on the context-dependent aggregate preferences of its users. Google Now uses machine learning to surface pertinent information on demand based on multiple contexts, such as location, time of day, and your browsing habits. In both cases, these services look for patterns in large datasets to predict what information will be useful to you.

[Image: Big data log management and analysis]

Predicting useful patterns is the key value proposition of big data and machine learning, so how does this apply to logs? Since logs enable quicker debugging of infrastructure and code issues, machine learning could be used to surface useful patterns we might otherwise miss. For example, if the number of requests to a web server spikes, or the number of 400 errors increases, machine learning could notify you of these events before they have a serious impact on your infrastructure. Taken further, machine learning could find relationships between logs, such as tracing a request through multiple servers without knowing the request's path beforehand. Given additional contextual inputs, like git commits, deployments, continuous integration, and bug tracking, machine learning could even find relationships between code changes and log statements. Imagine being automatically notified that a particular set of commits is likely responsible for the errors you're seeing in production. Pretty mind-blowing.

So why hasn't this been done already? Even though finding general relationships between entities isn't hugely difficult, discerning useful and meaningful insights from unstructured data is. How do you find useful relationships in large unstructured datasets consisting primarily of background noise? The answer, however unsexy, is classification and training. As both the Netflix and Google Now examples show, human feedback is key to making machine learning insights worthwhile, hence the 'learning' part of the name. While this initial effort may seem to detract from the promised convenience of machine learning, it is actually what makes machine learning work so well. Instead of hunting down the data and finding patterns yourself, you are prompted to verify that the generated insights are helpful and correct. The more of these choices we humans make, the more useful the insights become, and the fewer verification prompts we receive. That is the point at which machine learning fulfills its highest potential.
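To ground that first example, here is an illustrative sketch (not how Mezmo actually implements detection) that flags anomalous per-minute request counts against a rolling baseline; the window size and three-sigma threshold are arbitrary assumptions:

    from collections import deque
    from statistics import mean, stdev

    WINDOW = 30      # minutes of history to use as a baseline (assumed)
    THRESHOLD = 3.0  # flag counts more than 3 standard deviations above the mean

    def detect_spikes(counts_per_minute):
        """Yield (minute, count) pairs that deviate sharply from the
        recent baseline -- e.g., total requests or 400-level errors."""
        history = deque(maxlen=WINDOW)
        for minute, count in enumerate(counts_per_minute):
            # Wait until we have enough history for a meaningful baseline.
            if len(history) >= 10 and stdev(history) > 0:
                if count > mean(history) + THRESHOLD * stdev(history):
                    yield minute, count
            history.append(count)

    # A steady baseline of ~100 requests/minute, then a sudden spike.
    series = [100, 98, 103, 99, 101, 97, 102, 100, 99, 101, 104, 500]
    for minute, count in detect_spikes(series):
        print("minute %d: %d requests looks anomalous" % (minute, count))

A rolling baseline like this catches the obvious spikes; the machine learning value comes from layering classification and human feedback on top, so the system learns which deviations are actually worth an alert.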

Moving forward

From PaaS and containers to machine learning and big data, we keep all of these trends in mind as we improve Mezmo and help shape the future of log management. Like a machine learning system, we also rely on recommendations and feedback, in our case from our customers. Without that feedback, our product would not be where it is today.

What do you think is the future of logging? Let us know at feedback@mezmo.com!
