Why Culture and Architecture Matter with Data, Part I

4 MIN READ

We are using data wrong. 

In today’s data-driven world, we have learned to store data. Our data storage capabilities have grown exponentially over the decades, and everyone can now store petabytes of data. Let’s all collectively pat ourselves on the back. We have won the war on storing data! 

Congratulations!

Does it really feel like we have won, though? Or does it feel like we are losing, because the goal was never to store data? The goal was to use our data, ideally at ever-increasing speed, for use cases such as product improvement, higher profits, lower costs, better customer satisfaction and user experience, reduced downtime, stronger security, and much more. Most of us now realize that lost innovation is the real cost of not leveraging our data.

Studies and surveys paint a grim picture of current enterprise efforts related to data. According to Harvard Business Review, "Cross-industry studies show that on average, less than half of an organization's structured data is actively used in making decisions—and less than 1% of its unstructured data is analyzed or used at all." 

Put simply, we want to use the data for our benefit but have not yet figured out how. Once data is stored, it might as well fall into a black hole, never to be seen again. To this point, the promises of Big Data have gone unfulfilled.

Does this deter companies from continuing to invest in data initiatives, though? Not according to NewVantage Partners, whose survey found that 91% of businesses reported increasing their spending on data initiatives. This continued commitment has been a growing trend for over a decade.

Doing the Same Thing Over and Over Again, Expecting a Different Result

In the face of so much failure, why are companies dedicating significant time, resources, and people to chasing what feels like data insanity? A straightforward answer is that the precedent was set by a small group of companies using data correctly. They were the first movers, and they learned the lessons first. From afar, these companies and their stories can feel like magic to the companies that have yet to achieve the same success. The stories of how successful companies leveraged their data are the stuff of legend; they inspire while also creating a fear of being left behind.

For example, one company in the entertainment space put its largest competitor out of business while simultaneously building a new business model. It harnessed the power of its data by becoming data-driven: through a strong data culture, its employees created a data architecture and tooling to find critical insights and make important business decisions in a fraction of the time of their competitors.

Another company, in the retail space, adopted a data-driven approach and a modern cloud architecture to burst needed computing resources into its cloud provider during the busy holiday season, saving tens of millions of dollars annually.

Data-driven companies innovate faster, and losing the innovation race to your competition is the death knell for a business. The billions of dollars enterprises spend annually on data platforms, architectures, and people are therefore not insanity but a requirement for evolving into modern businesses. It's the price that must be paid for innovation.

Enterprise Data Challenges

Companies are putting up the capital and have the desire to become data-driven. So what challenges prevent them from using their mountain of data?

The first challenge is the traditional data architecture, which focuses primarily on moving data horizontally from source to destination through a data pipeline. These architectures make transporting and storing data more efficient but do little to extract business value from it. A longstanding problem with this architecture is that ownership of the data at the destination is siloed to the teams that create and primarily use it, when the goal should be to make the data easy to access for any business use case and any internal team. While this architecture was suitable for transporting growing data volumes, and was a necessary step in the evolution of data architecture, it did little to reduce the complexity internal teams face when working with data or to satisfy their need to pull data from across data domains.
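
To make the pattern concrete, here is a minimal Python sketch of the horizontal extract-transform-load flow described above. This is my illustration, not any specific product; the source, warehouse, and field names are hypothetical. Notice that the pipeline's responsibility ends once records land in the destination; nothing about it makes the data discoverable or usable by other teams.

```python
# A toy "horizontal" pipeline: data moves source -> transform -> destination.
# All names here are hypothetical, for illustration only.

def extract(source):
    """Read raw records from an upstream system, e.g. an application event log."""
    for record in source:
        yield record

def transform(records):
    """Lightly normalize records to fit the destination's schema."""
    for record in records:
        yield {"ts": record.get("timestamp"), "payload": record.get("body")}

def load(records, destination):
    """Write records to the destination store; the pipeline's job stops here."""
    for record in records:
        destination.append(record)

warehouse = []
load(transform(extract([{"timestamp": 1, "body": "login"}])), warehouse)
print(warehouse)  # [{'ts': 1, 'payload': 'login'}]
```

The transport is efficient, but every question about access, discovery, and cross-team use is left to the team that owns the warehouse.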

Next up, and closely related to the first, is how data in the enterprise is treated more as an object than as a business asset. In the last fifteen years, we've created modern, highly distributed cloud architectures that scale horizontally with load, and we have forever changed engineering team design, culture, and operations to support them. And yet, for the last sixty years, we have continued to store data in legacy monoliths. Telemetry, customer, marketing, and sales data go into separate silos. This separation prevents us from seeing how a decision made in one part of the company affects another. Siloed data, sitting at the foundation of the business's data domains, prevents or hinders the company from correlating across those domains to glean the critical insights that propel the business.
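
As a small illustration, again hypothetical, with made-up stores and fields, consider a cross-domain question like "do error-prone users churn?" With siloed stores, answering it requires ad hoc glue code to join data that was never designed to be queried together:

```python
# Two domain silos that cannot be queried together directly.
# All stores and fields are made up for illustration.
telemetry_db = [{"user_id": 7, "error_count": 42}]
sales_db = [{"user_id": 7, "churned": True}]

# The "join" happens in one-off glue code rather than in a shared platform.
errors_by_user = {row["user_id"]: row for row in telemetry_db}
correlated = [
    {**sale, **errors_by_user.get(sale["user_id"], {})}
    for sale in sales_db
]
print(correlated)  # [{'user_id': 7, 'churned': True, 'error_count': 42}]
```

Multiply this by every pair of domains and every ad hoc question, and the correlation work quickly becomes unsustainable.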

Next is how to effectively enable the people responsible for the engineering lifecycle of data, from creation to consumption. The concept of data engineering isn't new, but it has become challenging given modern cloud architectures, growing data volumes, and rising expectations for the data. In the everyday engineering world, arguably few people know how to work with data at cloud scale. Working with data is hard enough without throwing in the myriad technologies underlying the pipeline architecture. Problems with the data, the pipeline, and the technologies, combined with the necessary skill set and complex troubleshooting tasks, are a tall order for any one person to own. And I haven't even mentioned security, compliance, and data governance: the scary bogeymen in the room that can paralyze decision-making for data use cases.

There are also the challenges of team scale. The larger the enterprise, the greater the need for technical specialization in one or more of the areas of expertise required to work with data. This leads to entire teams that uniquely understand their piece of the pipeline ecosystem while being frustrated by their lack of awareness of data problems elsewhere in the pipeline. When issues arise with the pipeline, the data, or a correlation, who owns which piece? At a smaller enterprise scale, without all the specialized groups, it is hard to hire people who understand all the challenges and have the necessary skills to work with data. Whether the team is large or small, each brings its own unique challenges.

Above all of these challenges, there is the fact that people, data, and technology alone do not ensure better decisions unless one more component exists to support them all: a data-driven culture.

Every company desires to be more data-driven, but a new approach to data and pipeline architectures, and the people operating inside them, needs help and support. In recent memory, the DevOps and SRE movement required a similar cultural shift to implement the changes that support distributed architectures. Likewise, companies need a culture shift to achieve the success they desire with their data.

While these challenges can feel daunting to the teams and individuals who live them daily, there is hope. It comes primarily from the companies, and even individuals, who have addressed data problems with new patterns, architectures, and norms that are raising expectations of what it means to work with data. In my next article, I'll cover what is happening in the customer, technology, and startup worlds to address the challenges of working with data.
