5 Things Developers Need to Know About Kubernetes Management

4 MIN READ


This blog post was originally published on The New Stack on September 13, 2021. View the original here.

Kubernetes management can be daunting for developers who don’t have specialized understanding of the orchestration technology. Learning Kubernetes takes practice and time, a precious commodity for devs who are under pressure to deliver new applications. This post provides direction on what you need to know and what you can skip to take advantage of Kubernetes. Let’s start with five things you need to know.

1. Kubernetes Naming Conventions

As a developer, understanding that objects created in Kubernetes must follow specific naming conventions will help you interpret insights more quickly. Knowing how these conventions work, and why something is named a specific way, helps you make correlations when something might be wrong with the application or underlying infrastructure. Each Kubernetes object has a name that is unique among objects of that resource type in a namespace, and a unique identifier (UID) that is unique across the entire cluster. Kubernetes.io does a great job of explaining this by saying, “You can only have one Pod named myapp-1234 within the same namespace, but you can have one Pod and one Deployment that are each named myapp-1234.” In addition, pods created by the same higher-level controller (e.g., a ReplicaSet or Deployment) inherit the same naming prefix. For example, if your Deployment is named ‘webapp,’ a corresponding ReplicaSet could be webapp-12345678, and a pod under that ReplicaSet could be webapp-12345678-abcde. This way, every pod in the namespace has a unique name, but it is easy to tell which pods share the same controller and thus the same specification.
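To make the convention concrete, here is a small sketch (a hypothetical helper, not part of any Kubernetes API) that strips the randomized suffixes from a pod name following the usual Deployment → ReplicaSet → Pod pattern:

```python
def controller_names(pod_name: str):
    """Given a pod name like 'webapp-12345678-abcde', recover the likely
    ReplicaSet and Deployment names by stripping the two generated suffixes.
    Purely illustrative of the naming convention described above."""
    parts = pod_name.rsplit("-", 2)
    if len(parts) < 3:
        return None  # name doesn't match the two-suffix pattern
    deployment, rs_suffix, _ = parts
    return {
        "deployment": deployment,
        "replicaset": f"{deployment}-{rs_suffix}",
        "pod": pod_name,
    }

print(controller_names("webapp-12345678-abcde"))
```

Splitting from the right keeps this working even when the Deployment name itself contains hyphens (e.g., ‘web-app’).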

2. How to Get Your Application to Run

New Kubernetes users should have a basic understanding of services and deployments as they factor into running applications.

Services provide a general way to route network traffic within the cluster, and in some configurations from outside it as well. They are fairly easy to define and give you human-readable configuration for where traffic goes. An example would be a service named ‘webapp’ that routes traffic to pods with the label ‘app: web.’ Any pod in the same namespace can then reach the webapp pods by making a request to the URL ‘http://webapp’. Traditionally, setting up networking involves a number of steps to ensure traffic routes to the right place; with services in Kubernetes, you give directions at a higher level and the resulting action is automated.
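As a sketch of that example (all names are illustrative), a minimal Service manifest might look like:

```yaml
# Illustrative Service: traffic sent to http://webapp inside the cluster
# is routed to any pod in the same namespace carrying the label app: web.
apiVersion: v1
kind: Service
metadata:
  name: webapp
spec:
  selector:
    app: web
  ports:
    - port: 80        # port the Service listens on
      targetPort: 8080  # port the pods actually serve on
```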

Deployments are the simplest high-level controller. For stateless applications, it is enough for developers to understand that deployments handle rolling updates from simple instructions, such as how many copies you want and which version you’re on. This ensures that your application connects to the correct endpoints; everything in between happens automatically. For more complex applications, it may be worth exploring some of the other high-level controllers, like StatefulSets and CronJobs.
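Those two instructions, how many copies and which version, map directly onto a Deployment manifest; a minimal sketch (image name is hypothetical):

```yaml
# Illustrative Deployment: three replicas of a stateless web container.
# Changing the image tag triggers an automatic rolling update.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
spec:
  replicas: 3            # how many copies you want
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web         # matches the Service selector above
    spec:
      containers:
        - name: web
          image: example/web:1.0   # which version you're on (hypothetical image)
          ports:
            - containerPort: 8080
```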

It is also helpful to understand the main principle of containers: how do I get back to a consistent state? If an update or action puts an application into a bad state, simply restarting the container resets it. The restarted container may not be on the same node, and you might not see the exact same set of interactions, but everything that matters is the same as when you started.

3. How to Get Information About Your Application

Regardless of what infrastructure you are running on, this is what most developers care about: What went wrong, where did it go wrong, how critical is it, and is it my problem? That last question may seem a bit cynical, but knowing which team and developer are best suited to solve a problem is crucial for operational efficiency. The primary questions you need answered are: How do I tell if my application is running, and what does resource consumption look like?

A helpful point to note is that running the command “kubectl get events” will list the recent events in your namespace. Unfortunately, Kubernetes doesn’t order them in any particular way by default, but you can add flags to sort and filter these queries as your knowledge and understanding improve.
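For example, the events can be sorted by time or narrowed to a single object with standard kubectl flags (these assume access to a running cluster, and the pod name is illustrative):

```
# List events for the current namespace, oldest first.
kubectl get events --sort-by=.metadata.creationTimestamp

# Narrow to the events for one pod with a field selector.
kubectl get events --field-selector involvedObject.name=webapp-12345678-abcde
```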

4. How to Spot Problems Before They Affect Your Application

One of the benefits of Kubernetes is its ability to automatically return to a consistent state, but this can mask the effects of underlying problems for developers who do not know what to look for. For this reason, it is important that developers have access to Kubernetes logs and telemetry. Metrics like CPUs or memory used become even more important for Kubernetes-based applications because they can reveal issues with applications that Kubernetes may have masked. Surfacing problems before they affect customers is crucial to maintaining positive application experiences.
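If the cluster has the Metrics Server installed, a quick way to check CPU and memory consumption is (assuming kubectl access to the cluster):

```
# Show current CPU and memory usage per pod in the namespace.
kubectl top pods

# Same numbers aggregated per node.
kubectl top nodes
```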

5. When to Investigate Problems

At its core, Kubernetes is about keeping applications running in production, so there is no direct output that explains why an app crashed or compels developers to take an active role in remediation. In most cases, Kubernetes just restarts the containers, and the problem is resolved. Thanks to the self-healing, resilient nature of Kubernetes and containers, this is rarely an immediate issue, but it can mask or hide bad application behavior, which may go unnoticed for longer.

There are times when it is worth a deeper investigation into an issue, like if it’s affecting customer experience or causing negative downstream effects to other teams. The type of service being provided and the impact that performance issues have on customers can also come into play. For instance, requirements for applications supporting defense contractors will differ dramatically from those for a grocery app.

You Can Skip Deep Understanding of the Process

Although it’s essential for developers to understand how their applications run on Kubernetes, in most cases they don’t need a deep understanding of how Kubernetes works under the hood. To a developer, Kubernetes may seem like chaos to some degree, but you can control that chaos by giving it specific directions to follow. For example, if you don’t specify where you want your pod to run, it will run anywhere that applications are allowed to run by default; alternatively, you can use taints and tolerations to tell Kubernetes not to schedule applications on specific nodes. However you configure (or don’t configure) this, you can use basic automated checks to show whether the app is healthy.
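One such basic automated check is a liveness or readiness probe on the container spec; a minimal sketch (the paths and port are illustrative):

```yaml
# Illustrative probes on a container: Kubernetes restarts the container
# if the liveness probe fails, and withholds Service traffic until the
# readiness probe succeeds.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 15
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  periodSeconds: 5
```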

When getting started with Kubernetes, don’t overcomplicate it. Look at it as a way of managing your applications with a system that automatically takes action to repair itself and return to a healthy state with little intervention. The basics outlined in this post provide a great starting point for developers to build a foundation and focus their education. As their knowledge and experience grow, they can expand their implementation of Kubernetes to be as specific and robust as needed.
