[Containers] Kubernetes Infrastructure Monitoring with Datadog [Monitoring]

This is Ohara from the Technical Sales Department.

This article discusses infrastructure monitoring in a Kubernetes environment using Datadog's monitoring tools. It addresses the question, "Why is an integrated monitoring tool necessary for a dynamic and variable infrastructure environment like Kubernetes?" and outlines Datadog's features and key points.
(Information is current as of March 2022.)

Challenges of infrastructure monitoring for Kubernetes environments

Adoption of orchestration systems like Kubernetes is increasing as a way to improve infrastructure scalability and fault tolerance. However, unlike traditional environments made up only of long-lived static hosts such as virtual or physical machines, dynamic and complex environments like Kubernetes call for an integrated monitoring tool like Datadog that can provide real-time visibility into hosts, containers, applications, and the Kubernetes environment as a whole.

● Increased number of components to monitor
: In traditional host-centric infrastructure, the main layers to monitor are the "application" and the "host running the application." In orchestration environments, a new abstraction layer is added, and to comprehensively track the infrastructure, it is necessary to monitor containers and Kubernetes itself.

● Distributed applications are constantly moving
: Kubernetes constantly moves Pods between hosts and scales them up and down to meet demand. To properly understand the state of your applications, you need to monitor all Pods and the applications running within them. However, because Kubernetes schedules workloads automatically, it is difficult to keep track of where these Pods are actually running.

● Tags and labels are essential for continuous visibility
: Because typical Kubernetes clusters have many dynamic/mutable elements, tags and labels are the only reliable way to identify Pods and the applications within them. Without labels and tags, it would be nearly impossible to aggregate or interpret performance data from a constantly changing Kubernetes infrastructure.
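To make the point about tags concrete, here is a minimal sketch of aggregating metric samples by a stable label instead of by Pod name or host. The sample data, the tag keys, and the `sum_by_tag` helper are all hypothetical and are not part of any Datadog API; they only illustrate why a label such as `app` stays useful while Pods move between nodes.

```python
from collections import defaultdict

# Hypothetical samples: (tags, CPU usage in millicores).
# Pod names and nodes change as Kubernetes reschedules workloads,
# but the "app" label stays the same.
samples = [
    ({"app": "checkout", "pod": "checkout-7d9f-abcde", "node": "node-1"}, 120),
    ({"app": "checkout", "pod": "checkout-7d9f-fghij", "node": "node-3"}, 80),
    ({"app": "search", "pod": "search-5c6b-klmno", "node": "node-1"}, 200),
]


def sum_by_tag(samples, key):
    """Sum sample values grouped by the value of one tag key."""
    totals = defaultdict(int)
    for tags, value in samples:
        # Samples that lack the tag fall into a single "untagged" bucket.
        totals[tags.get(key, "untagged")] += value
    return dict(totals)


cpu_by_app = sum_by_tag(samples, "app")
cpu_by_node = sum_by_tag(samples, "node")
```

Grouping by `app` gives one figure per application no matter how many Pods back it or where they run, while grouping by `node` answers a different question from the same data.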

Monitor Kubernetes platform environments at any scale

Kubernetes clusters run on a variety of platforms, and Datadog's 400+ pre-built integrations for all major cloud providers let you monitor the health and performance of all your containerized applications as they come online, regardless of the platform they run on behind the scenes.

Whether your organization chooses a fully managed platform or hosts its own clusters with Rancher, OpenShift, or Anthos, Datadog brings all of your Kubernetes infrastructure and application data together in a single, unified platform, from cluster status and low-level resource metrics to distributed traces and logs.

Datadog automatically enriches your data with tags from Kubernetes, Docker, and cloud providers, making it easy to investigate events as they occur. Whether you're running dozens or thousands of nodes, Datadog provides deep visibility into your Kubernetes clusters with minimal setup, enabling you to safely build, deploy, and scale your container environments.

All your Kubernetes data in one place

Datadog provides visibility into what's happening at every layer of your Kubernetes environment. Using a DaemonSet or the Datadog Operator, you can easily deploy the Datadog Agent to every node in your cluster. With Datadog's Kubernetes integration, you can:

◆ Maintain a healthy control plane

● Track each part of the control plane
: Monitor the health and performance of all components of the control plane, including the scheduler, API server, and controller manager. Maintaining a healthy control plane allows for proper scheduling and orchestration of workloads, ensuring the cluster runs smoothly.

● Set up automatic alerts
: Detect and resolve critical control plane issues, such as an unusual surge in non-200 HTTP response codes, before they affect customers.
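As a rough illustration of the kind of condition such an alert evaluates, the sketch below computes the share of non-200 responses in a window and compares it with a threshold. The function names, the input shape, and the 5% threshold are assumptions made for this example; this is not how Datadog monitors are defined or evaluated internally.

```python
def non_200_ratio(counts_by_status):
    """Return the fraction of responses whose status code is not 200."""
    total = sum(counts_by_status.values())
    if total == 0:
        return 0.0  # no traffic in the window, nothing to alert on
    non_200 = sum(n for code, n in counts_by_status.items() if code != 200)
    return non_200 / total


def should_alert(counts_by_status, threshold=0.05):
    """True when the non-200 share exceeds the (assumed) threshold."""
    return non_200_ratio(counts_by_status) > threshold


# Hypothetical API server response counts for one evaluation window.
healthy_window = {200: 980, 404: 15, 500: 5}
degraded_window = {200: 800, 429: 150, 500: 50}
```

In practice a comparison against a recent baseline is usually more useful than a fixed threshold, since "unusual" depends on what is normal for the cluster.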

◆ Troubleshoot Kubernetes issues

● Full-stack visibility of your Kubernetes environment
: Seamlessly navigate between Kubernetes workload and application metrics, logs, and distributed traces to quickly troubleshoot performance issues. Visualize data in real time with customizable, easy-to-use dashboards.

● Analyze Kubernetes audit logs
: Troubleshoot API authentication issues that may affect user and service access to the cluster.

● Drill-down with tags
: Datadog automatically collects tags from Kubernetes, Docker, and cloud providers, making it easy to sort, filter, and aggregate data. You can quickly narrow down the scope of an issue by region, container image, Pod name, or other categories, reducing the average time to resolution.

◆ Automatically detect services wherever they run

● Dynamically monitor orchestrated services
: Datadog detects changes within the cluster and automatically starts collecting data from various cluster components (such as the Kubernetes API server) and common infrastructure technologies (such as Apache Tomcat and Redis) without manual setup. You can also define custom configuration templates for Agent checks and specify which containers should be monitored by each check.
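A custom configuration template of this kind is commonly attached to a Pod as annotations. The sketch below builds such an annotation set in Python, assuming the `ad.datadoghq.com/<container>.check_names`, `.init_configs`, and `.instances` annotation keys and the `%%host%%` template variable as described in Datadog's Autodiscovery documentation. The helper name and the Redis values are illustrative, so please check the current documentation for the format your Agent version expects.

```python
import json


def autodiscovery_annotations(container, check_name, instance):
    """Build Pod annotations for one Agent check on one container.

    Each annotation value is a JSON-encoded list, following the
    annotation-based template format (assumed here, see lead-in).
    """
    prefix = f"ad.datadoghq.com/{container}"
    return {
        f"{prefix}.check_names": json.dumps([check_name]),
        f"{prefix}.init_configs": json.dumps([{}]),
        f"{prefix}.instances": json.dumps([instance]),
    }


# Illustrative values: a container named "redis" monitored by the Redis check.
# "%%host%%" is a template variable resolved to the container's IP at runtime.
annotations = autodiscovery_annotations(
    container="redis",
    check_name="redisdb",
    instance={"host": "%%host%%", "port": "6379"},
)
```

The resulting dictionary would go under `metadata.annotations` in the Pod template, which ties the check to the container name rather than to a fixed host.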

◆ Auto-scale workloads using any metric

● Deliver high-quality customer experiences even in large-scale environments
: Using Datadog with Kubernetes' Horizontal Pod Autoscaler helps maintain application availability even during unexpected traffic spikes. You can scale your workloads based on any metric you monitor with Datadog, from integration-specific metrics (such as MySQL query throughput) to custom business metrics (e.g., daily page views).
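For reference, the Horizontal Pod Autoscaler derives the replica count from the ratio between the current and target metric values. The sketch below reproduces that basic calculation as documented for Kubernetes; the function name, the replica bounds, and the example numbers are assumptions, and details of the real controller such as the tolerance band and stabilization windows are left out.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10):
    """Basic HPA calculation: ceil(current_replicas * current / target),
    clamped to the configured minimum and maximum replica counts."""
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    raw = math.ceil(current_replicas * current_value / target_value)
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical external metric: queries per second per Pod, target of 100.
scale_up = desired_replicas(4, current_value=150, target_value=100)
scale_down = desired_replicas(4, current_value=50, target_value=100)
capped = desired_replicas(4, current_value=1000, target_value=100)
```

Because the calculation depends only on the metric value and its target, the same logic applies whether the metric is CPU usage, MySQL query throughput, or a custom business metric.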

Summary

In a dynamic and variable infrastructure environment like Kubernetes, operating with traditional monitoring tools alone can be difficult, so we recommend introducing an integrated monitoring tool like Datadog to improve operational performance.

Although this article was written from the perspective of Kubernetes container environments, some of the content also applies to operating auto-scaled instances in cloud environments such as AWS, for example Amazon EC2 with Auto Scaling. We hope you find it useful.


About the author

Ohara

He started his career in the telecommunications industry as a salesperson responsible for the implementation of IT products such as corporate network services, office equipment, and groupware.

He then worked at a system integrator-affiliated data center company as a pre-sales engineer for physical servers and hosting services, and as a customer engineer for SaaS-based SFA/CRM and B2B e-commerce, before joining Beyond, where he currently works.

He is currently stationed in Shenzhen, China, where his daily routine includes watching Chinese dramas and Bilibili.

Qualifications: Bookkeeping Level 2