Deploying Prometheus and Grafana for Observability on a Minikube Cluster

VAIBHAV HARIRAMANI
10 min read · Jun 6, 2024


Introduction

In today’s rapidly evolving software development landscape, maintaining observability is critical. Observability helps ensure that your applications and services are performing optimally and provides insights into issues that arise. In this blog, we will explore how to deploy Prometheus and Grafana on a Minikube cluster using DaemonSets, providing a robust monitoring and alerting system for your Kubernetes environment.

Prometheus is an open-source systems monitoring and alerting toolkit, while Grafana is an open-source platform for monitoring and observability that integrates with Prometheus for data visualization. Deploying these tools in a Kubernetes cluster allows for comprehensive monitoring of the cluster’s health and performance.

Prerequisites

Before we dive into the deployment process, ensure you have the following prerequisites:

  1. A running Kubernetes cluster (Minikube in this case).
  2. kubectl installed and configured to interact with your cluster.
  3. Helm, the package manager for Kubernetes, installed.

Step-by-Step Deployment

Step 1: Create a Namespace

Namespaces in Kubernetes provide a way to divide cluster resources between multiple users. We’ll create a namespace for our monitoring tools.

kubectl create namespace monitoring

Step 2: Prometheus Configuration

Create a configuration file for Prometheus. This configuration will define how Prometheus scrapes metrics from the nodes in the cluster.

Create a file named prometheus-config.yaml with the following content:

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-server-conf
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      evaluation_interval: 15s
    scrape_configs:
      - job_name: 'prometheus'
        static_configs:
          - targets: ['localhost:9090']

Apply the configuration:

kubectl apply -f prometheus-config.yaml -n monitoring
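Before moving on, it is worth a quick sanity check that the ConfigMap landed and that the rendered prometheus.yml looks right (the resource names match the manifest above):

```shell
# List the ConfigMap in the monitoring namespace
kubectl get configmap prometheus-server-conf -n monitoring

# Inspect the rendered prometheus.yml contents
kubectl describe configmap prometheus-server-conf -n monitoring
```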

Step 3: Prometheus Deployment

Next, create a deployment file for Prometheus. This deployment will use the ConfigMap we just created.

Create a file named prometheus-deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-server
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus-server
  template:
    metadata:
      labels:
        app: prometheus-server
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: config-volume
              mountPath: /etc/prometheus
      volumes:
        - name: config-volume
          configMap:
            name: prometheus-server-conf
            defaultMode: 420

Apply the deployment:

kubectl apply -f prometheus-deployment.yaml -n monitoring
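After applying, you can watch the rollout and confirm the Prometheus pod comes up healthy (the names and labels match the Deployment manifest above):

```shell
# Wait until the Deployment reports all replicas available
kubectl rollout status deployment/prometheus-server -n monitoring

# Confirm the pod is in the Running state
kubectl get pods -n monitoring -l app=prometheus-server
```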

Step 4: Expose Prometheus as a Service

To access Prometheus, we need to expose it as a service.

Create a file named prometheus-service.yaml:

apiVersion: v1
kind: Service
metadata:
  name: prometheus-service
  namespace: monitoring
spec:
  selector:
    app: prometheus-server
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9090
  type: LoadBalancer

Apply the service:

kubectl apply -f prometheus-service.yaml -n monitoring
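Note that on Minikube a LoadBalancer service does not get an external IP on its own. Two common ways to reach the Prometheus UI locally (either one works; pick whichever you prefer):

```shell
# Option 1: let Minikube open a tunnel and print a URL for the service
minikube service prometheus-service -n monitoring

# Option 2: forward a local port to the service
kubectl port-forward svc/prometheus-service 9090:80 -n monitoring
# then browse to http://localhost:9090
```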

Step 5: Install Grafana

We’ll use Helm to install Grafana. First, add the Grafana Helm repository and update it.

helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

Install Grafana in the monitoring namespace:

helm install grafana grafana/grafana --namespace monitoring

Check the status of the pods to ensure Grafana is running:

kubectl get pods -n monitoring
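The Grafana Helm chart generates an admin password and stores it in a Secret named grafana. You will need it to log in later, and it can be retrieved and decoded like this (the default username is admin):

```shell
# Read the auto-generated admin password from the chart's Secret
kubectl get secret grafana -n monitoring \
  -o jsonpath="{.data.admin-password}" | base64 --decode; echo
```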

Step 6: Access Grafana

With Grafana installed on the Kubernetes cluster, the chart creates a ClusterIP service listening on port 80, which is reachable only from inside the cluster. To list the Kubernetes services associated with Grafana, run:

kubectl get service -n monitoring

Output:

NAME      TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
grafana   ClusterIP   10.104.22.18   <none>        80/TCP    4m6s

Exposing the Grafana Kubernetes Service

To expose the Grafana service outside the cluster, create a NodePort service for it:

kubectl expose service grafana -n monitoring --type=NodePort --target-port=3000 --name=grafana-ext

This creates a new service of type NodePort, enabling access to Grafana from outside the Kubernetes cluster: Kubernetes assigns a node port that forwards to Grafana's container port 3000. The command below prints a URL you can use to open the Grafana dashboard:

minikube service grafana-ext -n monitoring

Grafana UI:

Step 7: Configure Grafana to Use Prometheus

  • Go back to the Grafana home page and open Dashboards from the left-hand menu.
  • Click New -> Import.
  • Enter the ID of the dashboard you want to import and click “Load”.
  • Select a Prometheus data source and click “Import”.
  • Grafana will then launch the dashboard shown below:

You can use this dashboard to monitor and observe your Kubernetes cluster. It displays the following cluster metrics:

  • Network I/O pressure.
  • Cluster CPU usage.
  • Cluster Memory usage.
  • Cluster filesystem usage.
  • Pods CPU usage.

Once logged in, add Prometheus as a data source in Grafana:

  1. Go to the Grafana dashboard.
  2. Click the gear icon (Configuration) and select “Data Sources”.
  3. Click “Add data source” and select “Prometheus”.
  4. Set the URL to http://prometheus-service:80.
  5. Click “Save & Test”.

Conclusion

By following these steps, you have successfully deployed Prometheus and Grafana on a Minikube cluster, using a Kubernetes Deployment for Prometheus and Helm for Grafana. This setup provides a powerful observability stack that allows you to monitor your Kubernetes cluster’s health, performance, and resource utilization effectively.

What are Kubernetes metrics?

Kubernetes metrics help you proactively gain insight into your clusters, nodes, pods, and applications. They present a general overview of the Kubernetes clusters to DevOps and IT operations — making it easy to keep track of containers, detect errors, and fix them promptly. When utilized, metrics make it easy to manage your cluster’s health and performance reliably.

Why is Monitoring Kubernetes essential?

Kubernetes is a complicated system that handles several containers, nodes, and services, which may make detecting and troubleshooting issues challenging. Monitoring Kubernetes provides visibility into the health, performance, and availability of an application running on a Kubernetes cluster, thereby making it easier to:

  • Optimize resources and cost: Monitoring Kubernetes can help maximize resource consumption by detecting and reallocating underutilized resources. It also offers resource utilization data that can be used for cost estimation and chargeback. Hence, you can ensure that individual containers, pods, and even namespaces use the underlying resources well.
  • Increase troubleshooting and reliability: The complex nature of Kubernetes makes identifying the primary cause of faults tricky when they occur. Monitoring gives extensive insights into the behavior of your Kubernetes cluster, making it easier to identify and fix possible bottlenecks. You can also take preventive steps by observing the cluster’s performance metrics.
  • Improve security: By monitoring system activity and logs in Kubernetes, you can identify vulnerabilities and potential security issues before attackers can exploit them. Being able to track what jobs are running and where they are executing is critical in detecting unusual activity and signaling teams in cases of a security breach or DOS attack. Monitoring might not fix the errors in question, but it can alert you to respond fast and efficiently to limit the consequences of a security breach.

Kubernetes cluster metrics

To gain visibility into the health status of your Kubernetes cluster, you need to track various metrics like the number of running pods, containers, nodes, network bandwidth, memory, and CPU utilization. Keeping track of these metrics enables you to determine the total resources utilized by your cluster and whether or not your nodes are functioning optimally and at their full capacity.
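A quick way to see some of these cluster-level numbers from the command line is the Kubernetes metrics API. On Minikube this assumes the metrics-server addon is enabled first:

```shell
# Enable the metrics API on Minikube (one-time setup)
minikube addons enable metrics-server

# Node-level CPU and memory usage
kubectl top nodes

# Per-pod usage across all namespaces
kubectl top pods --all-namespaces
```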

Kubernetes pod metrics

Pod metrics are critical for understanding resource utilization and allocation to guarantee the pods and containers operate without causing performance issues for applications. They aid in determining whether or not an application should scale horizontally in response to demand. To gain insight on this, you should consider monitoring some pod-level metrics (Kubernetes, application-specific, and pod container metrics) to help detect and resolve any issues of over- or under-provisioning.

Control plane metrics

The control plane is responsible for managing the configuration and state of the Kubernetes cluster. Control plane metrics are measurements and data gathered from a distributed system’s control plane components like API servers, schedulers, etcd, and control managers. To confirm that the control plane functions effectively, you should consider tracking key metrics such as availability, latency, throughput, etc.

Kubernetes nodes metrics

A Kubernetes node consists of a defined CPU and memory capacity that can be used by pods that are attached to it. It’s vital to include node-specific metrics to monitor system load, memory utilization, CPU, disk usage, network usage, and node network traffic. Monitoring Kubernetes node metrics can help identify and troubleshoot potential issues or performance bottlenecks on individual nodes. Available memory and disk space figures can then guide decisions about increasing or decreasing the number and size of your nodes.

Application metrics

There are situations where a pod could be running and operating as expected. However, the underlying binary/app running within the pod is not as stable as intended. Hence, you may consider monitoring the RED metrics (request rate, error rate, and duration) to help you evaluate the performance and availability of applications running in pods.
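As an illustration of how such metrics are typically pulled out of Prometheus, a RED-style request-rate query can be run against its HTTP API. This is a sketch: it assumes Prometheus is port-forwarded to localhost:9090 (as shown earlier for the Prometheus service) and that the application exposes a counter named http_requests_total — a conventional, but application-specific, metric name:

```shell
# Per-second request rate over the last 5 minutes (RED: Rate)
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=rate(http_requests_total[5m])'
```

The same endpoint accepts any PromQL expression, so error rate and duration queries follow the same pattern.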

Which metrics impact the performance of Kubernetes clusters?

Here are some metrics that impact the performance of Kubernetes clusters:

  • CPU and memory utilization metrics: In a Kubernetes cluster, high resource utilization can lead to performance degradation and increased latency, while underutilization of resources can lead to missed opportunities for scaling and wasted resources. To maximize Kubernetes cluster performance, it is critical to regularly monitor resource use, CPU, and memory allocation metrics to determine if they are being utilized efficiently or if the resources allocated to the cluster need to be adjusted.
  • Networking metrics: Networking metrics play a crucial role in the performance of Kubernetes clusters as they directly affect how data is transmitted between nodes and pods. Kubernetes has several built-in network metrics, such as network traffic, errors, and packet loss. High latency, low throughput, and congestion are often indicators of network bottlenecks or other issues that result in slow application performance, poor user experience, and even system failures. Hence, monitoring key metrics like sent and received latency, throughput, and data packets is essential.
  • Pod deployment metrics: Pods have a number of health metrics, like memory, disk, and the number of running, pending, and failed pods. These metrics affect how well a Kubernetes cluster works. Pod deployment metrics provide information on how many instances a pod currently has and how many were expected. It considers parameters like replicas, pod status, pod restarts, rollout time, and resource use. If a pod frequently restarts or generates errors, it may be an indicator that there are issues with the application code, dependencies, or infrastructure. These problems can slow down application performance and negatively impact the user experience.
  • Application-specific metrics: When application metrics are not properly monitored or analyzed, it can lead to increased network latency, resource constraints, and potential downtime. For instance, if an application is experiencing high network traffic, it can result in increased network latency, which can lead to slow response times, affect the user experience, or consume network resources and result in resource constraints. Application-specific metrics are critical for providing insights into how the application interacts with the cluster components and can help identify performance bottlenecks. These metrics can provide insights into response times, error rates, and other key performance indicators that are specific to the application.

Depending on the applications running in the cluster, there may be additional metrics that are important to monitor. These could include database connections, error rate, queue size, and cache hit rate, as well as your application’s uptime and response times.

Photo credit: Kubernetes documentation

How is Kubernetes’ performance measured?

Measuring Kubernetes performance requires a combination of monitoring tools and metrics tailored to your specific applications and use cases. Many monitoring, logging, and alerting solutions are available for Kubernetes, such as Prometheus, Grafana, and Datadog. These technologies can assist you in collecting, analyzing, and visualizing data from your Kubernetes cluster. However, choosing the preferred solution depends on your application’s needs and expectations.

Prometheus collects and stores metrics, while Grafana provides a flexible and dynamic dashboard to visualize these metrics. Together, they form a robust monitoring and alerting system that can help you maintain the reliability and performance of your applications running on Kubernetes.

For more advanced configurations and custom dashboards, explore the extensive documentation provided by Prometheus and Grafana.

If you found this guide helpful, please share it with others and leave your comments or suggestions below!

In upcoming articles, I’ll delve into exciting topics like Ansible, Helm charts, and beyond. Your feedback and questions are invaluable, so feel free to share as I continue this learning adventure. Stay curious, and let’s keep building amazing things! 🚀

Thank You for reading

Please give 👏🏻 Claps if you like the blog.

Made with ❤️by Vaibhav Hariramani

Don’t forget to tag us

If you find this blog beneficial, don’t forget to share it with your friends and mention us as well. And don’t forget to share it on LinkedIn, Instagram, Facebook, Twitter, and GitHub.

More Resources

To learn more about these topics, you can refer to some of my other articles:

Do Checkout My other Blogs

Do find time check out my other articles and further readings in the reference section. Kindly remember to follow me so as to get notified of my publications.

Do Checkout My Youtube channel

Follow me

on LinkedIn, Instagram, Facebook, Twitter, GitHub

Happy coding ❤️ .


VAIBHAV HARIRAMANI

Hi there! I am Vaibhav Hariramani, a travel & tech blogger who loves to seek out new technologies and experience cool stuff.