Prometheus Operator — How to monitor an external service
In this hands-on guide we will look at how to deploy the Prometheus Operator into a Kubernetes cluster and how to add an external service to Prometheus' target list.
During my last project, we decided to use the Prometheus Operator as our monitoring and alerting tool. Our app lives in a Kubernetes cluster, but in addition to that we own an external service: a GPU machine. Kubernetes is not aware of this service at all; the relevant services connect to it with HTTP requests. I want to share my experience with the Prometheus Operator and how to customize it to watch over any external service.
What is Prometheus
Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud. Since its inception in 2012, many companies and organizations have adopted Prometheus, and the project has a very active developer and user community. It is now a standalone open source project and maintained independently of any company.
Prometheus has become the standard tool for monitoring and alerting in Kubernetes and Docker world. It provides by far the most detailed and actionable metrics and analysis. In the latest major release of 2.0 version, the performance of Prometheus improved significantly and now Prometheus performs well under heavy loads and bursts. In addition to that, you get all of the benefits of a world-leading open source project. Prometheus is free at the point of use and covers many use cases with ease.
Prometheus Operator
In late 2016, CoreOS introduced the Operator pattern and released the Prometheus Operator as a working example of the pattern. The Prometheus Operator automatically creates and manages Prometheus monitoring instances.
The mission of the Prometheus Operator is to make running Prometheus on top of Kubernetes as easy as possible, while preserving configurability as well as making the configuration Kubernetes native. https://coreos.com/operators/prometheus/docs/latest/user-guides/getting-started.html
The Prometheus Operator's purpose is to make our lives easier when it comes to deployment and maintenance.
How does it work?
In order to understand the problem, we first need to learn how the Prometheus Operator works.
After we have successfully deployed the Prometheus Operator we should see some new CRDs (Custom Resource Definitions):
- Prometheus, which defines a desired Prometheus deployment.
- ServiceMonitor, which declaratively specifies how groups of services should be monitored. The Operator automatically generates Prometheus scrape configuration based on the definition.
- Alertmanager, which defines a desired Alertmanager deployment.
When a new version of your service is deployed, a new pod is created. Prometheus watches the Kubernetes API, so when it detects this kind of change it creates a new set of scrape configuration for the new service (pod).
ServiceMonitor
The Prometheus Operator uses a CRD named ServiceMonitor to abstract the configuration of scrape targets.
Here is an example of a ServiceMonitor:
apiVersion: monitoring.coreos.com/v1alpha1
kind: ServiceMonitor
metadata:
  name: frontend
  labels:
    tier: frontend
spec:
  selector:
    matchLabels:
      tier: frontend
  endpoints:
  - port: web # works for different port numbers as long as the name matches
    interval: 10s # scrape the endpoint every 10 seconds
This merely defines how a set of services should be monitored. We now need to define a Prometheus instance that includes this ServiceMonitor in its configuration:
apiVersion: monitoring.coreos.com/v1alpha1
kind: Prometheus
metadata:
  name: prometheus-frontend
  labels:
    prometheus: frontend
spec:
  version: v1.3.0
  # Define that all ServiceMonitor TPRs with the label `tier = frontend` should be included
  # in the server's configuration.
  serviceMonitors:
  - selector:
      matchLabels:
        tier: frontend
Now Prometheus will monitor each service with the label tier: frontend.
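For context, a minimal Service that this ServiceMonitor would match might look like the following sketch (the name, the app: frontend pod selector, and the port number are placeholders, not taken from a real setup):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: frontend            # hypothetical name, for illustration only
  labels:
    tier: frontend          # matched by the ServiceMonitor's matchLabels
spec:
  selector:
    app: frontend           # placeholder label for the pods backing this Service
  ports:
  - name: web               # the ServiceMonitor scrapes this port by name
    port: 8080
```

Note that the ServiceMonitor selects the Service by label and refers to the port by its name, so the port number itself can differ between services.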
The Problem
As I described, we want to monitor an external service. On this GPU machine I launched a node-exporter:
docker run -d -p 9100:9100 prom/node-exporter
We want to send this node-exporter’s metrics to our Prometheus.
How can we create a ServiceMonitor for a service that has neither a Pod nor a Service?
In order to solve this I decided to dig deeper into how Kubernetes handles the Service-to-Pod relationship.
At the Services page in the official Kubernetes documentation I found this:
For Kubernetes-native applications, Kubernetes offers a simple Endpoints API that is updated whenever the set of Pods in a Service changes. For non-native applications, Kubernetes offers a virtual-IP-based bridge to Services which redirects to the backend Pods.
That's the solution I was looking for! I need to create a custom Endpoints object that defines my external service, together with a matching Service and, as the final piece, a ServiceMonitor definition, so Prometheus will add it to its target list.
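Sketched as a skeleton, the three resources line up by name and label like this (all names and the IP are placeholders; note that the approach shown in the Kubernetes docs is a selector-less Service, while the concrete manifests later in this article use an ExternalName Service instead):

```yaml
# 1. A manual Endpoints object pointing at the external machine
apiVersion: v1
kind: Endpoints
metadata:
  name: my-external         # must match the Service name below
subsets:
- addresses:
  - ip: 10.0.0.1            # placeholder external IP
  ports:
  - name: metrics
    port: 9100
---
# 2. A Service with no pod selector, so Kubernetes leaves our manual Endpoints alone
apiVersion: v1
kind: Service
metadata:
  name: my-external         # same name ties it to the Endpoints object above
  labels:
    k8s-app: my-external
spec:
  ports:
  - name: metrics
    port: 9100
---
# 3. A ServiceMonitor that selects the Service by label
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-external
spec:
  selector:
    matchLabels:
      k8s-app: my-external
  endpoints:
  - port: metrics           # refers to the named port above
```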
Installing Prometheus Operator
Prerequisite:
- Basic knowledge of Kubernetes commands and components
- A working Kubernetes cluster
- Helm deployed
We are ready to get our hands dirty…
idob ~(☸|kube.prometheus:default):
▶ helm repo add coreos https://s3-eu-west-1.amazonaws.com/coreos-charts/stable/
idob ~(☸|kube.prometheus:default):
▶ helm install coreos/prometheus-operator --name prometheus-operator --namespace monitoring
OK, so far we have installed the Prometheus Operator and its CRDs into our cluster.
Now let’s actually deploy Prometheus, Alertmanager and Grafana.
TIP: When I work with large Helm charts I prefer to create separate values.yaml files that contain all of my custom changes. This makes later changes and modifications easier for me and my colleagues.
idob ~(☸|kube.prometheus:default):
▶ helm install coreos/kube-prometheus --name kube-prometheus \
-f my_changes/prometheus.yaml \
-f my_changes/grafana.yaml \
-f my_changes/alertmanager.yaml
That’s it, easy right?
To check that everything is working, list the pods; you should see something like this:
idob ~(☸|kube.prometheus:default):
▶ k -n monitoring get po
NAME READY STATUS RESTARTS AGE
alertmanager-kube-prometheus-0 2/2 Running 0 1h
kube-prometheus-exporter-kube-state-68dbb4f7c9-tr6rp 2/2 Running 0 1h
kube-prometheus-exporter-node-bqcj4 1/1 Running 0 1h
kube-prometheus-exporter-node-jmcq2 1/1 Running 0 1h
kube-prometheus-exporter-node-qnzsn 1/1 Running 0 1h
kube-prometheus-exporter-node-v4wn8 1/1 Running 0 1h
kube-prometheus-exporter-node-x5226 1/1 Running 0 1h
kube-prometheus-exporter-node-z996c 1/1 Running 0 1h
kube-prometheus-grafana-54c96ffc77-tjl6g 2/2 Running 0 1h
prometheus-kube-prometheus-0 2/2 Running 0 1h
prometheus-operator-1591343780-5vb5q 1/1 Running 0 1h
Let’s visit Prometheus UI and take a look at the Targets page:
idob ~(☸|kube.prometheus:default):
▶ k -n monitoring port-forward prometheus-kube-prometheus-0 9090
Forwarding from 127.0.0.1:9090 -> 9090
in the browser:
We can see a bunch of targets that were already defined by default; our goal is to add our new GPU target.
We need to find out which label the current Prometheus instance is looking for, and use it. (We could create a new Prometheus instance and configure it to watch for our label only, but I think that is overhead for just one more target.)
idob ~(☸|kube.prometheus:default):
▶ k -n monitoring get prometheus kube-prometheus -o yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
labels:
app: prometheus
chart: prometheus-0.0.14
heritage: Tiller
prometheus: kube-prometheus
release: kube-prometheus
name: kube-prometheus
namespace: monitoring
spec:
...
baseImage: quay.io/prometheus/prometheus
serviceMonitorSelector:
matchLabels:
prometheus: kube-prometheus # <--- BOOM
....
All set, we are ready to create the necessary resources for our target.
Endpoints
apiVersion: v1
kind: Endpoints
metadata:
  name: gpu-metrics
  namespace: monitoring   # same namespace as the Service and the ServiceMonitor
  labels:
    k8s-app: gpu-metrics
subsets:
- addresses:
  - ip: <gpu-machine-ip>
  ports:
  - name: metrics
    port: 9100
    protocol: TCP
As we decided, we are creating our very own static Endpoints object. We gave it the machine IP, the port, and the label k8s-app: gpu-metrics, which describes our GPU service only.
Service
apiVersion: v1
kind: Service
metadata:
  name: gpu-metrics       # must match the name of the Endpoints object so Kubernetes associates them
  namespace: monitoring
  labels:
    k8s-app: gpu-metrics
spec:
  type: ExternalName
  externalName: <gpu-machine-ip>
  clusterIP: ""
  ports:
  - name: metrics
    port: 9100
    protocol: TCP
    targetPort: 9100
ServiceMonitor
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: gpu-metrics-sm
  labels:
    k8s-app: gpu-metrics
    prometheus: kube-prometheus
spec:
  selector:
    matchLabels:
      k8s-app: gpu-metrics
  namespaceSelector:
    matchNames:
    - monitoring
  endpoints:
  - port: metrics
    interval: 10s
    honorLabels: true
The important part here is the labels: we must assign the label prometheus: kube-prometheus so this Prometheus server will pick up the target, and the k8s-app: gpu-metrics label in the matchLabels section so the ServiceMonitor will point to our GPU exporter only.
Let's apply them all:
idob ~(☸|kube.prometheus:default):
▶ k apply -f gpu-exporter-ep.yaml \
-f gpu-exporter-svc.yaml \
-f gpu-exporter-sm.yaml
Now head over to the Prometheus UI, and if we look at the Targets page we should see our GPU machine in the list:
That's it. As you can see, it is fairly easy to deploy the Prometheus Operator, and I hope it is now just as easy to monitor all your services, even the ones that live outside your Kubernetes cluster. From my experience the Prometheus Operator works flawlessly, and I highly recommend using it.
I hope you have enjoyed it. Please do not hesitate to give feedback and share your own lessons.