Prometheus Operator — How to monitor an external service
In this hands-on guide we will look at how to deploy the Prometheus Operator into a Kubernetes cluster and how to add an external service to Prometheus' target list.
During my last project, we decided to use the Prometheus Operator as our monitoring and alerting tool. Our app lives in a Kubernetes cluster, but in addition to that we own an external service: a GPU machine. Kubernetes is not aware of this service at all; the relevant services connect to it with HTTP requests. I want to share my experience with the Prometheus Operator and how to customize it to watch over any external service.
What is Prometheus
Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud. Since its inception in 2012, many companies and organizations have adopted Prometheus, and the project has a very active developer and user community. It is now a standalone open source project and maintained independently of any company.
Prometheus has become the standard tool for monitoring and alerting in Kubernetes and Docker world. It provides by far the most detailed and actionable metrics and analysis. In the latest major release of 2.0 version, the performance of Prometheus improved significantly and now Prometheus performs well under heavy loads and bursts. In addition to that, you get all of the benefits of a world-leading open source project. Prometheus is free at the point of use and covers many use cases with ease.
Prometheus Operator
In late 2016, CoreOS introduced the Operator pattern and released the Prometheus Operator as a working example of the pattern. The Prometheus Operator automatically creates and manages Prometheus monitoring instances.
The mission of the Prometheus Operator is to make running Prometheus on top of Kubernetes as easy as possible, while preserving configurability as well as making the configuration Kubernetes native. https://coreos.com/operators/prometheus/docs/latest/user-guides/getting-started.html
The Prometheus Operator's purpose is to make our lives easier when it comes to deployment and maintenance.
How does it work?
In order to understand the problem, we first need to learn how the Prometheus Operator works.
After we have successfully deployed the Prometheus Operator we should see some new CRDs (Custom Resource Definitions):
- Prometheus, which defines a desired Prometheus deployment.
- ServiceMonitor, which declaratively specifies how groups of services should be monitored. The Operator automatically generates Prometheus scrape configuration based on the definition.
- Alertmanager, which defines a desired Alertmanager deployment.
When a new version of your service is deployed, a new pod is created. Prometheus watches the Kubernetes API, so when it detects this kind of change it creates a new set of scrape configuration for the new service (pod).
ServiceMonitor
The Prometheus Operator uses a CRD named ServiceMonitor to abstract the configuration of scrape targets.
Here is an example of a ServiceMonitor:
apiVersion: monitoring.coreos.com/v1alpha1
kind: ServiceMonitor
metadata:
  name: frontend
  labels:
    tier: frontend
spec:
  selector:
    matchLabels:
      tier: frontend
  endpoints:
  - port: web # works for different port numbers as long as the name matches
    interval: 10s # scrape the endpoint every 10 seconds
This merely defines how a set of services should be monitored. We now need to define a Prometheus instance that includes this ServiceMonitor in its configuration:
apiVersion: monitoring.coreos.com/v1alpha1
kind: Prometheus
metadata:
  name: prometheus-frontend
  labels:
    prometheus: frontend
spec:
  version: v1.3.0
  # Define that all ServiceMonitor TPRs with the label `tier = frontend` should be included
  # in the server's configuration.
  serviceMonitors:
  - selector:
      matchLabels:
        tier: frontend
Now Prometheus will monitor each service with the label tier: frontend.
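For context, a minimal Service that this ServiceMonitor would match might look like the following sketch (the name, the app: frontend pod selector, and the port number are placeholders, not taken from a real setup):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: frontend            # hypothetical name, for illustration only
  labels:
    tier: frontend          # matched by the ServiceMonitor's matchLabels
spec:
  selector:
    app: frontend           # placeholder label for the pods backing this Service
  ports:
  - name: web               # the ServiceMonitor scrapes this port by name
    port: 8080
```

Note that the ServiceMonitor selects the Service by label and refers to the port by its name, so the port number itself can differ between services.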
The Problem
As I described, we want to monitor an external service. On this GPU machine I launched a node-exporter:
docker run -d -p 9100:9100 prom/node-exporter
We want to send this node-exporter’s metrics to our Prometheus.
How can we create a ServiceMonitor for a service that has neither a Pod nor a Service?
In order to solve this I decided to dig deeper into how Kubernetes handles the Service-to-Pod relationship.
At the Services page in the official Kubernetes documentation I found this:
For Kubernetes-native applications, Kubernetes offers a simple Endpoints API that is updated whenever the set of Pods in a Service changes. For non-native applications, Kubernetes offers a virtual-IP-based bridge to Services which redirects to the backend Pods.
That's the solution I was looking for! I need to create a custom Endpoints object that defines my external service, together with a matching Service and, as the final piece, a ServiceMonitor definition, so Prometheus will add it to its target list.
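Sketched as a skeleton, the three resources line up by name and label like this (all names and the IP are placeholders; note that the approach shown in the Kubernetes docs is a selector-less Service, while the concrete manifests later in this article use an ExternalName Service instead):

```yaml
# 1. A manual Endpoints object pointing at the external machine
apiVersion: v1
kind: Endpoints
metadata:
  name: my-external         # must match the Service name below
subsets:
- addresses:
  - ip: 10.0.0.1            # placeholder external IP
  ports:
  - name: metrics
    port: 9100
---
# 2. A Service with no pod selector, so Kubernetes leaves our manual Endpoints alone
apiVersion: v1
kind: Service
metadata:
  name: my-external         # same name ties it to the Endpoints object above
  labels:
    k8s-app: my-external
spec:
  ports:
  - name: metrics
    port: 9100
---
# 3. A ServiceMonitor that selects the Service by label
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-external
spec:
  selector:
    matchLabels:
      k8s-app: my-external
  endpoints:
  - port: metrics           # refers to the named port above
```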
Installing Prometheus Operator
Prerequisite:
- Basic knowledge of Kubernetes commands and components
- A working Kubernetes cluster
- Helm deployed
We are ready to get our hands dirty…
idob ~(☸|kube.prometheus:default):
▶ helm repo add coreos https://s3-eu-west-1.amazonaws.com/coreos-charts/stable/
idob ~(☸|kube.prometheus:default):
▶ helm install coreos/prometheus-operator --name prometheus-operator --namespace monitoring
OK, so far we have installed the Prometheus Operator and its CRDs into our cluster.
Now let’s actually deploy Prometheus, Alertmanager and Grafana.
TIP: When I work with large Helm charts I prefer to create separate values.yaml files that contain all of my custom changes. This makes later changes and modifications easier for me and my colleagues.
idob ~(☸|kube.prometheus:default):
▶ helm install coreos/kube-prometheus --name kube-prometheus \
-f my_changes/prometheus.yaml \
-f my_changes/grafana.yaml \
-f my_changes/alertmanager.yaml
That’s it, easy right?
To check that everything is working, list the pods; you should see something like this:
idob ~(☸|kube.prometheus:default):
▶ k -n monitoring get po
NAME READY STATUS RESTARTS AGE
alertmanager-kube-prometheus-0 2/2 Running 0 1h
kube-prometheus-exporter-kube-state-68dbb4f7c9-tr6rp 2/2 Running 0 1h
kube-prometheus-exporter-node-bqcj4 1/1 Running 0 1h
kube-prometheus-exporter-node-jmcq2 1/1 Running 0 1h
kube-prometheus-exporter-node-qnzsn 1/1 Running 0 1h
kube-prometheus-exporter-node-v4wn8 1/1 Running 0 1h
kube-prometheus-exporter-node-x5226 1/1 Running 0 1h
kube-prometheus-exporter-node-z996c 1/1 Running 0 1h
kube-prometheus-grafana-54c96ffc77-tjl6g 2/2 Running 0 1h
prometheus-kube-prometheus-0 2/2 Running 0 1h
prometheus-operator-1591343780-5vb5q 1/1 Running 0 1h
Let’s visit Prometheus UI and take a look at the Targets page:
idob ~(☸|kube.prometheus:default):
▶ k -n monitoring port-forward prometheus-kube-prometheus-0 9090
Forwarding from 127.0.0.1:9090 -> 9090
in the browser:
We can see a bunch of targets that were already defined by default; our goal is to add our new GPU target.
We need to find out which label the current Prometheus instance is looking for, and use it. (We could create a new Prometheus instance and configure it to watch for our label only, but I think that is overhead for just one more target.)
idob ~(☸|kube.prometheus:default):
▶ k -n monitoring get prometheus kube-prometheus -o yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
labels:
app: prometheus
chart: prometheus-0.0.14
heritage: Tiller
prometheus: kube-prometheus
release: kube-prometheus
name: kube-prometheus
namespace: monitoring
spec:
...
baseImage: quay.io/prometheus/prometheus
serviceMonitorSelector:
matchLabels:
prometheus: kube-prometheus # <--- BOOM
....
All set, we are ready to create the necessary resources for our target.
Endpoints
apiVersion: v1
kind: Endpoints
metadata:
  name: gpu-metrics
  namespace: monitoring   # same namespace as the Service and the ServiceMonitor
  labels:
    k8s-app: gpu-metrics
subsets:
- addresses:
  - ip: <gpu-machine-ip>
  ports:
  - name: metrics
    port: 9100
    protocol: TCP
As we decided, we are creating our very own static Endpoints object. We gave it the machine IP, the port, and the label k8s-app: gpu-metrics, which describes our GPU service only.
Service
apiVersion: v1
kind: Service
metadata:
  name: gpu-metrics       # must match the name of the Endpoints object so Kubernetes associates them
  namespace: monitoring
  labels:
    k8s-app: gpu-metrics
spec:
  type: ExternalName
  externalName: <gpu-machine-ip>
  clusterIP: ""
  ports:
  - name: metrics
    port: 9100
    protocol: TCP
    targetPort: 9100
ServiceMonitor
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: gpu-metrics-sm
  labels:
    k8s-app: gpu-metrics
    prometheus: kube-prometheus
spec:
  selector:
    matchLabels:
      k8s-app: gpu-metrics
  namespaceSelector:
    matchNames:
    - monitoring
  endpoints:
  - port: metrics
    interval: 10s
    honorLabels: true
The important part here is the labels: we must assign the label prometheus: kube-prometheus so this Prometheus server will pick up the target, and the k8s-app: gpu-metrics label in the matchLabels section so the ServiceMonitor will point to our GPU exporter only.
Let's apply them all:
idob ~(☸|kube.prometheus:default):
▶ k apply -f gpu-exporter-ep.yaml \
-f gpu-exporter-svc.yaml \
-f gpu-exporter-sm.yaml
Now head over to the Prometheus UI, and if we look at the Targets page we should see our GPU machine in the list:
That's it. As you can see, it is fairly easy to deploy the Prometheus Operator, and I hope it is now just as easy to monitor all your services, even the ones that live outside your Kubernetes cluster. From my experience the Prometheus Operator works flawlessly, and I highly recommend using it.
I hope you have enjoyed it. Please do not hesitate to give feedback and share your own lessons.