Monitoring with Prometheus
To enable a deeper understanding of the state of an Akri deployment and Node resource usage by Akri containers, Akri exposes metrics with Prometheus. This document will cover:
Installing Prometheus
Enabling Prometheus in Akri
Visualizing metrics with Grafana
Akri's currently exposed metrics
Exposing metrics from an Akri Broker Pod
Installing Prometheus
In order to expose Akri's metrics, Prometheus must be deployed to your cluster. If you already have Prometheus running on your cluster, you can skip this step.
Prometheus is composed of many components. Instead of manually deploying all the components, the entire kube-prometheus stack can be deployed via its Helm chart. It includes the Prometheus operator, node exporter, built-in Grafana support, and more.
Get the kube-prometheus stack Helm repo.
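A minimal sketch of this step, assuming the chart is pulled from the prometheus-community Helm repository and registered under the local alias `prometheus-community`:

```shell
# Register the prometheus-community chart repository and refresh the local index.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
```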
Install the chart, specifying the namespace you want Prometheus to run in. It does not have to be the same namespace in which you are running Akri. For example, it may be in a namespace called `monitoring`. By default, Prometheus only discovers PodMonitors within its namespace. This should be disabled by setting `podMonitorSelectorNilUsesHelmValues` to `false` so that Akri's custom PodMonitors can be discovered. Additionally, the Grafana service can be exposed to the host by making it a NodePort service. It may take a minute or so to deploy all the components.
The Prometheus dashboard can also be exposed to the host by adding `--set prometheus.service.type=NodePort`. If intending to expose metrics from a Broker Pod via a ServiceMonitor, also set `serviceMonitorSelectorNilUsesHelmValues` to `false`.
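The installation described above might look like the following sketch. The release name `prometheus` and the repository alias `prometheus-community` are assumptions, and `--create-namespace` is added here only so the command works when the `monitoring` namespace does not exist yet.

```shell
# Install the kube-prometheus stack into the "monitoring" namespace.
# - grafana.service.type=NodePort exposes Grafana on a port of the host.
# - podMonitorSelectorNilUsesHelmValues=false lets Prometheus discover
#   PodMonitors outside its own Helm release, such as Akri's.
# The release name "prometheus" is an assumption; pick any name you like.
helm install prometheus prometheus-community/kube-prometheus-stack \
    --namespace monitoring \
    --create-namespace \
    --set grafana.service.type=NodePort \
    --set prometheus.prometheusSpec.podMonitorSelectorNilUsesHelmValues=false
```

To also discover ServiceMonitors created for Broker Pods, the same command could carry `--set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false`.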
Enabling Prometheus in Akri
The Akri Controller and Agent publish metrics to port 8080 at a `/metrics` endpoint. However, these cannot be accessed by Prometheus without creating PodMonitors, which are custom resources that tell Prometheus which Pods to monitor. These components can all be automatically created and deployed via Helm by passing `--set prometheus.enabled=true` when installing Akri.
Install Akri and expose the Controller and Agent's metrics to Prometheus by running:
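A sketch of that installation, assuming the Akri chart is served from the project's GitHub Pages repository and added under the alias `akri-helm-charts`, with `akri` as the release name:

```shell
# Add the Akri chart repository (URL and alias are assumptions; check the
# Akri installation instructions for the location matching your version).
helm repo add akri-helm-charts https://project-akri.github.io/akri/

# Install Akri and have the chart create the PodMonitors for the
# Controller and Agent.
helm install akri akri-helm-charts/akri \
    --set prometheus.enabled=true
```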
Visualizing metrics with Grafana
Now that Akri's metrics are being exposed to Prometheus, they can be visualized in Grafana.
Determine the port that the Grafana Service is running on, specifying the namespace if necessary, and save it for the next step.
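One way to look up the port is sketched below. It assumes the stack was installed with the release name `prometheus` in the `monitoring` namespace, which makes the Grafana Service name `prometheus-grafana`, and that Grafana's port is the first one listed on that Service.

```shell
# Print the NodePort assigned to the Grafana Service.
# Service name and namespace are assumptions tied to the Helm release name.
kubectl get service/prometheus-grafana \
    --namespace=monitoring \
    --output=jsonpath='{.spec.ports[0].nodePort}' && echo
```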
SSH port forwarding can be used to access Grafana. Open a new terminal and enter your ssh command to access the machine running Akri and Prometheus, followed by the port forwarding request (`-L 50000:localhost:<Grafana Service port>`). This uses port 50000 on the host; feel free to change it if it is not available. Be sure to replace `<Grafana Service port>` with the port number output in the previous step.
Navigate to `http://localhost:50000/` and enter Grafana's default username `admin` and password `prom-operator`. Once logged in, the username and password can be changed in account settings. Now, you can create a Dashboard to display the Akri metrics.
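The SSH port-forwarding step above can be sketched as follows. `REMOTE_USER`, `REMOTE_HOST`, and the value given to `GRAFANA_PORT` are hypothetical placeholders; `GRAFANA_PORT` stands in for `<Grafana Service port>`.

```shell
# Hypothetical placeholders: substitute your own values.
REMOTE_USER="someuser"            # account on the machine running Akri and Prometheus
REMOTE_HOST="akri-host.example"   # address of that machine
GRAFANA_PORT="30000"              # NodePort reported for the Grafana Service

# Forward local port 50000 to the Grafana NodePort on the remote machine.
ssh -L "50000:localhost:${GRAFANA_PORT}" "${REMOTE_USER}@${REMOTE_HOST}"
```

While this session stays open, Grafana is reachable at `http://localhost:50000/` on the local machine.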
Akri's currently exposed metrics
Akri uses the Rust Prometheus client library to expose metrics. It exposes all the default process metrics, such as Agent or Controller total CPU time usage (`process_cpu_seconds_total`) and RAM usage (`process_resident_memory_bytes`), along with the following custom metrics, all of which are prefixed with `akri`.
Metric Name | Metric Type | Metric Source | Buckets |
---|---|---|---|
akri_instance_count | IntGaugeVec | Agent | Configuration, shared |
akri_discovery_response_result | IntCounterVec | Agent | Discovery Handler name, response result (Success/Fail) |
akri_discovery_response_time | HistogramVec | Agent | Configuration |
akri_broker_pod_count | IntGaugeVec | Controller | Configuration, Node |
Exposing metrics from an Akri Broker Pod
Metrics can also be published by Broker Pods and exposed to Prometheus. This workflow is not unique to Akri and is equivalent to exposing metrics from any deployment to Prometheus. Using the appropriate Prometheus client library for your broker, expose some metrics. Then, deploy a Service to expose the metrics, specifying the name of the associated Akri Configuration as a selector (`akri.sh/configuration: <Akri Configuration>`), since the Configuration name is added as a label to all the Broker Pods by the Akri Controller. Finally, deploy a ServiceMonitor that selects for the previously mentioned Service. This tells Prometheus which Service(s) to discover.
Example: Exposing metrics from the udev video sample Broker
As an example, an `akri_frame_count` metric has been created in the sample udev-video-broker. Like the Agent and Controller, it publishes both the default process metrics and the custom `akri_frame_count` metric to port 8080 at a `/metrics` endpoint.
Akri can be installed with the udev Configuration, filtering for only USB video cameras and specifying a Configuration name of `akri-udev-video`.
Note: To expose the Agent and Controller's Prometheus metrics, add `--set prometheus.enabled=true`.
Note: If Prometheus is running in a different namespace than Akri and was not enabled to discover ServiceMonitors in other namespaces when installed, upgrade your Prometheus Helm installation to set `prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues` to `false`.
Then, create a Service for exposing these metrics, targeting all Pods labeled with the Configuration name `akri-udev-video`.
The metrics also could have been exposed by adding the metrics port to the Configuration level service in the udev Configuration.
Apply the Service to your cluster.
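A Service matching the description above might look like the following sketch, which writes the manifest to a local file. The file name, the Service name, the `app` label, and the port name `metrics` are illustrative assumptions; the selector and port 8080 come from the text.

```shell
# Write a Service manifest that targets every Broker Pod labeled with the
# akri-udev-video Configuration and exposes its metrics port (8080).
# File name, Service name, "app" label and port name are assumptions.
cat > akri-udev-video-broker-metrics-service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: akri-udev-video-broker-metrics
  labels:
    app: akri-udev-video-broker-metrics
spec:
  selector:
    akri.sh/configuration: akri-udev-video
  ports:
    - name: metrics
      port: 8080
  type: ClusterIP
EOF
```

The saved manifest can then be applied with `kubectl apply -f akri-udev-video-broker-metrics-service.yaml`.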
Create the associated ServiceMonitor. Note how the selector matches the app name of the Service.
Apply the ServiceMonitor to your cluster.
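A ServiceMonitor along these lines would select a Service labeled `app: akri-udev-video-broker-metrics` and scrape its port named `metrics`. Those names, the file name, and the `release: prometheus` label (which assumes a Helm release called `prometheus`) are illustrative assumptions rather than values confirmed by this guide.

```shell
# Write a ServiceMonitor manifest whose selector matches the "app" label of
# the metrics Service. All names here are assumptions; the "release" label is
# only needed when Prometheus still filters ServiceMonitors by Helm release.
cat > akri-udev-video-broker-metrics-service-monitor.yaml <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: akri-udev-video-broker-metrics
  labels:
    release: prometheus
spec:
  selector:
    matchLabels:
      app: akri-udev-video-broker-metrics
  endpoints:
    - port: metrics
EOF
```

The saved manifest can then be applied with `kubectl apply -f akri-udev-video-broker-metrics-service-monitor.yaml`.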
The frame count metric reports the number of video frames that have been requested by some application. It will remain at zero unless an application is deployed that utilizes the video Brokers. Deploy the Akri sample streaming application by running the following:
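A sketch of that deployment, assuming the sample manifest is still published at this path on the `main` branch of the Akri repository (verify the URL against the Akri version you are running):

```shell
# Deploy the sample video streaming application, which requests frames
# from the udev video Brokers. The manifest URL is an assumption.
kubectl apply -f https://raw.githubusercontent.com/project-akri/akri/main/deployment/samples/akri-video-streaming-app.yaml
```

Once the application is running and requesting frames, `akri_frame_count` should begin to increase.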