Overview
This document describes Akri's components. The word "resource" describes what is being searched for and ultimately utilized. Resources offer services. For example, they can be USB or IP cameras, which serve video frames, or GPUs, which provide computation. They can be locally attached to worker nodes (such as USB devices), embedded in them (such as GPUs), or remotely accessible from them (such as IP cameras).
Akri's architecture is made up of five key components: two custom resources, Discovery Handlers, an Agent (device plugin implementation), and a custom Controller. The first custom resource, the Akri Configuration, is where you name the device: it tells Akri what kind of device to look for. Next, Akri finds it: Akri's Discovery Handlers look for the device and inform the Agent of discovered devices. The Agent then creates Akri's second custom resource, the Akri Instance, to track the availability and usage of the device. Once the device is found, the Akri Controller helps you use it. It sees each Akri Instance (which represents a leaf device) and deploys a "broker" Pod that knows how to connect to the resource and utilize it.
There are two Akri CRDs:
Configuration
Instance
A Configuration specifies the following:
the desired discovery protocol used for finding resources, e.g. ONVIF, OPC UA, or udev.
a capacity (spec.capacity) that defines the maximum number of nodes that may schedule workloads on this resource.
a PodSpec (spec.brokerPodSpec) that defines the "broker" pod that will be scheduled to each of these reported resources.
a ServiceSpec (spec.instanceServiceSpec) that defines the service that provides a single stable endpoint to access each individual resource's set of broker pods.
a ServiceSpec (spec.configurationServiceSpec) that defines the service that provides a single stable endpoint to access the set of all brokers for all resources associated with the Configuration.
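The components listed above can be pictured as a small data structure. The following is a minimal sketch, not Akri's actual schema: the class name ConfigurationSketch, the discovery_protocol field name, and the image and port values are hypothetical, and only capacity, brokerPodSpec, instanceServiceSpec, and configurationServiceSpec come from the list above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ConfigurationSketch:
    """Illustrative stand-in for the components a Configuration carries."""

    # Hypothetical field name: the text only says a discovery protocol is named.
    discovery_protocol: str
    # Mirrors spec.capacity: max number of nodes that may use one resource.
    capacity: int
    # Mirrors spec.brokerPodSpec: the "broker" pod scheduled per resource.
    broker_pod_spec: Dict[str, Any] = field(default_factory=dict)
    # Mirrors spec.instanceServiceSpec: one endpoint per individual resource.
    instance_service_spec: Dict[str, Any] = field(default_factory=dict)
    # Mirrors spec.configurationServiceSpec: one endpoint for all resources.
    configuration_service_spec: Dict[str, Any] = field(default_factory=dict)

    def validate(self) -> None:
        """Reject obviously unusable values (checks chosen for illustration)."""
        if not self.discovery_protocol:
            raise ValueError("a discovery protocol must be named")
        if self.capacity < 1:
            raise ValueError("capacity must be at least 1")


# Placeholder values only; the image name and port are made up.
example = ConfigurationSketch(
    discovery_protocol="onvif",
    capacity=5,
    broker_pod_spec={"containers": [{"name": "broker", "image": "example/broker:latest"}]},
    instance_service_spec={"ports": [{"port": 80}]},
    configuration_service_spec={"ports": [{"port": 80}]},
)
example.validate()
```

Keeping the two service specs as separate fields reflects the distinction drawn in the list: one endpoint per resource, and one endpoint spanning every resource of the Configuration.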
The Akri Helm chart already provides three Configurations: one for discovering IP cameras using the ONVIF protocol, one for discovering OPC UA devices, and one for discovering node devices via udev.
The basic flow of the Akri Agent is:
Watch for Configuration changes to determine what resources to search for
Monitor resource availability (as edge devices may come and go) to determine what resources to advertise
Inform Kubernetes of resource health/availability as it changes
This basic flow combined with the state stored in the Instance allows multiple nodes to share a resource while respecting the limitations defined by Configuration.capacity.
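Because edge devices may come and go, the monitoring step of the flow above amounts to comparing what was visible before with what is visible now. The sketch below illustrates that comparison only. The function name diff_visible_resources and the device identifiers are hypothetical, and the real Agent does considerably more (device plugin registration, Instance updates).

```python
from typing import Set, Tuple


def diff_visible_resources(previous: Set[str], current: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (appeared, disappeared) between two discovery rounds."""
    appeared = current - previous      # newly visible: start advertising these
    disappeared = previous - current   # no longer visible: stop advertising these
    return appeared, disappeared


# Hypothetical device identifiers from two consecutive discovery rounds.
appeared, disappeared = diff_visible_resources(
    previous={"camera-a", "camera-b"},
    current={"camera-b", "camera-c"},
)
```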
The Akri controller serves two purposes:
Handle (create and/or delete) the Pods & Services that enable resource availability
Ensure that Instances are aligned to the cluster state at any given moment
To achieve these goals, the basic flow of the controller is:
Watch for Instance changes to determine what Pods and Services should exist
Watch for Nodes that are contained in Instances that no longer exist
This basic flow allows the Akri controller to ensure that protocol brokers and Kubernetes Services are running on all nodes exposing desired resources while respecting the limitations defined by Configuration.capacity.
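One way to picture the controller's reconciliation is as a planning function that compares where brokers should run with where they currently run. This is a sketch under stated assumptions rather than the controller's actual logic: the function name plan_brokers, the policy of keeping existing brokers before filling free capacity, and the node names are all hypothetical.

```python
from typing import Dict, List, Set


def plan_brokers(instance_nodes: List[str], capacity: int, running_on: Set[str]) -> Dict[str, List[str]]:
    """Decide which nodes need a broker created or deleted for one Instance."""
    node_set = set(instance_nodes)
    # Brokers on nodes that no longer appear in the Instance should go away.
    to_delete = sorted(n for n in running_on if n not in node_set)
    # Brokers already running on still-listed nodes are kept (assumed policy).
    kept = [n for n in instance_nodes if n in running_on]
    # Fill whatever capacity remains with nodes that have no broker yet.
    free = max(capacity - len(kept), 0)
    candidates = [n for n in instance_nodes if n not in running_on]
    return {"create": candidates[:free], "delete": to_delete}


plan = plan_brokers(
    instance_nodes=["node-a", "node-b", "node-c"],
    capacity=2,
    running_on={"node-c", "node-z"},
)
```

In this example one broker stays on node-c, one is created on node-a to reach the capacity of 2, and the stale broker on node-z is removed.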
An operator applies a Configuration with a capacity of 3 to the single-node cluster.
The Akri Agent sees the Configuration and discovers a leaf device using the protocol specified in the Configuration. It creates a device plugin for that leaf device and registers it with the kubelet. When creating the device plugin, it tells the kubelet to set connection information for that specific device and additional metadata from a Configuration's brokerProperties as environment variables in all Pods that request this device's resource. This information is also set in the brokerProperties section of the Instance the Agent creates to represent the discovered leaf device. In the Instance, the Agent also lists itself as a node that can access the device under nodes. Note how Instance has 3 available deviceUsage slots, since capacity was set to 3 and no brokers have been scheduled to the leaf device yet.
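The state described in this step can be sketched as a plain dictionary. This is an illustration only: the helper new_instance, the instance name, the zero-based slot suffix, the empty string as the marker for a free slot, and the DEVICE_ADDRESS property are assumptions, while brokerProperties, nodes, and deviceUsage are the sections named in the text.

```python
from typing import Any, Dict


def new_instance(name: str, capacity: int, node: str, broker_properties: Dict[str, str]) -> Dict[str, Any]:
    """Build the Instance state the Agent records for a newly found device."""
    return {
        "name": name,
        # Connection information later exposed to brokers as environment variables.
        "brokerProperties": dict(broker_properties),
        # The discovering Agent lists its own node as able to reach the device.
        "nodes": [node],
        # One slot per unit of capacity; "" marks a free slot (assumed encoding).
        "deviceUsage": {f"{name}-{i}": "" for i in range(capacity)},
    }


# Hypothetical names and address, with capacity 3 as in the walkthrough.
instance = new_instance(
    name="akri-protocola-ha5h",
    capacity=3,
    node="node-a",
    broker_properties={"DEVICE_ADDRESS": "10.0.0.5"},
)
```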
The Controller is notified by the API Server of Instance changes. It is informed that a new Instance has been created. It schedules a pod to one of the nodes on the Instance’s nodes list, adding the Instance’s name as a resource limit of the pod. Note that the pod is currently in pending state.
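The pod the Controller creates in this step can be sketched as a manifest built in code. Treat the details as assumptions: the helper broker_pod_manifest, the pod name pattern, the akri.sh/ resource-name prefix, and the use of nodeName to pin the pod are illustrative choices, and only the idea of requesting the Instance by name as a resource limit comes from the text.

```python
from typing import Any, Dict


def broker_pod_manifest(instance_name: str, node_name: str, image: str) -> Dict[str, Any]:
    """Build a Pod manifest that requests one unit of the Instance's resource."""
    resource = f"akri.sh/{instance_name}"  # assumed resource-name prefix
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": f"{node_name}-{instance_name}-pod"},  # hypothetical naming
        "spec": {
            # Simplification: pin the pod directly to the chosen node.
            "nodeName": node_name,
            "containers": [
                {
                    "name": "broker",
                    "image": image,
                    # The limit is what leads the kubelet to consult the device plugin.
                    "resources": {"limits": {resource: "1"}},
                }
            ],
        },
    }


pod = broker_pod_manifest("akri-protocola-ha5h", "node-a", "example/broker:latest")
```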
The kubelet on the selected node sees the scheduled pod and resource limit. It checks to see if the resource is available by calling allocate on the device plugin running in the Agent for the requested leaf device. When calling allocate, the kubelet requests a specific deviceUsage slot. Let's say the kubelet requested akri-<protocolA>-<hash>-1. The leaf device's device plugin checks to see that the requested deviceUsage slot has not been taken by another node. If it is available, it reserves that deviceUsage slot for this node (as shown below) and returns true. In the allocate response, the Agent also tells kubelet to mount the Instance.brokerProperties as environment variables in the broker Pod.
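The slot check in this step can be sketched as a small function over the Instance's deviceUsage map. This is a simplified, in-memory illustration, not the Agent's implementation: the function name allocate, the empty string as the free-slot marker, and the node and slot names are assumptions, and the real exchange happens between the kubelet and the device plugin.

```python
from typing import Dict


def allocate(device_usage: Dict[str, str], slot: str, node: str) -> bool:
    """Reserve a deviceUsage slot for a node unless another node holds it."""
    if slot not in device_usage:
        raise KeyError(f"unknown deviceUsage slot: {slot}")
    owner = device_usage[slot]
    # "" means free (assumed encoding); a slot held by another node is refused.
    if owner not in ("", node):
        return False
    device_usage[slot] = node
    return True


# Hypothetical Instance state with capacity 3 and no brokers scheduled yet.
device_usage = {
    "akri-protocola-ha5h-0": "",
    "akri-protocola-ha5h-1": "",
    "akri-protocola-ha5h-2": "",
}
first = allocate(device_usage, "akri-protocola-ha5h-1", "node-a")
second = allocate(device_usage, "akri-protocola-ha5h-1", "node-b")
```

Here the first request reserves slot 1 for node-a, and the second request from node-b for the same slot is refused, leaving the other two slots free.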
The configuration of Akri is enabled by the Configuration CRD. Akri users create Configurations to describe what resources should be discovered and what pod should be deployed on the nodes that discover a resource. The Configuration CRD specifies what components all Configurations must have: the discovery protocol, the capacity, the broker PodSpec, and the two ServiceSpecs described above.
Consider the ONVIF Configuration as an example. It specifies the protocol ONVIF, an image for the broker pod, a capacity of 5, and two Kubernetes services. In this case, the broker pod is a sample frame server we have provided. To get only the frames from a specific camera, a user could point an application at the Instance service, while the Configuration service provides the frames from all the cameras. The ONVIF Configuration can be customized using Helm when installing it to your Akri-enabled cluster.
Each Instance represents an individual resource that is visible to the cluster. So, if there are 5 IP cameras visible to the cluster, there will be 5 Instances. Akri coordination and resource sharing is enabled by the Instance CRD. These instances store internal Akri state and are not intended to be edited by users. For a more in-depth understanding on how resource sharing is accomplished, see .
The Akri Agent implements Kubernetes device plugins for discovered resources.
For a more in-depth understanding, see .
Discovery Handlers discover devices around the cluster, whether connected to Nodes (e.g. USB sensors), embedded in Nodes (e.g. GPUs), or on the network (e.g. IP cameras), and report them to the Agent. They are often protocol implementations for discovering a set of devices, whether a network protocol like OPC UA or a proprietary protocol. Discovery Handlers implement Akri's DiscoveryHandler service. In order to be utilized, a Discovery Handler must register with the Agent, which hosts the Registration service.
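The register-then-discover relationship can be sketched in-process. This is a deliberate simplification: the classes AgentSketch and StaticCameraHandler, the method names, and the stream address are hypothetical, and the sketch replaces the DiscoveryHandler and Registration services with ordinary method calls.

```python
from typing import Dict, List, Protocol


class DiscoveryHandler(Protocol):
    """Anything that can report the devices it currently sees."""

    def discover(self) -> List[str]: ...


class AgentSketch:
    """Holds registered handlers, keyed by the protocol they implement."""

    def __init__(self) -> None:
        self._handlers: Dict[str, DiscoveryHandler] = {}

    def register(self, protocol_name: str, handler: DiscoveryHandler) -> None:
        # A handler must register before the Agent can use it.
        self._handlers[protocol_name] = handler

    def discover(self, protocol_name: str) -> List[str]:
        handler = self._handlers.get(protocol_name)
        # No registered handler for the protocol means nothing is discovered.
        return handler.discover() if handler is not None else []


class StaticCameraHandler:
    """Toy handler that always reports one made-up camera address."""

    def discover(self) -> List[str]:
        return ["rtsp://10.0.0.5/stream"]


agent = AgentSketch()
agent.register("onvif", StaticCameraHandler())
found = agent.discover("onvif")
missing = agent.discover("udev")
```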
To get started creating a Discovery Handler, see .
For a more in-depth understanding, see .
Allocate will return false if the kubelet requests a deviceUsage slot that is already taken. Otherwise, upon a true result, the kubelet will run the pod. The broker is now running and has the information necessary to communicate with the specific device.