The Instance.deviceUsage map is sized from Configuration.capacity. If the capacity is 5, then the deviceUsage map will have 5 mappings, or slots. The slots are named using a simple pattern: a common prefix followed by a slot number, as in my-resource-00095f-2.

To claim a slot, the Akri Agent updates the Instance.deviceUsage map (for example, my-resource-00095f-2: "node-a") and then allows the kubelet to schedule its intended workload. After this, the Instance.deviceUsage map records the claiming Node against that slot.

When node-a claims slot my-resource-00095f-3, every Akri Agent that can access this Instance will react by notifying its kubelet that this slot is no longer available.
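The slot bookkeeping described above can be sketched in a few lines of Python. This is an illustration only, not Akri's implementation: the function names (`initial_device_usage`, `claim_slot`, `slot_health`), the zero-based slot numbering, the empty string for an unclaimed slot, the "Healthy" label, and `node-b` are all assumptions made for the example. Only the slot name pattern, `node-a`, and the "Unhealthy" label come from the text.

```python
from typing import Dict


def initial_device_usage(instance_name: str, capacity: int) -> Dict[str, str]:
    """Build one unclaimed slot per unit of capacity.

    Assumption: slots are numbered from 0 and an empty string means "free".
    """
    return {f"{instance_name}-{i}": "" for i in range(capacity)}


def claim_slot(device_usage: Dict[str, str], slot: str, node: str) -> None:
    """Record that `node` now holds `slot` (no collision check in this sketch)."""
    device_usage[slot] = node


def slot_health(device_usage: Dict[str, str], node: str) -> Dict[str, str]:
    """Slot availability as `node` would report it to its kubelet.

    A slot held by a different Node is reported as Unhealthy.
    """
    return {
        slot: "Healthy" if owner in ("", node) else "Unhealthy"
        for slot, owner in device_usage.items()
    }


usage = initial_device_usage("my-resource-00095f", 5)
claim_slot(usage, "my-resource-00095f-2", "node-a")
print(usage)                         # five slots, one of them held by node-a
print(slot_health(usage, "node-b"))  # node-b sees the claimed slot as Unhealthy
```

Keeping the map keyed by slot name, with the Node name as the value, means a single lookup answers both "is this slot free?" and "who holds it?".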
What happens if the kubelet tries to reserve a slot after the Instance.deviceUsage slot has been claimed, but before other Nodes have reported the slot as Unhealthy?

If the requested slot is already claimed in Instance.deviceUsage, an error is returned to the kubelet and the workload will not be scheduled. Instead, the pod will stay in a Pending state until the Akri Controller brings it down. The Akri Agent will immediately notify the kubelet of the accurate deviceUsage slot availability and continue to do so periodically, as usual. Once the Controller has brought the pod down, if some slots are still available, the Controller may reschedule the pod to that Node. The kubelet can then attempt to reserve a slot again, this time hopefully without hitting a collision.
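The collision case can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not Akri's code: `try_claim` and `SlotAlreadyClaimed` are hypothetical names, the exception stands in for the error returned to the kubelet, and an empty string is assumed to mean a free slot.

```python
from typing import Dict


class SlotAlreadyClaimed(Exception):
    """Hypothetical stand-in for the error returned to the kubelet."""


def try_claim(device_usage: Dict[str, str], slot: str, node: str) -> None:
    """Claim `slot` for `node`, refusing if another Node already holds it."""
    owner = device_usage.get(slot)
    if owner is None:
        raise KeyError(f"unknown slot: {slot}")
    if owner not in ("", node):
        # Collision: the requesting kubelet acted on stale availability data.
        raise SlotAlreadyClaimed(f"{slot} is held by {owner}")
    device_usage[slot] = node


usage = {"my-resource-00095f-3": "node-a", "my-resource-00095f-4": ""}

# node-b's kubelet still believes slot 3 is free and asks for it.
try:
    try_claim(usage, "my-resource-00095f-3", "node-b")
except SlotAlreadyClaimed as err:
    print(f"claim rejected: {err}")

# A later attempt on a slot that really is free succeeds.
try_claim(usage, "my-resource-00095f-4", "node-b")
print(usage)
```

Rejecting the claim rather than overwriting the existing owner is what keeps the number of claimed slots within the configured capacity, even when a kubelet is briefly out of date.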
Slots are also checked against running containers so that the Instance.deviceUsage maps stay accurate. Any slot found without a backing container is cleared out after a 5-minute timeout, which allows for a container to disappear temporarily.
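One way to picture the cleanup with its grace period is the Python sketch below. It rests on assumptions that the text does not confirm: the function name `clear_stale_slots`, the idea that each Node only inspects slots it holds itself, the `missing_since` bookkeeping, and the use of plain seconds for time are all hypothetical. Only the 5-minute timeout comes from the text.

```python
from typing import Dict, Set

GRACE_SECONDS = 5 * 60  # the 5-minute timeout mentioned in the text


def clear_stale_slots(
    device_usage: Dict[str, str],
    node: str,
    backed_slots: Set[str],
    missing_since: Dict[str, float],
    now: float,
) -> None:
    """Free slots held by `node` that have had no backing container for
    at least GRACE_SECONDS.

    `backed_slots` is the set of slots currently used by a container.
    `missing_since` remembers when each slot was first seen without one.
    """
    for slot, owner in device_usage.items():
        if owner != node:
            continue  # assumption: a Node only cleans up its own claims
        if slot in backed_slots:
            missing_since.pop(slot, None)  # container is back, reset the timer
            continue
        first_missing = missing_since.setdefault(slot, now)
        if now - first_missing >= GRACE_SECONDS:
            device_usage[slot] = ""  # release the slot
            del missing_since[slot]


usage = {"my-resource-00095f-2": "node-a"}
missing: Dict[str, float] = {}
clear_stale_slots(usage, "node-a", set(), missing, now=0.0)    # timer starts
clear_stale_slots(usage, "node-a", set(), missing, now=120.0)  # within grace
clear_stale_slots(usage, "node-a", set(), missing, now=300.0)  # slot released
print(usage)
```

Resetting the timer whenever the container reappears is what lets a container vanish briefly, for example during a restart, without losing its slot.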