- System requirements
- Setting up the cluster
- Running the cmk isolate Hello World Pod
- Validating the environment
- Troubleshooting and recovery
Related:
Kubernetes >= v1.5.0 (excluding v1.8.0, details below)
All template manifests provided with CMK use the service account defined in the
cmk-serviceaccount manifest. Before the first CMK run, the operator should use that
manifest to create cmk-serviceaccount. This step is not obligatory on Kubernetes 1.5,
but it is strongly recommended. Kubernetes 1.6 requires it, because the RBAC
authorization method uses the service account to grant API access from inside the
CMK pod(s).
From Kubernetes 1.6, RBAC has become the default authorization method.
The operator needs to prepare an additional ClusterRole and
ClusterRoleBindings in order to deploy CMK. Those are
provided in the cmk-rbac-rules manifest. In this case the operator
must also use the provided serviceaccount manifest.
From Kubernetes 1.7, Custom Resource Definitions (CRD) have replaced Third Party Resources (TPR).
Kubernetes 1.7 is the only version in which both are compatible, so the operator must migrate from TPR to CRD.
A ClusterRole and ClusterRoleBindings for CRD have been added to the cmk-rbac-rules manifest.
CMK detects the Kubernetes version itself and uses Custom Resource Definitions
if the Kubernetes version is 1.7 or newer, and Third Party Resources otherwise, to create Nodereport and Reconcilereport.
Additionally, taints have been moved from alpha to beta and are no longer present in the node metadata but directly in the spec. Please note that if a pod manifest has a nodeName: <nodename> selector, taint tolerations are not needed.
Kubernetes 1.8.0 is not supported due to an extended resources issue (it is impossible to create an extended resource). Use Kubernetes 1.8.1+ instead.
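
The version-dependent rules above can be summarized in a small helper. This is only a sketch for illustration: the function names `parse_version` and `cmk_requirements` and the keys of the returned dictionary are hypothetical and are not part of CMK.

```python
from typing import Dict, Tuple, Union


def parse_version(text: str) -> Tuple[int, int, int]:
    """Turn a string such as 'v1.8.1' or '1.7' into (major, minor, patch)."""
    numbers = [int(part) for part in text.lstrip("v").split(".")]
    numbers += [0] * (3 - len(numbers))  # pad a missing minor/patch with zeros
    return (numbers[0], numbers[1], numbers[2])


def cmk_requirements(version: str) -> Dict[str, Union[bool, str]]:
    """Summarize the rules from this section for one Kubernetes version."""
    v = parse_version(version)
    return {
        # Kubernetes >= v1.5.0 is required, and v1.8.0 is excluded.
        "supported": v >= (1, 5, 0) and v != (1, 8, 0),
        # The service account is recommended on 1.5 and required from 1.6.
        "service_account": "required" if v >= (1, 6, 0) else "recommended",
        # The cmk-rbac-rules manifest is needed once RBAC is the default.
        "rbac_rules": v >= (1, 6, 0),
        # Reports are created as CRDs from 1.7 and as TPRs before that.
        "report_kind": (
            "CustomResourceDefinition" if v >= (1, 7, 0) else "ThirdPartyResource"
        ),
        # Taints are found in the node spec from 1.7, in metadata before that.
        "taints_in": "spec" if v >= (1, 7, 0) else "metadata",
    }
```

For example, `cmk_requirements("v1.6.4")` reports that the RBAC rules are needed and that reports are still created as Third Party Resources.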
From Kubernetes 1.9.0, a mutating admission controller is used to update any pod whose
definition contains a container requesting CMK Extended Resources. The CMK webhook modifies
the pod by injecting the environment variable CMK_NUM_CORES, with its value set to the number of cores
specified in the Extended Resource request. This allows cmk isolate to assign multiple
CPU cores to a given process.
On top of that, the webhook applies additional changes to the pod, which are defined in
the configuration file. By default, the configuration deployed during cmk cluster-init adds
the CMK installation and configuration directories and the host /proc filesystem as volumes, sets the CMK
service account, adds the tolerations required for a pod to be scheduled on a CMK-enabled node,
and annotates the pod appropriately. Container specifications are updated with volume mounts
(referencing the volumes added to the pod) and the environment variable CMK_PROC_FS.
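
To make the mutation more concrete, the following sketch applies the environment variable and annotation changes to a pod represented as a plain dictionary. It is a simplified illustration, not the CMK webhook itself: the function name `mutate_pod` is hypothetical, the real webhook is driven by its configuration file, and volumes, volume mounts, tolerations and the service account are left out. The sketch also assumes that only containers requesting the resource are changed.

```python
import copy
from typing import Any, Dict

# Extended Resource name used in the pod manifests of this document.
ER_NAME = "cmk.intel.com/exclusive-cores"


def mutate_pod(pod: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the pod with the CMK environment variables injected."""
    mutated = copy.deepcopy(pod)
    injected = False
    for container in mutated["spec"]["containers"]:
        resources = container.get("resources") or {}
        requests = resources.get("requests") or {}
        limits = resources.get("limits") or {}
        cores = requests.get(ER_NAME, limits.get(ER_NAME))
        if cores is None:
            continue  # this container does not ask for exclusive cores
        # Drop stale copies of the variables before adding fresh ones.
        env = [
            item for item in (container.get("env") or [])
            if item["name"] not in ("CMK_PROC_FS", "CMK_NUM_CORES")
        ]
        env.append({"name": "CMK_PROC_FS", "value": "/host/proc"})
        env.append({"name": "CMK_NUM_CORES", "value": str(cores)})
        container["env"] = env
        injected = True
    if injected:
        metadata = mutated.setdefault("metadata", {})
        annotations = metadata.setdefault("annotations", {})
        annotations["cmk.intel.com/resources-injected"] = "true"
    return mutated
```

A pod whose first container requests two exclusive cores comes back with CMK_NUM_CORES set to "2" in that container, while containers without such a request are left untouched.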
https://kubernetes.io/docs/admin/authorization/rbac/#rolebinding-and-clusterrolebinding
This section describes the setup required to use the CMK software.
Notes:
- The recommended way to prepare Kubernetes nodes for the CMK software is to run cmk cluster-init as a Pod, as described in the cluster setup instructions using cmk cluster-init.
- The cluster setup instructions using manually created Pods should be used only if running cmk cluster-init fails for some reason.
Prepare the nodes by running cmk cluster-init using these instructions.
- Concepts
- Preparing nodes by running cmk cluster-init (recommended)
- Preparing nodes by running each CMK subcommand as a Pod (use only if required)
| Term | Meaning |
|---|---|
| CMK nodes | The operator can choose any number of nodes in the Kubernetes cluster to work with CMK. These participating nodes will be referred to as CMK nodes. |
| Pod | A Pod is an abstraction in Kubernetes to represent one or more containers and their configuration. It is the smallest schedulable unit in Kubernetes. |
| OIR | Acronym for Opaque Integer Resource. In Kubernetes, OIRs allow cluster operators to advertise new node-level resources that would be otherwise unknown to the system. |
| Volume | A volume is a directory (on host file system). In Kubernetes, a volume has the same lifetime as the Pod that uses it. Many types of volumes are supported in Kubernetes. |
| hostPath | hostPath is a volume type in Kubernetes. It mounts a file or directory from the host file system into the Pod. |
CMK nodes can be prepared by using the cmk cluster-init subcommand. The subcommand is expected to
be run as a pod. The cmk-cluster-init-pod template can be used to run cmk cluster-init on a
Kubernetes cluster. When run on a Kubernetes cluster, the Pod spawns at most two Pods per node in order to prepare
each node.
The only value that requires change in the cmk-cluster-init-pod template is the args field,
which can be modified to pass different options.
Following are some example modifications to the args field:
- args:
    # Change this value to pass different options to cluster-init.
    - "/cmk/cmk.py cluster-init --host-list=node1,node2,node3"
The above command prepares nodes "node1", "node2" and "node3" for the CMK software using default options.
- args:
    # Change this value to pass different options to cluster-init.
    - "/cmk/cmk.py cluster-init --all-hosts"
The above command prepares all the nodes in the Kubernetes cluster for the CMK software using default options.
- args:
    # Change this value to pass different options to cluster-init.
    - "/cmk/cmk.py cluster-init --host-list=node1,node2,node3 --cmk-cmd-list=init,discover"
The above command prepares nodes "node1", "node2" and "node3" but only runs the cmk init and cmk discover
subcommands on each of those nodes.
For more details on the options provided by cmk cluster-init, see this description.
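
When several clusters need slightly different options, the args string can be generated instead of edited by hand. The helper below is a hypothetical sketch (`cluster_init_args` is not part of CMK); it only uses the flags shown in the examples above and assumes that exactly one of a host list or --all-hosts is given.

```python
from typing import Optional, Sequence


def cluster_init_args(
    hosts: Optional[Sequence[str]] = None,
    all_hosts: bool = False,
    cmk_cmds: Optional[Sequence[str]] = None,
) -> str:
    """Build the single string placed in the args field of the template."""
    if all_hosts == bool(hosts):
        # Either both were given or neither was.
        raise ValueError("pass exactly one of hosts or all_hosts=True")
    parts = ["/cmk/cmk.py", "cluster-init"]
    if all_hosts:
        parts.append("--all-hosts")
    else:
        parts.append("--host-list=" + ",".join(hosts))
    if cmk_cmds:
        parts.append("--cmk-cmd-list=" + ",".join(cmk_cmds))
    return " ".join(parts)


print(cluster_init_args(hosts=["node1", "node2", "node3"], cmk_cmds=["init", "discover"]))
```

The printed string matches the third example above and can be pasted into the args field as a quoted YAML string.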
Notes:
- The instructions provided in this section should be used only if running cmk cluster-init fails for some reason.
- The subcommands described below should be run in the same order.
- The documentation in this section assumes that the CMK configuration directory is /etc/cmk and the cmk binary is installed on the host under /opt/bin.
- In all the pod templates used in this section, the name of the container image used is cmk:v1.3.1. It is expected that the cmk container image is built and cached locally on the host. The image field will require modification if the container image is hosted remotely (e.g., in https://hub.docker.com/).
The CMK nodes in the Kubernetes cluster should be initialized with cmk init before they can be used with the
CMK software. To initialize the CMK nodes, the cmk-init-pod template can be used.
cmk init takes the --conf-dir, --num-exclusive-cores and the --num-shared-cores flags. In the
cmk-init-pod template, the values to these flags can be modified. The value for --conf-dir can be
set by changing the path value of the hostPath for the cmk-conf-dir. The value for --num-exclusive-cores and
--num-shared-cores can be set by changing the values for the NUM_EXCLUSIVE_CORES and NUM_SHARED_CORES environment variables,
respectively.
Values that might require modification in the cmk-init-pod template are shown as snippets below:
volumes:
- hostPath:
    # Change this to modify the CMK config dir in the host file system.
    path: "/etc/cmk"
  name: cmk-conf-dir
env:
- name: NUM_EXCLUSIVE_CORES
  # Change this to modify the value passed to `--num-exclusive-cores` flag.
  value: '4'
- name: NUM_SHARED_CORES
  # Change this to modify the value passed to `--num-shared-cores` flag.
  value: '1'
All the CMK nodes in the Kubernetes cluster should be patched with CMK OIR slots using
cmk discover. The OIR slots are advertised because the exclusive pools need to be allocated exclusively.
The number of slots advertised should be equal to the number of CPU lists under the exclusive pool, as determined
by examining the CMK configuration directory. The cmk-discover-pod template can be used to
advertise the CMK OIR slots.
cmk discover takes the --conf-dir flag. In the cmk-discover-pod template, the value for
--conf-dir can be configured by changing the path value of the hostPath for cmk-conf-dir. After running
this Pod on a node, the node will be patched with the `pod.alpha.kubernetes.io/opaque-int-resource-cmk` OIR.
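
The slot count that cmk discover advertises can be cross-checked by hand. The sketch below counts the CPU lists of the exclusive pool; it assumes that each CPU list is stored as a subdirectory of `<conf-dir>/pools/exclusive`, which is an assumption about the configuration directory layout and is not stated in this document. The function name `count_exclusive_slots` is hypothetical.

```python
import os


def count_exclusive_slots(conf_dir: str = "/etc/cmk") -> int:
    """Count the CPU lists (subdirectories) under the exclusive pool.

    Assumed layout: <conf_dir>/pools/exclusive/<cpu-list>/
    Plain files in the pool directory are ignored.
    """
    pool_dir = os.path.join(conf_dir, "pools", "exclusive")
    with os.scandir(pool_dir) as entries:
        return sum(1 for entry in entries if entry.is_dir())
```

On a node initialized with four exclusive cores, the result would be expected to match the value reported for the CMK resource in the node capacity.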
Values that might require modification in the cmk-discover-pod template are shown as snippets below:
volumes:
- hostPath:
    # Change this to modify the CMK config dir in the host file system.
    path: "/etc/cmk"
  name: cmk-conf-dir
In order to reconcile from an outdated CMK configuration state, each CMK node should run
cmk reconcile periodically. cmk reconcile can be run periodically using the
cmk-reconcile-daemonset template.
In the cmk-reconcile-daemonset template, the time between each invocation of cmk reconcile
can be adjusted by changing the value of the CMK_RECONCILE_SLEEP_TIME environment variable. The value specifies time
in seconds. cmk reconcile takes the --conf-dir flag. This value can be configured by changing the path
value of the hostPath for the cmk-conf-dir in the cmk-reconcile-daemonset template.
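
The effect of CMK_RECONCILE_SLEEP_TIME can be pictured as a simple run-then-sleep loop. The sketch below is illustrative only and is not the code used by the daemonset: `reconcile_loop` and its parameters are hypothetical, the `reconcile` callback stands in for an invocation of cmk reconcile, and the default of 60 seconds mirrors the value in the template snippet.

```python
import os
import time
from typing import Callable, Mapping, Optional


def reconcile_loop(
    reconcile: Callable[[], None],
    env: Mapping[str, str] = os.environ,
    sleep: Callable[[float], None] = time.sleep,
    max_runs: Optional[int] = None,
) -> int:
    """Call `reconcile`, sleep for the configured interval, and repeat.

    `max_runs=None` loops forever; a number stops after that many runs,
    which is mainly useful for trying the loop out.
    """
    interval = float(env.get("CMK_RECONCILE_SLEEP_TIME", "60"))
    runs = 0
    while max_runs is None or runs < max_runs:
        reconcile()
        runs += 1
        sleep(interval)
    return runs
```

Passing `sleep` and `env` as parameters keeps the loop easy to exercise without waiting or touching the real environment.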
Values that might require modification in the cmk-reconcile-daemonset template are shown as snippets below:
env:
- name: CMK_RECONCILE_SLEEP_TIME
  # Change this to modify the sleep interval between consecutive
  # cmk reconcile runs. The value is specified in seconds.
  value: '60'
volumes:
- hostPath:
    # Change this to modify the CMK config dir in the host file system.
    path: "/etc/cmk"
  name: cmk-conf-dir
cmk install is used to create a zero-dependency binary of the CMK software and place it on the host
filesystem. Subsequent containers can isolate themselves by mounting the install directory from the host and then
calling cmk isolate. To run it on all the CMK nodes, the cmk-install-pod template
can be used.
cmk install takes the --install-dir flag. In the cmk-install-pod template, the value for
--install-dir can be configured by changing the path value of the hostPath for the cmk-install-dir.
Values that might require modification in the cmk-install-pod template are shown as snippets below:
volumes:
- hostPath:
    # Change this to modify the CMK installation dir in the host file system.
    path: "/opt/bin"
  name: cmk-install-dir
cmk webhook is used to run the mutating admission webhook server. Whenever there is a request to create a new pod,
the webhook can capture that request, check whether any of the containers requests or limits a number of CMK
Extended Resources, and update the pod and its container specifications appropriately. This simplifies the deployment
of workloads that take advantage of CMK by reducing the number of requirements to the minimum.
...
spec:
  containers:
    resources:
      requests:
        cmk.intel.com/exclusive-cores: 2
...
In order to deploy the CMK mutating webhook, a number of resources need to be created on the cluster. Before that, the operator needs to generate an X509 private key and a TLS certificate in PEM format. Certificates can be self-signed, although using certificates signed by a proper CA or the Kubernetes Certificates API is highly recommended. After meeting that requirement, the steps to deploy the webhook are as follows:
- Certificates in PEM format should then be encoded to Base64 format and placed in the Mutating Admission Configuration and Secret templates.
- Update the config map template. The config map contains 2 configuration files, server.yaml and mutations.yaml. Configuration options are described in the cmk command-line tool documentation.
- Create the secret, service and config map using the kubectl create -f ... command.
- Run the cmk webhook pod defined in the webhook pod template using the kubectl create -f ... command.
- If the cmk webhook pod is running correctly, create the Mutating Admission Configuration object.
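
The Base64 encoding in the first step can be done with any standard tool. As one possible sketch, the helper below encodes the text of a PEM file into a single-line string suitable for pasting into a template; the function name `pem_to_base64` and the file name in the usage note are hypothetical.

```python
import base64


def pem_to_base64(pem_text: str) -> str:
    """Return the Base64 form of a PEM document as a single-line string."""
    return base64.b64encode(pem_text.encode("ascii")).decode("ascii")
```

A typical use would be `pem_to_base64(open("cert.pem").read())`, where `cert.pem` stands for whatever certificate or key file was generated beforehand.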
CMK is able to use multiple sockets. During cluster initialization, the init module distributes cores from all sockets
across pools. To prevent a situation where the exclusive pool or the shared pool is spawned on only a single socket,
the operator can use one of two mode policies: packed and spread. Those policies define how cores are assigned to
a specific pool:
- packed mode will put cores in the following order:
Note: This policy is not topology aware, so there is a possibility that one pool won't spread on multiple sockets.
- spread mode will put cores in the following order:
Note: This policy is topology aware, so CMK will try to spread pools on each socket.
To select the appropriate mode, the operator can pass the --shared-mode or --exclusive-mode parameters during initialization.
Those parameters can be used with cluster-init and init. If the operator uses two different modes, those policies
will be mixed. In that case the exclusive pool is resolved before the shared pool.
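
As a rough illustration of the difference between the two policies, the sketch below models a platform as a mapping from socket ID to core IDs. The orderings are an assumption made for this example (packed walks one socket after another, spread alternates between sockets); they are not taken from the CMK implementation, and the names `packed_order`, `spread_order` and `split_pools` are hypothetical. Mixing two different modes is not modeled.

```python
from itertools import zip_longest
from typing import Dict, List, Tuple


def packed_order(sockets: Dict[int, List[int]]) -> List[int]:
    """Assumed packed ordering: exhaust socket 0, then socket 1, and so on."""
    return [core for socket_id in sorted(sockets) for core in sockets[socket_id]]


def spread_order(sockets: Dict[int, List[int]]) -> List[int]:
    """Assumed spread ordering: take one core from each socket in turn."""
    rounds = zip_longest(*(sockets[socket_id] for socket_id in sorted(sockets)))
    return [core for one_round in rounds for core in one_round if core is not None]


def split_pools(
    order: List[int], num_exclusive: int, num_shared: int
) -> Tuple[List[int], List[int]]:
    """Fill the exclusive pool first, then the shared pool, from one ordering."""
    exclusive = order[:num_exclusive]
    shared = order[num_exclusive:num_exclusive + num_shared]
    return exclusive, shared


sockets = {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}
# Packed: both pools end up on socket 0.
print(split_pools(packed_order(sockets), 2, 2))
# Spread: each pool gets one core from each socket.
print(split_pools(spread_order(sockets), 2, 2))
```

In this toy model, the packed ordering leaves socket 1 unused by both pools, which is the situation the spread policy is meant to avoid.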
After following the instructions in the previous section, the cluster is ready to run the Hello World Pod. The Hello
World cmk-isolate-pod template describes a simple Pod with three containers requesting CPUs from
the exclusive, shared and the infra pools, respectively, using cmk isolate. The
pool is requested by passing the desired value to the --pool flag when using cmk isolate as described in the
documentation.
cmk isolate accepts the --socket-id flag to select the socket on which the application should be spawned. This flag is optional
and applies only to the exclusive pool. If it is not used, cmk isolate will use the first core that is not reserved.
cmk isolate also takes the --conf-dir and --install-dir flags. In the cmk-isolate-pod template,
the values for --conf-dir and --install-dir can be modified by changing the path values of the hostPath.
Values that might require modification in the cmk-isolate-pod template are shown as snippets below:
volumes:
- hostPath:
    # Change this to modify the CMK installation dir in the host file system.
    path: "/opt/bin"
  name: cmk-install-dir
- hostPath:
    # Change this to modify the CMK config dir in the host file system.
    path: "/etc/cmk"
  name: cmk-conf-dir
Notes:
- The Hello World cmk-isolate-pod consumes the pod.alpha.kubernetes.io/opaque-int-resource-cmk Opaque Integer Resource (OIR) only in the container isolated using the exclusive pool. The CMK software assumes that only containers isolated using the exclusive pool request the OIR, and that each of these containers consumes exactly one OIR. This restricts the number of pods that can land on a Kubernetes node to the expected value.
- The cmk isolate Hello World Pod should only be run after following the instructions provided in the Setting up the cluster section.
Following is an example to validate the environment in one node.
- Pick a node to test. For illustration, we will use <node-name> as the name of the node.
- Check if the node has the appropriate label.
kubectl get node <node-name> -o json | jq .metadata.labels
Example output:
kubectl get node cmk-02-zzwt7w -o json | jq .metadata.labels
{
"beta.kubernetes.io/arch": "amd64",
"beta.kubernetes.io/os": "linux",
"cmk.intel.com/cmk-node": "true",
"kubernetes.io/hostname": "cmk-02-zzwt7w"
}
- Check if the node has the appropriate taint. (Kubernetes < v1.7)
kubectl get node <node-name> -o json | jq .metadata.annotations
Example output:
kubectl get node cmk-02-zzwt7w -o json | jq .metadata.annotations
{
"scheduler.alpha.kubernetes.io/taints": "[{\"value\": \"true\", \"key\": \"cmk\", \"effect\": \"NoSchedule\"}]",
"volumes.kubernetes.io/controller-managed-attach-detach": "true"
}
- Check if the node has the appropriate taint. (Kubernetes >= v1.7)
kubectl get node <node-name> -o json | jq .spec.taints
Example output:
kubectl get node cmk-02-zzwt7w -o json | jq .spec.taints
[
{
"effect": "NoSchedule",
"key": "cmk",
"timeAdded": null,
"value": "true"
}
]
- Check if the node has the appropriate OIR. (Kubernetes < v1.8)
kubectl get node <node-name> -o json | jq .status.capacity
Example output:
kubectl get node cmk-02-zzwt7w -o json | jq .status.capacity
{
"alpha.kubernetes.io/nvidia-gpu": "0",
"cpu": "16",
"memory": "14778328Ki",
"pod.alpha.kubernetes.io/opaque-int-resource-cmk": "4",
"pods": "110"
}
- Check if the node has the appropriate Extended Resource (ER). (Kubernetes >= v1.8.1)
kubectl get node <node-name> -o json | jq .status.capacity
Example output:
kubectl get node cmk-02-zzwt7w -o json | jq .status.capacity
{
"alpha.kubernetes.io/nvidia-gpu": "0",
"cpu": "16",
"memory": "14778328Ki",
"cmk.intel.com/exclusive-cores": "4",
"pods": "110"
}
- Log in to the node and check if the CMK configuration directory and binary exist. Assuming default options were used for cmk cluster-init, you would do the following:
ls /etc/cmk/
ls /opt/bin/
- Replace the nodeName in the Pod manifest below with the chosen node name and save it to a file.
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: cmk-isolate-pod
  name: cmk-isolate-pod
spec:
  # Change this to the <node-name> you want to test.
  nodeName: NODENAME
  containers:
  - args:
    - "/opt/bin/cmk isolate --conf-dir=/etc/cmk --pool=infra sleep -- 10000"
    command:
    - "/bin/bash"
    - "-c"
    env:
    - name: CMK_PROC_FS
      value: "/host/proc"
    image: cmk:v1.3.1
    imagePullPolicy: "Never"
    name: cmk-isolate-infra
    volumeMounts:
    - mountPath: "/host/proc"
      name: host-proc
      readOnly: true
    - mountPath: "/opt/bin"
      name: cmk-install-dir
    - mountPath: "/etc/cmk"
      name: cmk-conf-dir
  restartPolicy: Never
  volumes:
  - hostPath:
      # Change this to modify the CMK installation dir in the host file system.
      path: "/opt/bin"
    name: cmk-install-dir
  - hostPath:
      path: "/proc"
    name: host-proc
  - hostPath:
      # Change this to modify the CMK config dir in the host file system.
      path: "/etc/cmk"
    name: cmk-conf-dir
- Run kubectl create -f <file-name>, where <file-name> is the name of the Pod manifest file with the nodeName field substituted as mentioned in the previous step.
- Check if any process is isolated in the infra pool using NodeReport for that node.
If you are using third party resources (Kubernetes 1.6.x and older versions):
kubectl get NodeReport <node-name> -o json | jq .report.description.pools.infra
If you are using custom resource definitions (Kubernetes 1.7.x and newer versions):
kubectl get cmk-nodereport <node-name> -o json | jq .spec.report.description.pools.infra
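
The label, taint and capacity checks above can also be scripted. The sketch below inspects the parsed output of `kubectl get node <node-name> -o json` and returns a list of problems; it covers only the Kubernetes >= v1.8.1 layout (taints in spec, Extended Resource in capacity), and the function name `check_cmk_node` is hypothetical.

```python
from typing import Any, Dict, List


def check_cmk_node(node: Dict[str, Any]) -> List[str]:
    """Return the problems found in a parsed node document (empty if none)."""
    problems = []

    labels = node.get("metadata", {}).get("labels") or {}
    if labels.get("cmk.intel.com/cmk-node") != "true":
        problems.append("missing label cmk.intel.com/cmk-node=true")

    taints = node.get("spec", {}).get("taints") or []
    has_taint = any(
        taint.get("key") == "cmk" and taint.get("effect") == "NoSchedule"
        for taint in taints
    )
    if not has_taint:
        problems.append("missing cmk NoSchedule taint in spec.taints")

    capacity = node.get("status", {}).get("capacity") or {}
    # Capacity values are strings in the node document, e.g. "4".
    if int(capacity.get("cmk.intel.com/exclusive-cores", "0")) < 1:
        problems.append("no cmk.intel.com/exclusive-cores capacity advertised")

    return problems
```

The node document can be loaded with `json.load` from a file that holds the saved kubectl output; an empty result means all three checks passed.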
- Follow all the above steps, but use the simplified Pod manifest:
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: cmk-isolate-pod
  name: cmk-isolate-pod
spec:
  # Change this to the <node-name> you want to test.
  nodeName: NODENAME
  containers:
  - args:
    - "/opt/bin/cmk isolate --conf-dir=/etc/cmk --pool=exclusive sleep -- 10000"
    command:
    - "/bin/bash"
    - "-c"
    env:
    image: cmk:v1.3.1
    imagePullPolicy: "Never"
    name: cmk-isolate-infra
    resources:
      requests:
        cmk.intel.com/exclusive-cores: 1
  restartPolicy: Never
- Run kubectl create -f <file-name>, where <file-name> is the name of the Pod manifest file with the nodeName field substituted as mentioned in the previous section.
- Run kubectl get pod cmk-isolate-pod -o json | jq .metadata.annotations and verify that the annotation has been added:
{
"cmk.intel.com/resources-injected": "true"
}
- Run kubectl get pod cmk-isolate-pod -o json | jq .spec.volumes and verify that extra volumes have been injected:
[
{
"name": "default-token-xfd8q",
"secret": {
"defaultMode": 420,
"secretName": "default-token-xfd8q"
}
},
{
"hostPath": {
"path": "/proc",
"type": ""
},
"name": "cmk-host-proc"
},
{
"hostPath": {
"path": "/etc/cmk",
"type": ""
},
"name": "cmk-config-dir"
},
{
"hostPath": {
"path": "/opt/bin",
"type": ""
},
"name": "cmk-install-dir"
}
]
- Run kubectl get pod cmk-isolate-pod -o json | jq '.spec.containers[0].env' and verify that env variables have been added to the container spec:
[
{
"name": "CMK_PROC_FS",
"value": "/host/proc"
},
{
"name": "CMK_NUM_CORES",
"value": "1"
}
]
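
The three webhook checks above can be combined into one pass over the parsed output of `kubectl get pod cmk-isolate-pod -o json`. This is a sketch under the assumption that the default configuration is in use, so the volume names are taken from the example output above; the function name `missing_webhook_changes` is hypothetical.

```python
from typing import Any, Dict, List


def missing_webhook_changes(pod: Dict[str, Any]) -> List[str]:
    """List the expected webhook changes that are absent from a pod document."""
    missing = []

    annotations = pod.get("metadata", {}).get("annotations") or {}
    if annotations.get("cmk.intel.com/resources-injected") != "true":
        missing.append("annotation cmk.intel.com/resources-injected")

    spec = pod.get("spec", {})
    volume_names = {volume.get("name") for volume in spec.get("volumes") or []}
    for name in ("cmk-host-proc", "cmk-config-dir", "cmk-install-dir"):
        if name not in volume_names:
            missing.append("volume " + name)

    containers = spec.get("containers") or []
    # Only the first container is checked, as in the jq command above.
    first_env = containers[0].get("env") or [] if containers else []
    env_names = {item.get("name") for item in first_env}
    for name in ("CMK_PROC_FS", "CMK_NUM_CORES"):
        if name not in env_names:
            missing.append("env " + name)

    return missing
```

An empty list indicates that the annotation, the three volumes and both environment variables are present.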
If running cmk cluster-init using the cmk-cluster-init-pod template ends up in an error,
the recommended way to start troubleshooting is to look at the logs using kubectl logs POD_NAME [CONTAINER_NAME] -f.
For example, assuming you ran the cmk-cluster-init-pod template with default options, it
should create two pods on each node named cmk-init-install-discover-pod-<node-name> and
cmk-reconcile-nodereport-<node-name>, where <node-name> should be replaced with the name of the node.
If you want to look at the logs from the container which ran the discover subcommand in the pod, you can use
kubectl logs -f cmk-init-install-discover-pod-<node-name> discover
If you want to look at the logs from the container which ran the reconcile subcommand in the pod, you can use
kubectl logs -f cmk-reconcile-nodereport-pod-<node-name> reconcile
If you want to remove CMK, use cmk-uninstall-pod.yaml. A nodeSelector
can be used to limit the removal to a specific node.

