Set up a MultiKueue environment
This tutorial explains how you can configure a manager cluster and one worker cluster to run JobSets and batch/Jobs in a MultiKueue environment.
Check the concepts section for a MultiKueue overview.
Let’s assume that your manager cluster is named manager-cluster and your worker cluster is named worker1-cluster.
To follow this tutorial, ensure that the credentials for all these clusters are present in the kubeconfig on your local machine.
Check the kubectl documentation to learn more about how to Configure Access to Multiple Clusters.
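For example, you can list the available contexts and confirm that both manager-cluster and worker1-cluster appear:
kubectl config get-contexts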
In the Worker Cluster
Note
Make sure your current kubectl configuration points to the worker cluster.
Run:
kubectl config use-context worker1-cluster
When MultiKueue dispatches a workload from the manager cluster to a worker cluster, it expects that the job’s namespace and LocalQueue also exist in the worker cluster. In other words, you should ensure that the worker cluster configuration mirrors that of the manager cluster in terms of namespaces and LocalQueues.
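For example, if your jobs run in a namespace team-a on the manager cluster, the same namespace has to exist in the worker before any workload can be dispatched there (team-a is only an illustrative name):
kubectl create namespace team-a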
To create the sample queue setup in the default namespace, you can apply the following manifest:
apiVersion: kueue.x-k8s.io/v1beta2
kind: ResourceFlavor
metadata:
  name: "default-flavor"
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: ClusterQueue
metadata:
  name: "cluster-queue"
spec:
  namespaceSelector: {} # match all.
  resourceGroups:
  - coveredResources: ["cpu", "memory"]
    flavors:
    - name: "default-flavor"
      resources:
      - name: "cpu"
        nominalQuota: 9
      - name: "memory"
        nominalQuota: 36Gi
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: LocalQueue
metadata:
  namespace: "default"
  name: "user-queue"
spec:
  clusterQueue: "cluster-queue"
MultiKueue Specific Kubeconfig
To delegate jobs to a worker cluster, the manager cluster needs to be able to create, delete, and watch workloads and their parent Jobs.
While kubectl is set up to use the worker cluster, save the following script as create-multikueue-kubeconfig.sh:
#!/bin/bash
# Copyright 2024 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -o errexit
set -o nounset
set -o pipefail
KUBECONFIG_OUT=${1:-kubeconfig}
MULTIKUEUE_SA=multikueue-sa
NAMESPACE=kueue-system
# Creating a restricted MultiKueue role, service account and role binding
kubectl apply -f - <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ${MULTIKUEUE_SA}
  namespace: ${NAMESPACE}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ${MULTIKUEUE_SA}-role
rules:
- apiGroups:
  - batch
  resources:
  - jobs
  verbs:
  - create
  - delete
  - get
  - list
  - watch
- apiGroups:
  - batch
  resources:
  - jobs/status
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - create
  - delete
  - get
  - list
  - watch
- apiGroups:
  - jobset.x-k8s.io
  resources:
  - jobsets
  verbs:
  - create
  - delete
  - get
  - list
  - watch
- apiGroups:
  - jobset.x-k8s.io
  resources:
  - jobsets/status
  verbs:
  - get
- apiGroups:
  - kueue.x-k8s.io
  resources:
  - workloads
  verbs:
  - create
  - delete
  - get
  - list
  - watch
- apiGroups:
  - kueue.x-k8s.io
  resources:
  - workloads/status
  verbs:
  - get
  - patch
  - update
- apiGroups:
  - kubeflow.org
  resources:
  - tfjobs
  verbs:
  - create
  - delete
  - get
  - list
  - watch
- apiGroups:
  - kubeflow.org
  resources:
  - tfjobs/status
  verbs:
  - get
- apiGroups:
  - kubeflow.org
  resources:
  - paddlejobs
  verbs:
  - create
  - delete
  - get
  - list
  - watch
- apiGroups:
  - kubeflow.org
  resources:
  - paddlejobs/status
  verbs:
  - get
- apiGroups:
  - kubeflow.org
  resources:
  - pytorchjobs
  verbs:
  - create
  - delete
  - get
  - list
  - watch
- apiGroups:
  - kubeflow.org
  resources:
  - pytorchjobs/status
  verbs:
  - get
- apiGroups:
  - kubeflow.org
  resources:
  - xgboostjobs
  verbs:
  - create
  - delete
  - get
  - list
  - watch
- apiGroups:
  - kubeflow.org
  resources:
  - xgboostjobs/status
  verbs:
  - get
- apiGroups:
  - kubeflow.org
  resources:
  - mpijobs
  verbs:
  - create
  - delete
  - get
  - list
  - watch
- apiGroups:
  - kubeflow.org
  resources:
  - mpijobs/status
  verbs:
  - get
- apiGroups:
  - ray.io
  resources:
  - rayjobs
  verbs:
  - create
  - delete
  - get
  - list
  - watch
- apiGroups:
  - ray.io
  resources:
  - rayjobs/status
  verbs:
  - get
- apiGroups:
  - ray.io
  resources:
  - rayclusters
  verbs:
  - create
  - delete
  - get
  - list
  - watch
- apiGroups:
  - ray.io
  resources:
  - rayclusters/status
  verbs:
  - get
- apiGroups:
  - workload.codeflare.dev
  resources:
  - appwrappers
  verbs:
  - create
  - delete
  - get
  - list
  - watch
- apiGroups:
  - workload.codeflare.dev
  resources:
  - appwrappers/status
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ${MULTIKUEUE_SA}-crb
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ${MULTIKUEUE_SA}-role
subjects:
- kind: ServiceAccount
  name: ${MULTIKUEUE_SA}
  namespace: ${NAMESPACE}
EOF
# Get or create a secret bound to the new service account.
SA_SECRET_NAME=$(kubectl get -n ${NAMESPACE} sa/${MULTIKUEUE_SA} -o "jsonpath={.secrets[0]..name}")
if [ -z "$SA_SECRET_NAME" ]
then
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
type: kubernetes.io/service-account-token
metadata:
  name: ${MULTIKUEUE_SA}
  namespace: ${NAMESPACE}
  annotations:
    kubernetes.io/service-account.name: "${MULTIKUEUE_SA}"
EOF
SA_SECRET_NAME=${MULTIKUEUE_SA}
fi
# Note: service account token is stored base64-encoded in the secret but must
# be plaintext in kubeconfig.
SA_TOKEN=$(kubectl get -n ${NAMESPACE} "secrets/${SA_SECRET_NAME}" -o "jsonpath={.data['token']}" | base64 -d)
CA_CERT=$(kubectl get -n ${NAMESPACE} "secrets/${SA_SECRET_NAME}" -o "jsonpath={.data['ca\.crt']}")
# Extract cluster IP from the current context
CURRENT_CONTEXT=$(kubectl config current-context)
CURRENT_CLUSTER=$(kubectl config view -o jsonpath="{.contexts[?(@.name == \"${CURRENT_CONTEXT}\")].context.cluster}")
CURRENT_CLUSTER_ADDR=$(kubectl config view -o jsonpath="{.clusters[?(@.name == \"${CURRENT_CLUSTER}\")].cluster.server}")
# Create the Kubeconfig file
echo "Writing kubeconfig in ${KUBECONFIG_OUT}"
cat > "${KUBECONFIG_OUT}" <<EOF
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: ${CA_CERT}
server: ${CURRENT_CLUSTER_ADDR}
name: ${CURRENT_CLUSTER}
contexts:
- context:
cluster: ${CURRENT_CLUSTER}
user: ${CURRENT_CLUSTER}-${MULTIKUEUE_SA}
name: ${CURRENT_CONTEXT}
current-context: ${CURRENT_CONTEXT}
kind: Config
preferences: {}
users:
- name: ${CURRENT_CLUSTER}-${MULTIKUEUE_SA}
user:
token: ${SA_TOKEN}
EOF
Then make the script executable and run it:
chmod +x create-multikueue-kubeconfig.sh
./create-multikueue-kubeconfig.sh worker1.kubeconfig
This creates a kubeconfig that can be used in the manager cluster to delegate Jobs to the current worker.
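Before handing the file to the manager, you can sanity-check it, for instance by verifying that the restricted service account is allowed to create Jobs (a hedged check):
kubectl --kubeconfig=worker1.kubeconfig auth can-i create jobs.batch -n default
# expected output: yes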
Security Notice
MultiKueue validates kubeconfig files to protect against known arbitrary code execution vulnerabilities. For your security, it is strongly recommended not to use the MultiKueueAllowInsecureKubeconfigs flag. This flag was introduced in Kueue v0.15.0 solely for backward compatibility and will be deprecated in Kueue v0.17.0.
Kubeflow Installation
Install Kubeflow Trainer in the Worker cluster (see Kubeflow Trainer Installation for more details). Please use version v1.7.0 or newer for MultiKueue.
In the Manager Cluster
Note
Make sure your current kubectl configuration points to the manager cluster.
Run:
kubectl config use-context manager-cluster
CRDs installation
For installation of CRDs compatible with MultiKueue, please refer to the dedicated installation pages.
Create worker’s Kubeconfig secret
For the next example, assuming the worker1 cluster kubeconfig is stored in a file called worker1.kubeconfig, you can create the worker1-secret secret by running the following command:
kubectl create secret generic worker1-secret -n kueue-system --from-file=kubeconfig=worker1.kubeconfig
Check the worker section for details on Kubeconfig generation.
Create a sample setup
Apply the following manifest to create a sample setup in which Jobs submitted to the ClusterQueue cluster-queue are delegated to the worker worker1:
apiVersion: kueue.x-k8s.io/v1beta2
kind: ResourceFlavor
metadata:
  name: "default-flavor"
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: ClusterQueue
metadata:
  name: "cluster-queue"
spec:
  namespaceSelector: {} # match all.
  resourceGroups:
  - coveredResources: ["cpu", "memory"]
    flavors:
    - name: "default-flavor"
      resources:
      - name: "cpu"
        nominalQuota: 9
      - name: "memory"
        nominalQuota: 36Gi
  admissionChecksStrategy:
    admissionChecks:
    - name: sample-multikueue
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: LocalQueue
metadata:
  namespace: "default"
  name: "user-queue"
spec:
  clusterQueue: "cluster-queue"
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: AdmissionCheck
metadata:
  name: sample-multikueue
spec:
  controllerName: kueue.x-k8s.io/multikueue
  parameters:
    apiGroup: kueue.x-k8s.io
    kind: MultiKueueConfig
    name: multikueue-test
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: MultiKueueConfig
metadata:
  name: multikueue-test
spec:
  clusters:
  - multikueue-test-worker1
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: MultiKueueCluster
metadata:
  name: multikueue-test-worker1
spec:
  clusterSource:
    kubeConfig:
      locationType: Secret
      location: worker1-secret
# a secret called "worker1-secret" should be created in the namespace the kueue
# controller manager runs in, holding the kubeconfig needed to connect to the
# worker cluster in the "kubeconfig" key
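Assuming the manifest above is saved as multikueue-setup.yaml (an illustrative name), apply it with:
kubectl apply -f multikueue-setup.yaml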
Upon successful configuration, the created ClusterQueue, AdmissionCheck, and MultiKueueCluster become active.
Run:
kubectl get clusterqueues cluster-queue -o jsonpath="{range .status.conditions[?(@.type == \"Active\")]}CQ - Active: {@.status} Reason: {@.reason} Message: {@.message}{'\n'}{end}"
kubectl get admissionchecks sample-multikueue -o jsonpath="{range .status.conditions[?(@.type == \"Active\")]}AC - Active: {@.status} Reason: {@.reason} Message: {@.message}{'\n'}{end}"
kubectl get multikueuecluster multikueue-test-worker1 -o jsonpath="{range .status.conditions[?(@.type == \"Active\")]}MC - Active: {@.status} Reason: {@.reason} Message: {@.message}{'\n'}{end}"
You should see output like:
CQ - Active: True Reason: Ready Message: Can admit new workloads
AC - Active: True Reason: Active Message: The admission check is active
MC - Active: True Reason: Active Message: Connected
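To verify the setup end to end, you can submit a Job to user-queue and watch it get dispatched to the worker. The following is a minimal sketch; the image and resource requests are illustrative only:
apiVersion: batch/v1
kind: Job
metadata:
  generateName: multikueue-sample-
  namespace: default
  labels:
    kueue.x-k8s.io/queue-name: user-queue
spec:
  template:
    spec:
      containers:
      - name: main
        image: busybox:1.36 # illustrative image
        command: ["sleep", "10"]
        resources:
          requests:
            cpu: "1"
            memory: 200Mi
      restartPolicy: Never
Because the manifest uses generateName, create it with kubectl create -f rather than kubectl apply -f.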
Create a sample setup with TAS
To enable Topology-Aware Scheduling (TAS) in a MultiKueue setup, configure the worker clusters with topology levels and the manager cluster with delayed topology requests.
Worker cluster configuration:
apiVersion: kueue.x-k8s.io/v1beta2
kind: Topology
metadata:
  name: default
spec:
  levels:
  - nodeLabel: kubernetes.io/hostname
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: ResourceFlavor
metadata:
  name: tas-flavor
spec:
  nodeLabels:
    cloud.provider.com/node-group: tas-node
  topologyName: default
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: ClusterQueue
metadata:
  name: worker-cluster-queue
spec:
  namespaceSelector: {}
  resourceGroups:
  - coveredResources: ["cpu", "memory"]
    flavors:
    - name: tas-flavor
      resources:
      - name: cpu
        nominalQuota: 8
      - name: memory
        nominalQuota: 16Gi
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: LocalQueue
metadata:
  namespace: default
  name: user-queue
spec:
  clusterQueue: worker-cluster-queue
Manager cluster configuration:
apiVersion: kueue.x-k8s.io/v1beta2
kind: Topology
metadata:
  name: default
spec:
  levels:
  - nodeLabel: kubernetes.io/hostname
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: ResourceFlavor
metadata:
  name: tas-flavor
spec:
  nodeLabels:
    cloud.provider.com/node-group: tas-node
  topologyName: default
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: MultiKueueConfig
metadata:
  name: multikueue-config
spec:
  clusters:
  - worker1
  - worker2
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: AdmissionCheck
metadata:
  name: multikueue-ac
spec:
  controllerName: kueue.x-k8s.io/multikueue
  parameters:
    apiGroup: kueue.x-k8s.io
    kind: MultiKueueConfig
    name: multikueue-config
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: MultiKueueCluster
metadata:
  name: worker1
spec:
  clusterSource:
    kubeConfig:
      locationType: Secret
      location: worker1-secret
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: MultiKueueCluster
metadata:
  name: worker2
spec:
  clusterSource:
    kubeConfig:
      locationType: Secret
      location: worker2-secret
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: ClusterQueue
metadata:
  name: cluster-queue
spec:
  namespaceSelector: {}
  resourceGroups:
  - coveredResources: ["cpu", "memory"]
    flavors:
    - name: tas-flavor
      resources:
      - name: cpu
        nominalQuota: 16
      - name: memory
        nominalQuota: 32Gi
  admissionChecksStrategy:
    admissionChecks:
    - name: multikueue-ac
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: LocalQueue
metadata:
  namespace: default
  name: user-queue
spec:
  clusterQueue: cluster-queue
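A workload can then request topology-aware placement through annotations on its PodTemplate. A hedged sketch of a Job that asks for all of its pods to land on a single host (image and resource values are illustrative):
apiVersion: batch/v1
kind: Job
metadata:
  generateName: tas-sample-
  namespace: default
  labels:
    kueue.x-k8s.io/queue-name: user-queue
spec:
  parallelism: 2
  completions: 2
  template:
    metadata:
      annotations:
        kueue.x-k8s.io/podset-required-topology: kubernetes.io/hostname
    spec:
      containers:
      - name: main
        image: busybox:1.36 # illustrative image
        command: ["sleep", "10"]
        resources:
          requests:
            cpu: "1"
            memory: 200Mi
      restartPolicy: Never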
For a complete setup guide including local development with Kind, see the Setup MultiKueue with Topology-Aware Scheduling guide.
(Optional) Set up MultiKueue with Open Cluster Management
Open Cluster Management (OCM) is a community-driven project focused on multicluster and multicloud scenarios for Kubernetes apps. It provides a robust, modular, and extensible framework that helps other open source projects orchestrate, schedule, and manage workloads across multiple clusters.
The integration with OCM is an optional solution that enables Kueue users to streamline the MultiKueue setup process, automate the generation of MultiKueue specific Kubeconfig, and enhance multicluster scheduling capabilities. For more details about this solution, please refer to this link.
Set up MultiKueue with ClusterProfile API
The ClusterProfile API provides a standardized, vendor-neutral interface for presenting cluster information. It allows defining cluster access information in a standardized ClusterProfile object and using credential plugins for authentication.
Enable MultiKueueClusterProfile feature gate
Enable the MultiKueueClusterProfile feature gate. Refer to the Installation guide for instructions on configuring feature gates.
Create ClusterProfile objects
If you are using a cloud provider, refer to the documentation on how to generate ClusterProfile objects (e.g. GKE). Alternatively, you can manually install the ClusterProfile CRD and objects for your clusters.
To install the ClusterProfile CRD, run:
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/cluster-inventory-api/refs/heads/main/config/crd/bases/multicluster.x-k8s.io_clusterprofiles.yaml
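You can confirm the CRD is available before creating objects:
kubectl get crd clusterprofiles.multicluster.x-k8s.io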
To create a ClusterProfile object for worker1-cluster, apply a manifest like the following:
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ClusterProfile
metadata:
  name: worker1-cluster
  namespace: kueue-system
spec:
  ...
status:
  accessProviders:
  - name: ${PROVIDER_NAME}
    cluster:
      server: https://${SERVER_ENDPOINT}
      certificate-authority-data: ${CERTIFICATE_AUTHORITY_DATA}
Configure Kueue Manager
Next, configure the controller manager config map with the credentials providers.
apiVersion: v1
data:
  controller_manager_config.yaml: |
    ...
    multiKueue:
      clusterProfile:
        credentialsProviders:
        - name: ${PROVIDER_NAME}
          execConfig:
            apiVersion: client.authentication.k8s.io/v1beta1
            command: /plugins/${PLUGIN_COMMAND}
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/name: kueue
    control-plane: controller-manager
  name: kueue-manager-config
  namespace: kueue-system
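After updating the ConfigMap, the controller manager must pick up the new configuration; with a standard installation, one way (a hedged suggestion) is to restart the deployment:
kubectl rollout restart deployment/kueue-controller-manager -n kueue-system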
Install Required Plugins
If your credentials provider requires an executable plugin, you must make it available to the Kueue manager.
Add plugins via volume mounts
You can use an initContainer to add the plugin to a shared emptyDir volume before the Kueue manager starts. The kueue-controller-manager container can then mount this volume to access the plugin.
Here is an example patch for the kueue-controller-manager deployment that adds a custom authentication plugin:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kueue-controller-manager
  namespace: kueue-system
spec:
  template:
    spec:
      initContainers:
      - name: add-auth-plugin
        image: ${PLUGIN_IMAGE}
        command: ["cp", "${PLUGIN_COMMAND}", "/plugins/${PLUGIN_COMMAND}"]
        volumeMounts:
        - name: clusterprofile-plugins
          mountPath: "/plugins"
      containers:
      - name: manager
        volumeMounts:
        - name: clusterprofile-plugins
          mountPath: "/plugins"
      volumes:
      - name: clusterprofile-plugins
        emptyDir: {}
This patch does the following:
- Adds an initContainer that copies the ${PLUGIN_COMMAND} from its container image to the /plugins directory in the shared volume.
- Adds an emptyDir volume named clusterprofile-plugins to the pod.
- Mounts the clusterprofile-plugins volume to the manager container, making the plugin available at /plugins/${PLUGIN_COMMAND}.
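Assuming the patch above is saved as plugin-patch.yaml (an illustrative file name), one way to apply it is:
kubectl patch deployment kueue-controller-manager -n kueue-system --patch-file plugin-patch.yaml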
Build a custom image
Alternatively, you can build a custom Kueue manager image that includes your plugin. You would then update your Kueue deployment to use this new image.
Configure MultiKueueCluster objects
When using the ClusterProfile API for authentication, configure your MultiKueueCluster objects to reference a ClusterProfile via the clusterProfileRef field, instead of providing kubeconfig directly.
Here’s an example MultiKueueCluster object using a clusterProfileRef:
apiVersion: kueue.x-k8s.io/v1beta2
kind: MultiKueueCluster
metadata:
  name: worker1-cluster
spec:
  clusterSource:
    clusterProfileRef:
      name: worker1-cluster
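As with the Secret-based setup, you can confirm that the cluster becomes active:
kubectl get multikueuecluster worker1-cluster -o jsonpath="{range .status.conditions[?(@.type == \"Active\")]}MC - Active: {@.status} Reason: {@.reason} Message: {@.message}{'\n'}{end}"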