Kubernetes Series – 5: Scheduling in Kubernetes

In this article, we will learn about scheduling in Kubernetes. Scheduling refers to matching Pods to Nodes so that the kubelet can run them.

The kube-scheduler watches for newly created Pods (and any other unscheduled Pods) that have no Node assigned, and selects an optimal Node for them to run on.

In a cluster, Nodes that meet the scheduling requirements for a Pod are called feasible nodes. If none of the nodes are suitable, the Pod remains unscheduled until the scheduler is able to place it.

The scheduler first finds the feasible Nodes for a Pod, then runs a set of scoring functions on them and picks the Node with the highest score to run the Pod. The scheduler then notifies the API server about this decision in a process called binding.
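
Once a Pod has been bound, you can see the result of this decision among the cluster events, for example by filtering on the Scheduled reason (the output will of course depend on what is running in your cluster):

kubectl get events --field-selector reason=Scheduled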

Factors that need to be taken into account for scheduling decisions include individual and collective resource requirements, hardware/software/policy constraints, affinity and anti-affinity specifications, data locality, inter-workload interference, and so on.

What if we want to select the node on which our objects will run? There are a few different ways to run a Kubernetes object on a specific node.

Let’s check our existing nodes once again by:

kubectl get nodes


I’ve created my k8s cluster using kind, hence the names of my nodes all start with ‘kind’.

Manual Scheduling

We can manually assign a Pod to a node by using the “nodeName” field under spec. For instance, let’s create our beloved nginx pod once again, this time assigning a node name.

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  nodeName: kind-worker2        
  containers:
  - name: nginx          
    image: nginx

Let’s create the nginx pod again with this updated YAML file by:

kubectl apply -f nginx.yaml


First, let’s check if the pod is up and running by:
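
kubectl get pods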

Now let’s describe the pod to see which node was used to run it by:

kubectl describe pod nginx


As you can see, the pod has been created on the “kind-worker2” node.
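
If we only want the node name, it can also be read straight from the pod spec (assuming the same pod name, nginx):

kubectl get pod nginx -o jsonpath='{.spec.nodeName}'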

Labeling and Selectors

Another way of running an object on a specific node (or set of nodes) is to label the node. After we label the node, we need to state in our object definition that we want to select a node with that label. With this approach, more than one node might carry the same label, so our Kubernetes object may run on any of those nodes.

For labeling a node (you may change the key-value pair as you wish):

kubectl label nodes kind-worker nginx=pod


If we describe the node, we will observe the label under “Labels”:
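
kubectl describe node kind-worker

Alternatively, kubectl get nodes --show-labels lists every node together with its labels on a single line.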


Let’s create our nginx pod again, this time with a nodeSelector under spec.

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx          
    image: nginx
  nodeSelector:
    nginx: "pod"

Once we apply this file, we will observe that our pod has started on the “kind-worker” node.

kubectl get pods -o wide


Node Affinity

Node affinity is similar to nodeSelector, as they both use labels for selecting the node that an object will run on. There are two types of node affinity as of the time this article was written:

  • requiredDuringSchedulingIgnoredDuringExecution
  • preferredDuringSchedulingIgnoredDuringExecution

The first, requiredDuringSchedulingIgnoredDuringExecution, is used when the label must exist on the node for the k8s object to be scheduled there; if no node matches, the object stays unscheduled. The second, preferredDuringSchedulingIgnoredDuringExecution, only increases the chance of the k8s object being scheduled on a matching node: it will run on such a node if possible, but the scheduler will still place it elsewhere if no node matches.

Let’s see how they both work in action. I will create a label for each of my worker nodes.

kubectl label node kind-worker must=label

kubectl label node kind-worker2 nicetohave=label


Now let’s rebuild our favorite app of all time, nginx, once again, this time with nodeAffinity:

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: must
            operator: In
            values:
            - label
  containers:
  - name: nginx          
    image: nginx

Once I apply this file, we will observe that it runs on the kind-worker node.
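
We can verify this with:

kubectl get pod nginx -o wide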


Let’s try the same thing with the preferred type:

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: nicetohave
            operator: In
            values:
            - label
  containers:
  - name: nginx          
    image: nginx
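
Note that, unlike the required type, the preferred type takes a list of weighted preferences rather than nodeSelectorTerms; the weight (a number between 1 and 100) is added to a node’s score whenever the expression matches, nudging the scheduler toward that node.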

Now if we check, since there are no other objects being scheduled or running, chances are it will be scheduled on the node that has the label nicetohave=label.


Resource Management

Another important factor when scheduling a pod is resources. Imagine you have containerized an application that requires a certain amount of CPU and memory. You might then want to request a certain amount of each for its container.

Let’s see how we can do it by running our nginx app once again, this time requesting resources from the node and limiting the resources the container may use.

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"

Once we apply this file, let’s describe the pod once more by:

kubectl describe pod nginx


As we can see, the nginx container is now limited to 500m CPU and 128Mi of memory, and requests 250m CPU and 64Mi of memory.
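
As a side note, a node’s describe output has an “Allocated resources” section that shows how much of its capacity is already requested; this is what the scheduler compares new requests against. Replace the node name below with whichever node your pod landed on:

kubectl describe node kind-worker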

Taints and Tolerations

Until this point, we have only talked about cases where we want to run a Kubernetes object on a specific node. But what if we want a node to repel pods? If that’s the case, we need to use taints.

Similar to node affinity, taints work with key-value pairs. This time we need to taint a node with a key-value pair and an effect. For instance:

kubectl taint nodes kind-worker taint=node:NoSchedule
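
If you want to double-check that the taint landed, it shows up under the “Taints” field of the node’s describe output (the grep pipe assumes a Unix-like shell; plain kubectl describe node kind-worker works everywhere):

kubectl describe node kind-worker | grep Taints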


This way I have tainted my first worker node (kind-worker) with the key-value pair taint=node and the effect NoSchedule. This means that no pod will be scheduled on this node unless it tolerates the taint. Let’s see it in action.

kubectl run nginx --image=nginx

Once again I am running nginx and expecting it to be scheduled on my second worker node.
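
We can check which node it landed on with:

kubectl get pods -o wide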


How do we allow a Kubernetes object to run on that tainted node, then?

For a k8s object to tolerate a tainted node, you need to add a toleration to it (.spec.tolerations). An example of this:

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx          
    image: nginx
  tolerations:
  - key: "taint"
    operator: "Equal"
    value: "node"
    effect: "NoSchedule"

With this, a toleration is added to the Pod so that it can run on a node tainted with the same key, value, and effect.

Now we see it is running on the tainted node.

An important note here: a toleration only allows a pod to be scheduled on a tainted node, it doesn’t force it there. If I had two or more nodes with not much running on them, the scheduler would probably still choose an untainted node. To make the nginx app land on the tainted node it tolerates, I removed my other worker node.
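
By the way, if you want to undo the experiment, a taint can be removed by repeating the taint command with a trailing dash:

kubectl taint nodes kind-worker taint=node:NoSchedule-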

That’s all about scheduling, let’s move on to the next episode!

Kubernetes Series

Thanks for reading,
Ege Aksoz
