Kubernetes Series

Next, let’s look at how storage works in Kubernetes and at the storage options it offers.

Volumes in Kubernetes

Since Pods are ephemeral, we sometimes need the data on one Pod to be available to the next Pod that is scheduled. At other times, the containers in a Pod need to share files. Volumes address both problems.

If you are familiar with volumes in Docker, the concept is similar in Kubernetes, but more advanced. Kubernetes offers different volume types, so you can choose one that fits your app’s needs.

There are many types of volumes for different use cases and for different platforms. You may find the entire list here: https://kubernetes.io/docs/concepts/storage/volumes/#volume-types

There are a few options for using the local file system for volumes, such as local, hostPath, and emptyDir. To cover the basics of the concept, I’ll use emptyDir.

An emptyDir volume is first created when a Pod is assigned to a node and exists as long as that Pod is running on that node. As the name says, the emptyDir volume is initially empty. All containers in the Pod can read and write the same files in the emptyDir volume, though that volume can be mounted at the same or different paths in each container. When a Pod is removed from a node for any reason, the data in the emptyDir is deleted permanently.

Let’s create our nginx pod once more with an emptyDir volume:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-emptydir
spec:
  containers:
  - image: nginx
    name: nginx
    volumeMounts:
    - mountPath: /tmp
      name: tmp-volume
  volumes:
  - name: tmp-volume
    emptyDir: {}

This YAML mounts an emptyDir volume, an initially empty directory on the node’s local file system, at /tmp in the nginx container.

kubectl apply -f emptydir.yaml

If we describe the running Pod, the mounted path is listed under Mounts.
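
The sharing behavior described earlier, where all containers in a Pod read and write the same files at possibly different mount paths, can be sketched with two containers. This is a minimal illustration, not part of the original setup: the Pod name shared-emptydir, the writer container, and the busybox image are assumptions chosen for the example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-emptydir            # hypothetical name
spec:
  containers:
  # Writes a fresh index.html into the shared volume every 5 seconds.
  - name: writer
    image: busybox
    command: ["sh", "-c", "while true; do date > /data/index.html; sleep 5; done"]
    volumeMounts:
    - name: shared-data
      mountPath: /data
  # Serves the same files, mounted at a different path.
  - name: nginx
    image: nginx
    volumeMounts:
    - name: shared-data
      mountPath: /usr/share/nginx/html
  volumes:
  - name: shared-data
    emptyDir: {}
```

Both containers see the same directory, so the page served by nginx changes as the writer updates the file. When the Pod is removed from the node, the data is gone.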

Kubernetes Storage Classes

A StorageClass provides a way for administrators to describe the “classes” of storage they offer.

Each StorageClass contains the fields provisioner, parameters, and reclaimPolicy, which are used when a PersistentVolume belonging to the class needs to be dynamically provisioned.

Basically, with a storage class, you can define the storage that you are going to use for mounting a volume (PersistentVolume).

In the provisioner field, you specify the volume plugin that provisions the storage, which depends on your storage setup. For instance, if you use AWS EBS for storage, you name the EBS provisioner (ebs.csi.aws.com for the AWS EBS CSI driver). Check your storage provider’s documentation for the exact provisioner name and the parameters it supports. The provisioner field is required.

The reclaimPolicy field defines what happens to a dynamically provisioned volume once the user is done with it. There are two options: Delete, which deletes the volume, and Retain, which keeps it. The default is Delete if nothing is specified.

Another important field when creating a storage class is volumeBindingMode. The default mode is Immediate, which means that volume binding and dynamic provisioning occur as soon as the PersistentVolumeClaim is created. The other mode is WaitForFirstConsumer. In this mode, binding and provisioning of a PersistentVolume are delayed until a Pod that uses the PersistentVolumeClaim is created.
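
To make these fields concrete, here is a minimal sketch of a StorageClass for dynamic provisioning. It assumes the AWS EBS CSI driver is installed in the cluster; the class name ebs-gp3-retain and the gp3 volume type are illustrative choices, not something taken from the setup used in this series.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3-retain             # hypothetical name
provisioner: ebs.csi.aws.com       # assumes the AWS EBS CSI driver is installed
parameters:
  type: gp3                        # provisioner-specific parameter (EBS volume type)
reclaimPolicy: Retain              # keep the volume after its claim is deleted
volumeBindingMode: WaitForFirstConsumer   # provision only once a Pod needs the claim
```

The parameters block is passed to the provisioner as-is, so its valid keys differ from one provisioner to another.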

As you can see, dynamic provisioning of a PersistentVolume generally happens (at least in production) in non-local environments such as the cloud. For local usage, we typically provision PersistentVolumes statically. A sample local storage class, which does no dynamic provisioning, looks like this:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer

To list the storage classes, use either of the following commands:

kubectl get storageclass

kubectl get sc

Persistent Volumes

A PersistentVolume is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes.

In the emptyDir example, we saw that the volume exists only as long as the Pod exists. But the entire idea of Kubernetes is to have highly available systems, so we need a volume that exists independently of any particular Pod. This is what PersistentVolumes are for.

You may statically or dynamically provision PersistentVolumes.

For static provisioning, you provide the details of the real storage yourself. For dynamic provisioning, a storage class must be created (see the section above).

As an example, let’s create a PersistentVolume backed by hostPath. Please note that hostPath is not advised for production; use it only for development and testing.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: nginx-pv
spec:
  storageClassName: standard
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"

To list the existing persistent volumes, use either of the following commands:

kubectl get persistentvolumes

kubectl get pv
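
For comparison, here is a hedged sketch of a statically provisioned PersistentVolume that uses the local volume type together with the local-storage class shown earlier. The volume name, the path /mnt/disks/ssd1, and the node name example-node are placeholders for illustration; a local volume must declare nodeAffinity so the scheduler knows which node holds the data.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-example           # hypothetical name
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd1          # placeholder path on the node
  nodeAffinity:                    # required for local volumes
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - example-node           # placeholder node name
```

Unlike hostPath, a local volume ties Pods that use it to the node named in nodeAffinity, which is why it pairs well with WaitForFirstConsumer.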

PersistentVolumeClaims

A Pod does not use a PersistentVolume directly. Instead, a PersistentVolumeClaim (PVC) requests storage with a given size and access mode, and Kubernetes binds the claim to a matching PersistentVolume. The binding is one-to-one: a PersistentVolume is bound to a single claim, even if the claim requests less than the volume’s capacity.

Important parameters when creating a PVC are:

accessModes: A PersistentVolume can be mounted on a host in any way supported by the resource provider. The access modes are:

ReadWriteOnce — the volume can be mounted as read-write by a single node
ReadOnlyMany — the volume can be mounted read-only by many nodes
ReadWriteMany — the volume can be mounted as read-write by many nodes
ReadWriteOncePod — the volume can be mounted as read-write by a single Pod.

volumeMode: Defaults to Filesystem if not specified, which is the usual file system usage. The other option is Block, which exposes the volume as a raw block device. The application running in the Pod must know how to handle a raw block device.

resources: The capacity that the claim requests is specified here, under resources.requests.storage.

A sample PVC looks like this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nginx-pvc
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
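
As a variation on the claim above, the Block volume mode can be sketched as follows. This is only an illustration under assumptions: the names block-pvc, block-consumer, and block-capable-class are hypothetical, the storage class must be backed by storage that supports raw block volumes (hostPath does not), and /dev/xvda is an arbitrary device path. Note that the Pod uses volumeDevices with a devicePath instead of volumeMounts with a mountPath.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: block-pvc                  # hypothetical name
spec:
  storageClassName: block-capable-class   # hypothetical; must support raw block volumes
  accessModes:
    - ReadWriteOnce
  volumeMode: Block
  resources:
    requests:
      storage: 3Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: block-consumer             # hypothetical name
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    volumeDevices:                 # raw device, not a mounted file system
    - name: raw-disk
      devicePath: /dev/xvda        # arbitrary path inside the container
  volumes:
  - name: raw-disk
    persistentVolumeClaim:
      claimName: block-pvc
```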

To list the existing persistent volume claims, use either of the following commands:

kubectl get persistentvolumeclaims

kubectl get pvc

In this setup, the status of the PVC is “Pending”. That is because the volumeBindingMode of its StorageClass is “WaitForFirstConsumer” and there is no consumer for it yet.

Using a PersistentVolume With a Pod

Let’s now go back to our nginx Pod, this time attaching it to the PersistentVolume we have created.

To do that, we reference the PersistentVolumeClaim in our Pod definition. Below is a sample YAML file:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-pv-pod
spec:
  volumes:
  - name: nginx-storage
    persistentVolumeClaim:
      claimName: nginx-pvc
  containers:
  - name: nginx-container
    image: nginx
    volumeMounts:
    - mountPath: "/usr/share/nginx/html"
      name: nginx-storage

As you can see, under .spec.volumes we name the volume (nginx-storage) and point it to the PVC that we created earlier (nginx-pvc). Then, under volumeMounts in .spec.containers, we use this volume by giving the path where we want it mounted and the name of the volume definition (nginx-storage).

After we observe that Pod is up and running: