
Kubernetes storage and CSI drivers introduction

Kubernetes storage

Kubernetes provides persistent storage using volumes.

Entities in Kubernetes (call them objects, if you like) are represented via YAML specifications. You define the object as a YAML file holding the specification you need and apply it to the cluster. The Kubernetes API handles the incoming request, and this triggers the K8s machinery to converge the cluster state to the state defined in the specification file via controllers or operators.

From the perspective of the Kubernetes API, three important storage objects are:

  • PersistentVolume
  • PersistentVolumeClaim
  • StorageClass

A PersistentVolume (PV) creates a volume to be used by pods via PersistentVolumeClaims (PVC). PersistentVolumes can be created statically, or dynamically via StorageClass (SC) objects.

A PVC requests an amount of storage from a PV and reserves it for itself.
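As a sketch of how this fits together, here is a StorageClass and a PVC that would dynamically provision a PV (the provisioner name, class name, and size are placeholders, not tied to any real driver):

```yaml
# Hypothetical StorageClass; the provisioner value depends on your CSI driver.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: example.csi.vendor.com    # placeholder provisioner name
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
---
# PVC that reserves 5Gi from a dynamically provisioned PV.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 5Gi
```

With `WaitForFirstConsumer`, the PV is only provisioned once a pod actually uses the claim, which lets the scheduler pick a suitable node first.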

From the perspective of the Pod, you can define the set of storage to be used via volumes and volumeMounts (not to be mistaken with PV or PVC).

volumes defines the set of volumes which can be referenced in the volumeMounts section. Each volume also declares its type, which can be:

  • persistentVolumeClaim
  • hostPath
  • emptyDir (ephemeral)
  • configMap
  • secret
  • and others
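A minimal pod sketch tying volumes and volumeMounts together (the image, claim, and ConfigMap names are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: nginx                  # placeholder image
      volumeMounts:
        - name: data                # must match a volume name below
          mountPath: /var/lib/data
        - name: config
          mountPath: /etc/app
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-claim       # hypothetical PVC name
    - name: config
      configMap:
        name: app-config            # hypothetical ConfigMap name
```

Note that volumes only declares what is available to the pod; nothing is visible inside a container until a volumeMount references the volume by name.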

All these types are part of the in-tree Kubernetes volumes. Most other volume types have by now been migrated to CSI plugins, which give external providers the ability to create their own plugins usable in the volumes section. More on this later on.

To have more granularity over which directory is used as the root of the mount, you can set the subPath field on a volumeMount. This way a single volume can hold data for different pods while the boundaries between them are respected. In one sentence: subPath specifies which directory within the volume is mounted as the root dir. Do not confuse this with the directory in the pod where you want to mount the volume; that is mountPath.
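Here is a sketch of subPath in action: two containers share one claim, but each sees only its own directory of the volume (the images and claim name are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-volume-pod
spec:
  containers:
    - name: mysql
      image: mysql                  # placeholder image
      volumeMounts:
        - name: shared
          mountPath: /var/lib/mysql
          subPath: mysql            # only the "mysql" dir of the volume is mounted
    - name: php
      image: php:apache             # placeholder image
      volumeMounts:
        - name: shared
          mountPath: /var/www/html
          subPath: html             # only the "html" dir of the volume is mounted
  volumes:
    - name: shared
      persistentVolumeClaim:
        claimName: shared-claim     # hypothetical PVC name
```

Both containers write to the same underlying volume, yet neither can step on the other's data.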

From the perspective of the container, a volume is just a standard block device mounted as a directory. Users can interact with the directory via the OS file I/O interfaces, and all the changes will be persisted.

Another type of storage available to the containers inside the pod is ephemeral storage. That is the topmost layer of the container filesystem, the only one that is writable (read more about overlayfs, for example). What this means is that all changes to that layer are gone after the container is restarted or deleted.

The emptyDir volume type is an example of ephemeral storage, which is in fact what happens behind the scenes when the pod is created. Read up more on emptyDir.

By default, an emptyDir volume will mostly appear in the pod as overlayfs. There is no limit on how much of this storage a pod can use; it consumes storage from the node where the kubelet is placed. I suggest that you read the emptyDir docs to see how to set constraints on ephemeral storage.
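One such constraint is sizeLimit; a minimal sketch (image and size are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scratch-pod
spec:
  containers:
    - name: app
      image: busybox                # placeholder image
      command: ["sleep", "3600"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumes:
    - name: scratch
      emptyDir:
        sizeLimit: 1Gi              # cap on ephemeral usage for this volume
```

If the pod writes past the limit, the kubelet can evict it, so the cap actually protects the node's storage.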

Kubernetes Out-of-tree plugins

All of the volume plugins were compiled into the Kubernetes binary when the project started. As the project evolved, the community found it inconvenient that the whole binary had to be rebuilt to introduce support for a new volume type. The decision was therefore made to expose the volume interface so that external parties can create their own volume plugins and expose them to Kubernetes for usage.

If you don't know what I am talking about, let me explain it briefly. Say you decide to use persistent storage in a pod and write a YAML definition that uses a PV, PVC, volume, and volumeMounts. Magically, the container has the volume mounted in the specified directory. Behind the scenes, Kubernetes employed the machinery for you to provision this volume, mount it, use it, and unmount it.

Which part of the machinery was employed for the job depends on the type of volume used. All of this machinery used to be part of the Kubernetes source code; it has now been externalized into CSI drivers. These externalized CSI drivers need to know how to talk to Kubernetes.

CSI drivers

Container Storage Interface (CSI) defines a standard interface for container orchestration systems (like Kubernetes) to expose arbitrary storage systems to their container workloads.
- https://kubernetes.io/docs/concepts/storage/volumes/#out-of-tree-volume-plugins

CSI drivers, when deployed on Kubernetes, can be used to create persistent storage backed by the CSI provider's storage. Examples are Azure disks using managed disks, or Azure file storage using the NFS or SMB protocols.

When the CSI driver is deployed on Kubernetes, users can attach the exposed volumes via:

  • PersistentVolumeClaim
  • Generic ephemeral volume
  • CSI ephemeral volume
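To illustrate the two ephemeral variants, here is a sketch of a pod using both a CSI ephemeral volume and a generic ephemeral volume (the driver name, attributes, and StorageClass are placeholders, not a real driver's API):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ephemeral-volumes-pod
spec:
  containers:
    - name: app
      image: busybox                     # placeholder image
      command: ["sleep", "3600"]
      volumeMounts:
        - name: inline
          mountPath: /inline
        - name: scratch
          mountPath: /scratch
  volumes:
    # CSI ephemeral volume: defined inline, lifetime tied to the pod.
    - name: inline
      csi:
        driver: inline.csi.example.com   # placeholder CSI driver name
        volumeAttributes:
          size: 1Gi                      # driver-specific attribute
    # Generic ephemeral volume: a PVC created and deleted with the pod.
    - name: scratch
      ephemeral:
        volumeClaimTemplate:
          spec:
            accessModes: ["ReadWriteOnce"]
            storageClassName: fast-ssd   # placeholder StorageClass name
            resources:
              requests:
                storage: 1Gi
```

The difference in a nutshell: a CSI ephemeral volume bypasses the PVC machinery entirely, while a generic ephemeral volume goes through normal dynamic provisioning, just with the claim's lifetime bound to the pod.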

To use the CSI driver in the definition of a PersistentVolume, the following fields are available to Kubernetes administrators:

  • driver
  • volumeHandle
  • readOnly
  • fsType
  • volumeAttributes
  • controllerPublishSecretRef
  • nodePublishSecretRef
  • nodeStageSecretRef
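A statically defined PV using these fields might look like this (all driver names, handles, attributes, and secrets are placeholders):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: csi-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  csi:
    driver: disk.csi.example.com     # placeholder CSI driver name
    volumeHandle: vol-0123456789     # unique ID of the volume in the backend
    readOnly: false
    fsType: ext4
    volumeAttributes:
      cachingMode: ReadOnly          # driver-specific attribute
    nodeStageSecretRef:
      name: csi-secret               # hypothetical secret for NodeStageVolume calls
      namespace: kube-system
```

The volumeHandle is whatever the storage backend uses to identify the volume; the driver receives it verbatim in every CSI call for this PV.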

Developers who are interested in building third-party CSI drivers can check out this documentation; what needs to be done on the implementation side of a CSI driver is also described there.

Specifically, the following is dictated by Kubernetes regarding CSI.

Kubelet to CSI Driver Communication

  • Kubelet directly issues CSI calls (like NodeStageVolume, NodePublishVolume, etc.) to CSI drivers via a Unix Domain Socket to mount and unmount volumes.
  • Kubelet discovers CSI drivers (and the Unix Domain Socket to use to interact with a CSI driver) via the kubelet plugin registration mechanism.
  • Therefore, all CSI drivers deployed on Kubernetes MUST register themselves using the kubelet plugin registration mechanism on each supported node.
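In practice, this registration is commonly handled by the node-driver-registrar sidecar running next to the driver on every node. A trimmed DaemonSet sketch, assuming placeholder driver names, images, and socket paths:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: example-csi-node
spec:
  selector:
    matchLabels:
      app: example-csi-node
  template:
    metadata:
      labels:
        app: example-csi-node
    spec:
      containers:
        - name: csi-driver
          image: example/csi-driver:latest    # placeholder driver image
          volumeMounts:
            - name: plugin-dir
              mountPath: /csi                 # driver serves its socket here
        - name: node-driver-registrar
          image: registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.9.0
          args:
            - --csi-address=/csi/csi.sock
            - --kubelet-registration-path=/var/lib/kubelet/plugins/example.csi.vendor.com/csi.sock
          volumeMounts:
            - name: plugin-dir
              mountPath: /csi
            - name: registration-dir
              mountPath: /registration        # kubelet watches this dir
      volumes:
        - name: plugin-dir
          hostPath:
            path: /var/lib/kubelet/plugins/example.csi.vendor.com
            type: DirectoryOrCreate
        - name: registration-dir
          hostPath:
            path: /var/lib/kubelet/plugins_registry
            type: Directory
```

The registrar announces the driver's Unix Domain Socket to the kubelet through the plugin registration directory, after which the kubelet can issue NodeStageVolume and NodePublishVolume calls directly to the driver.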

Master to CSI Driver Communication

  • Kubernetes master components do not communicate directly (via a Unix Domain Socket or otherwise) with CSI drivers.
  • Kubernetes master components interact only with the Kubernetes API.
  • Therefore, CSI drivers that require operations that depend on the Kubernetes API (like volume create, volume attach, volume snapshot, etc.) MUST watch the Kubernetes API and trigger the appropriate CSI operations against it.

Implementing a CSI driver therefore requires handling both the control-plane communication and the kubelet <-> CSI driver communication.

If you wish to dig deeper into the concepts presented here, I suggest the following links for deepening your knowledge: