CSI: Container Storage Interface and how to go about it

Ahmedulde
5 min read · Jan 20, 2021


The Container Storage Interface (CSI) reached general availability in the Kubernetes v1.13 release, announced in early 2019. Since then, CSI has opened up the possibility of truly cloud-native storage, and Kubernetes users no longer have to worry about volume plugins. Before CSI, Kubernetes provided a powerful volume plugin system that was "in-tree," meaning the plugin code was part of the core Kubernetes code and shipped with the core Kubernetes binaries; vendors wanting to add support for their storage system to Kubernetes (or even fix a bug in an existing volume plugin) were forced to align with the Kubernetes release process. Various vendors and contributors came together to build the Container Storage Interface to solve this problem.

There are different container orchestrators such as Kubernetes, OpenShift, Mesosphere, and Docker, and different storage providers such as Amazon EBS, Azure Disk, Azure Files, GCP Persistent Disk, Ceph, and Portworx. The Container Storage Interface sits between the two to decouple the storage providers from the container workloads. CSI acts as the control plane for the actual storage (the data plane): through it you can create and delete volumes, mount and unmount volumes, create snapshots, and perform backups and recovery.
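
To make the consumption side concrete, here is a minimal client-go sketch (Go) that registers a StorageClass backed by a CSI driver. The driver name ebs.csi.aws.com (the AWS EBS CSI driver) and the class name csi-demo are illustrative assumptions; substitute whichever driver your cluster actually runs.

```go
package main

import (
	"context"
	"log"

	corev1 "k8s.io/api/core/v1"
	storagev1 "k8s.io/api/storage/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a clientset from the local kubeconfig (~/.kube/config).
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	// The StorageClass delegates provisioning to a CSI driver; volumes
	// are created lazily once a pod actually needs them.
	reclaim := corev1.PersistentVolumeReclaimDelete
	binding := storagev1.VolumeBindingWaitForFirstConsumer
	sc := &storagev1.StorageClass{
		ObjectMeta:        metav1.ObjectMeta{Name: "csi-demo"},
		Provisioner:       "ebs.csi.aws.com", // assumed driver; replace as needed
		ReclaimPolicy:     &reclaim,
		VolumeBindingMode: &binding,
	}
	if _, err := client.StorageV1().StorageClasses().Create(context.TODO(), sc, metav1.CreateOptions{}); err != nil {
		log.Fatal(err)
	}
	log.Println("StorageClass csi-demo created")
}
```

Any PersistentVolumeClaim that references this class triggers a CreateVolume call on the driver's controller service, and the node service later stages and mounts the volume on the right host (a claim sketch appears in the OpenEBS section below).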

Some important design choices for CSI: an out-of-tree structure (rather than in-tree) so that drivers do not interfere with the Kubernetes release cycle; a service-based design rather than a CLI-based one, with two sets of APIs for the controller and node services (plus a common identity service); every API is idempotent, for better failure recovery; and the APIs are synchronous and communicate over gRPC, which best supported the overall goals.
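
As a rough sketch of that service-based design, the snippet below stands up the Identity service that every CSI driver exposes over a local Unix socket, assuming the Go bindings published in the CSI spec repository (github.com/container-storage-interface/spec/lib/go/csi); the driver name, version, and socket path are placeholders.

```go
package main

import (
	"context"
	"log"
	"net"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc"
)

// identityServer answers the discovery RPCs that the kubelet and the
// CSI sidecars use to find out who the driver is and whether it is healthy.
type identityServer struct{}

func (s *identityServer) GetPluginInfo(ctx context.Context, req *csi.GetPluginInfoRequest) (*csi.GetPluginInfoResponse, error) {
	return &csi.GetPluginInfoResponse{
		Name:          "example.csi.demo.io", // placeholder driver name
		VendorVersion: "0.1.0",
	}, nil
}

func (s *identityServer) GetPluginCapabilities(ctx context.Context, req *csi.GetPluginCapabilitiesRequest) (*csi.GetPluginCapabilitiesResponse, error) {
	// Empty here; a real driver would advertise CONTROLLER_SERVICE and
	// the other capabilities it implements.
	return &csi.GetPluginCapabilitiesResponse{}, nil
}

func (s *identityServer) Probe(ctx context.Context, req *csi.ProbeRequest) (*csi.ProbeResponse, error) {
	return &csi.ProbeResponse{}, nil
}

func main() {
	// CSI traffic flows over a local Unix domain socket, not the network.
	lis, err := net.Listen("unix", "/tmp/csi.sock")
	if err != nil {
		log.Fatal(err)
	}
	srv := grpc.NewServer()
	csi.RegisterIdentityServer(srv, &identityServer{})
	// A complete driver also registers its Controller and Node services
	// here (csi.RegisterControllerServer / csi.RegisterNodeServer), each
	// with idempotent handlers such as CreateVolume and NodePublishVolume.
	log.Fatal(srv.Serve(lis))
}
```

Idempotency matters because the sidecars retry failed calls; a repeated CreateVolume for the same name must converge on the same volume rather than creating a duplicate.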

Architecture diagram: how the CSI plugin works.

You need to build on these interfaces to achieve the higher goal of running and managing your storage at hyper-scale. That's where other vendors and CNCF projects become important. If possible, stay vendor-neutral at the storage level: prefer open-source, community-driven projects, or architect your stack so that no single vendor becomes a bottleneck.

CNCF-hosted projects that can be used to further extend CSI capabilities are described below:

Rook consists of three architectural layers: the operator, storage provisioning, and the data layer. The Rook operator owns the management of the storage provider; the storage provisioning layer is where the CSI driver connects client pods to the storage itself; and the data layer is the storage provider or storage daemons themselves (block, file, object, databases, etc.). This allows you to run different storage providers independently in the same cluster. Currently Rook supports Ceph, EdgeFS (deprecated), Cassandra, CockroachDB, NFS, YugabyteDB, and Apache Ozone.

Another example is OpenEBS (backed by MayaData), which lets you create a storage pool (a cStor pool) from which workloads can claim storage. It is aimed mainly at SREs, storage admins, and app developers. You can use OpenEBS for use cases like stateful apps, CI/CD pipelines, deployments and other integrations, multi-cloud, and hyper-scale scenarios. The project aims to turn Kubernetes itself into a data plane, with future releases planned around chaos engineering, a UI, cross-site management, and more. OpenEBS supports a Velero plugin, so you can leverage Velero's capabilities for backup and recovery.
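
As a sketch of how a workload claims storage from such a pool (same client-go pattern as before), the snippet below creates a PersistentVolumeClaim against an assumed StorageClass named openebs-cstor; the class name, namespace, and size are placeholders for whatever your OpenEBS installation defines.

```go
package main

import (
	"context"
	"log"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	// "openebs-cstor" is a placeholder for the class your cStor pool exposes.
	className := "openebs-cstor"
	pvc := &corev1.PersistentVolumeClaim{
		ObjectMeta: metav1.ObjectMeta{Name: "app-data", Namespace: "default"},
		Spec: corev1.PersistentVolumeClaimSpec{
			AccessModes:      []corev1.PersistentVolumeAccessMode{corev1.ReadWriteOnce},
			StorageClassName: &className,
			// Note: on k8s.io/api v0.29+ this field's type is
			// corev1.VolumeResourceRequirements instead.
			Resources: corev1.ResourceRequirements{
				Requests: corev1.ResourceList{
					corev1.ResourceStorage: resource.MustParse("5Gi"),
				},
			},
		},
	}
	if _, err := client.CoreV1().PersistentVolumeClaims("default").Create(context.TODO(), pvc, metav1.CreateOptions{}); err != nil {
		log.Fatal(err)
	}
	log.Println("PVC app-data created; the pool's CSI driver will provision a volume for it")
}
```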

Longhorn (supported by Rancher) is a lightweight, reliable, and easy-to-use distributed block storage system for Kubernetes. With Longhorn you can:

  • Use Longhorn volumes as persistent storage for the distributed stateful applications in your Kubernetes cluster (a small pod sketch follows this list)
  • Partition your block storage into Longhorn volumes so that you can use Kubernetes volumes with or without a cloud provider
  • Replicate block storage across multiple nodes and data centers to increase availability
  • Store backup data in external storage such as NFS or AWS S3
  • Create cross-cluster disaster recovery volumes so that data from a primary Kubernetes cluster can be quickly recovered from backup in a second Kubernetes cluster
  • Schedule recurring snapshots of a volume, and schedule recurring backups to NFS or S3-compatible secondary storage
  • Restore volumes from backup
  • Upgrade Longhorn without disrupting persistent volumes
  • Manipulate Longhorn resources with kubectl
  • Schedule multiple replicas across multiple compute or storage hosts

Longhorn simplifies distributed block storage by running it as microservices, ships with a standalone UI, and can be installed using Helm, kubectl, or the Rancher app catalog.
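
A minimal sketch of the first bullet above, assuming a PVC named app-data already bound to a Longhorn-provisioned (or any other CSI-provisioned) volume; the image and mount path are placeholders.

```go
package main

import (
	"context"
	"log"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	// The pod mounts the claim; the CSI node service attaches and mounts
	// the underlying block device on whichever node the pod lands on.
	pod := &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "stateful-demo", Namespace: "default"},
		Spec: corev1.PodSpec{
			Containers: []corev1.Container{{
				Name:  "app",
				Image: "nginx:1.21", // placeholder workload
				VolumeMounts: []corev1.VolumeMount{{
					Name:      "data",
					MountPath: "/usr/share/nginx/html",
				}},
			}},
			Volumes: []corev1.Volume{{
				Name: "data",
				VolumeSource: corev1.VolumeSource{
					PersistentVolumeClaim: &corev1.PersistentVolumeClaimVolumeSource{
						ClaimName: "app-data", // assumed existing Longhorn-backed PVC
					},
				},
			}},
		},
	}
	if _, err := client.CoreV1().Pods("default").Create(context.TODO(), pod, metav1.CreateOptions{}); err != nil {
		log.Fatal(err)
	}
	log.Println("pod stateful-demo created with the volume mounted")
}
```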

ChubaoFS consists of a metadata subsystem, a data subsystem, and a resource manager, and can be accessed by different clients (as a set of application processes) hosted in containers through different file system instances called volumes. Some notable features of ChubaoFS are:

  • Scalable metadata management
  • General-purpose storage engine
  • Strong replication consistency
  • Relaxed POSIX semantics and metadata atomicity

MinIO is a high-performance object store released under the Apache License v2.0. It is API-compatible with the Amazon S3 cloud storage service. Use MinIO to build high-performance infrastructure for machine learning, analytics, and application data workloads. If you would like to run your application across various cloud providers, OpenEBS makes more sense, and OpenEBS can also help you run MinIO if your application state still needs it.
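
To show what that S3 compatibility looks like from application code, here is a small sketch using the minio-go SDK; the endpoint, credentials, bucket name, and file path are all placeholders.

```go
package main

import (
	"context"
	"log"

	"github.com/minio/minio-go/v7"
	"github.com/minio/minio-go/v7/pkg/credentials"
)

func main() {
	ctx := context.Background()

	// Point this at your MinIO deployment (or any S3-compatible endpoint).
	client, err := minio.New("minio.example.local:9000", &minio.Options{
		Creds:  credentials.NewStaticV4("ACCESS_KEY", "SECRET_KEY", ""),
		Secure: false,
	})
	if err != nil {
		log.Fatal(err)
	}

	// Create a bucket and upload a local file as an object.
	if err := client.MakeBucket(ctx, "ml-datasets", minio.MakeBucketOptions{}); err != nil {
		log.Fatal(err)
	}
	info, err := client.FPutObject(ctx, "ml-datasets", "train.csv", "/tmp/train.csv", minio.PutObjectOptions{})
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("uploaded %s (%d bytes)", info.Key, info.Size)
}
```

Because the API is S3-compatible, the same client code can be pointed at Amazon S3 or another S3 endpoint just by changing the connection options.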

Velero (also part of VMware Tanzu) is an open-source tool to safely back up and restore, perform disaster recovery, and migrate Kubernetes cluster resources and persistent volumes from one cluster to another, or to replicate your production cluster to development and testing clusters.

Other solutions include object-based storage such as Manta (Triton's object storage and converged analytics solution) and GlusterFS, a scalable network filesystem suitable for data-intensive tasks such as cloud storage and media streaming; you can read more about them in the references. Some other notable vendors, such as StorageOS and Trilio, provide solutions for day-to-day storage operations.

There is no silver bullet that answers every situation or covers every use case, but this blog should give you an idea of how to think about a storage solution for running your workloads on Kubernetes.

References:

https://docs.gluster.org/en/latest/

https://kubernetes.io/blog/2018/01/introducing-container-storage-interface/

https://docs.google.com/document/d/1ayeALoU5jrO5x96N7bqXmLx0O-rAIh2HllZBgtYwz3Q/edit#


Written by Ahmedulde

Trusted Advisor, Mentor — Cloud and Devops SME
