CSI stands for Container Storage Interface. It is an initiative to unify the storage interface between Container Orchestrator Systems (COS) such as Mesos, Kubernetes, and Docker Swarm and storage vendors such as Ceph and NetApp (Network Appliance Inc., a hybrid cloud data services and data management company). The idea is that a storage vendor implements a single CSI plugin, which then works with every COS that supports CSI. Kubernetes 1.9 introduced an alpha implementation of the Container Storage Interface, which makes installing new volume plugins easier. Before Kubernetes 1.9, third-party storage providers had to add code to the core Kubernetes codebase to deliver their solutions; from Kubernetes 1.9 onward, they can develop solutions without touching the core Kubernetes codebase. CSI version 0.2 was released in March 2018.
Essential Terminologies related to Container Storage Interface (CSI)
- COS – A Container Orchestrator System manages the life cycle of containers, especially in large, dynamic environments. Software teams use it to control and automate many operational tasks.
- Volume – A unit of space that will be made available inside managed containers via the CSI.
- Node – A host where a workload will be running.
- Workload – The atomic unit of “work” anticipated by a COS.
- Plugin – A service that exposes gRPC endpoints.
- gRPC – An open-source remote procedure call (RPC) system initially developed at Google.
- SP – Storage Provider, the vendor of a CSI plugin implementation.
Container Storage Interface Goals
- Interoperability – A storage vendor can build one plugin, and that plugin can be used by all the COs that support CSI.
- Vendor-neutral – The specification should not be tied to any single CO or storage provider.
- Focus on specification
- Control plane only
Before Container Storage Interface (CSI)
Before CSI was introduced, each COS had its own interface that a storage vendor had to implement so that the COS could talk to that SP during the life cycle of a volume. The first release of CSI, version 0.1, came out in December 2017. In the case of Kubernetes, volume plugins served the storage needs of container workloads.
The first problem was in Kubernetes itself: as a storage vendor, you had to write a driver and put it inside the Kubernetes code. That meant you could not ship a fix on your own schedule; you had to wait for the next Kubernetes release and were bound to the Kubernetes release process.
The second problem was that each COS designed its own method and said: “This is our API.” So, as a storage product or storage system, you had to write drivers for three different systems (Mesos, Kubernetes, Docker Swarm).
Developers who wrote plugins were also forced to make the plugin source code available.
Object Storage – Object storage, also called object-based storage, is an approach to addressing and manipulating discrete units of storage called objects. Like files, objects contain data, but unlike files, objects are not organised in a hierarchy. Every object exists at the same level in a flat address space called the storage pool, and one object cannot be placed inside another. Object storage is designed for unstructured data such as media, documents, logs, and backups. The most popular cloud object storage is AWS S3. It is useful for automating and streamlining data storage.
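As a rough illustration of the flat address space described above, the following Python sketch models a storage pool as a single dictionary keyed by object ID. The names `ObjectStore`, `put`, and `get` are hypothetical and chosen only for this example; they do not correspond to the API of S3 or any other real object store.

```python
import uuid
from typing import Optional


class ObjectStore:
    """Toy in-memory object store with a flat address space (hypothetical)."""

    def __init__(self):
        # One flat pool: every object sits at the same level, keyed by its ID.
        self._pool = {}

    def put(self, data: bytes, metadata: Optional[dict] = None) -> str:
        """Store a whole object and return its generated ID."""
        object_id = str(uuid.uuid4())
        self._pool[object_id] = {"data": data, "metadata": metadata or {}}
        return object_id

    def get(self, object_id: str) -> bytes:
        """Return the whole object; there is no nesting and no partial read."""
        return self._pool[object_id]["data"]
```

Note that objects are written and read as a whole, and that no object can contain another; any grouping would have to be expressed through metadata.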
Use cases –
- Storage for unstructured data like music, images, etc.
- Storage for backup files and log files.
- Hybrid cloud storage.
- Disaster Recovery.
Benefits of object storage
- Scalability: Data can keep being added with no practical limit.
- Security and compliance
- Flexible management
Drawbacks – Object stores are designed to read and write entire objects, not small pieces of them. They also cannot guarantee that each request always gets the latest version of an object: after an update, it can take time before every location holding a copy reflects the change.
Block Storage – This kind of storage system is usually located close to the server, in the same data centre. It organises data in separate volumes, each accessed by one or very few servers simultaneously. A volume has the layout of a local hard disk, presented in the form of sectors and tracks. The most common access protocols are Fibre Channel (FC) and iSCSI, and all communication happens on a dedicated storage area network (SAN) based on lossless Ethernet or FC equipment. This type of storage is well suited to databases and virtual machines and, more generally, to workloads that require low latency and high IOPS. Unfortunately, block storage systems also have a very high cost per gigabyte, and most implementations on the market still rely on scale-up, dual-controller designs, whose total capacity can hardly reach the petabyte level while maintaining consistent performance.
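To make the contrast with object storage concrete, here is a minimal Python sketch of a volume addressed by fixed-size blocks. The class `BlockVolume` and its methods are hypothetical and only model the addressing scheme; they say nothing about FC, iSCSI, or how a real SAN behaves.

```python
class BlockVolume:
    """Toy block volume: fixed-size blocks addressed by index (hypothetical)."""

    def __init__(self, num_blocks: int, block_size: int = 512):
        self.num_blocks = num_blocks
        self.block_size = block_size
        # The whole volume is one contiguous, zero-filled byte range.
        self._data = bytearray(num_blocks * block_size)

    def _check(self, index: int) -> int:
        """Validate a block index and return its byte offset."""
        if not 0 <= index < self.num_blocks:
            raise IndexError("block index out of range")
        return index * self.block_size

    def write_block(self, index: int, payload: bytes) -> None:
        """Overwrite exactly one block in place."""
        start = self._check(index)
        if len(payload) != self.block_size:
            raise ValueError("payload must be exactly one block long")
        self._data[start:start + self.block_size] = payload

    def read_block(self, index: int) -> bytes:
        """Read exactly one block."""
        start = self._check(index)
        return bytes(self._data[start:start + self.block_size])
```

Unlike the whole-object access of an object store, a single small block can be rewritten in place, which is one reason this model suits databases.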
Use cases –
- Structured database storage.
- Application using server-side processing.
Drawbacks – Limited scalability drives up complexity and cost in large-capacity scenarios, which makes block storage complex to manage at scale.
Before CSI, every COS had its own way of handling storage. Docker has DVDI, and Kubernetes has FlexVolume. Each of these implementations has its own design criteria, which leads to incompatible APIs and to plugins with varying degrees of reliability and quality.
The problems with these interfaces are as follows –
- CLI based interface
- Lack of idempotency on APIs
- In-tree interface
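To show what the idempotency point above means in practice, the sketch below models a create call that is safe to repeat: calling it twice with the same name and size returns the same volume instead of creating a second one. This is a plain-Python illustration under stated assumptions, not real CSI bindings; `InMemoryController`, `AlreadyExistsError`, and the field names are hypothetical.

```python
import itertools


class AlreadyExistsError(Exception):
    """Raised when a volume name is reused with different parameters."""


class InMemoryController:
    """Hypothetical stand-in for a plugin's controller, kept in memory."""

    def __init__(self):
        self._volumes = {}               # volume name -> volume record
        self._ids = itertools.count(1)   # never reuses an ID after a delete

    def create_volume(self, name: str, capacity_bytes: int) -> dict:
        existing = self._volumes.get(name)
        if existing is not None:
            if existing["capacity_bytes"] != capacity_bytes:
                raise AlreadyExistsError(name)
            # Same name, same size: a retry. Return the existing volume.
            return existing
        volume = {"volume_id": f"vol-{next(self._ids)}",
                  "capacity_bytes": capacity_bytes}
        self._volumes[name] = volume
        return volume

    def delete_volume(self, volume_id: str) -> None:
        # Deleting a volume that is already gone is treated as success.
        for name, volume in list(self._volumes.items()):
            if volume["volume_id"] == volume_id:
                del self._volumes[name]
```

With this shape, a CO that times out and retries a request does not leak duplicate volumes, and a repeated delete does not fail.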
CSI defines three core RPC services –
- Identity Services
- Controller Services
- Node Services
Identity Services – The Identity Service gives basic information about a plugin: What is your name? What are your capabilities? The CO also probes the plugin to see whether it is healthy. Every plugin needs to implement these three calls:
rpc GetPluginInfo ( … ) …
rpc GetPluginCapabilities ( … ) …
rpc Probe ( … ) …
- GetPluginInfo() – This method returns the name and version of the plugin.
- GetPluginCapabilities() – This method returns the capabilities of the plugin. If the plugin reports that it provides the Controller service, the CO goes on to call the Controller methods.
- Probe() – This is called by CO to check if the plugin is running or not.
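The three Identity calls above can be sketched in plain Python as follows. This is not a gRPC server and does not use generated CSI bindings; the class `IdentityService`, the helper `co_should_call_controller`, the response field names, and the plugin name `csi.example.com` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class IdentityService:
    """Plain-Python stand-in for the three Identity calls (hypothetical)."""

    name: str
    version: str
    capabilities: list = field(default_factory=lambda: ["CONTROLLER_SERVICE"])
    healthy: bool = True

    def get_plugin_info(self) -> dict:
        # Name and version of the plugin.
        return {"name": self.name, "vendor_version": self.version}

    def get_plugin_capabilities(self) -> dict:
        # What the plugin says it can do.
        return {"capabilities": list(self.capabilities)}

    def probe(self) -> dict:
        # Health check the CO can call at any time.
        return {"ready": self.healthy}


def co_should_call_controller(identity: IdentityService) -> bool:
    """Hypothetical CO-side check: use Controller calls only if the plugin
    is healthy and advertises the controller capability."""
    if not identity.probe()["ready"]:
        return False
    caps = identity.get_plugin_capabilities()["capabilities"]
    return "CONTROLLER_SERVICE" in caps
```

A CO-side caller would typically run a check like this once at startup and then repeat the probe periodically.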
Controller Services – Controller Services are responsible for managing volumes: creating and deleting them, making them available or unavailable on nodes, creating snapshots, and so on.
- CreateVolume() – This method takes a CreateVolumeRequest as its argument, provisions a volume, and returns a CreateVolumeResponse.
- DeleteVolume() – This method deletes a volume that was previously created.
- ControllerPublishVolume() – This method is used to make a volume available on some required nodes.
- ControllerUnpublishVolume() – This method is used to make a volume unavailable on a specific node.
- ValidateVolumeCapabilities() – This method is used to check whether an existing volume supports the requested capabilities.
- ListVolumes() – This method is used to return a list of all the available volumes.
- GetCapacity() – This method returns the available capacity of the storage pool. It is useful when storage capacity is limited.
- ControllerGetCapabilities() – This method returns the capabilities of the Controller plugin.
Node Services – Node Services are responsible for controlling volume actions on the node.
- NodeStageVolume() – This method is called by the CO to temporarily mount the volume to a staging path. Kubernetes needs two steps: the volume is first mounted to a global directory and then mounted into the pod directory, because in Kubernetes a single volume can be used by multiple pods.
- NodeUnstageVolume() – This method is used to unmount the volume from the staging path.
- NodePublishVolume() – This method is used to mount the volume from staging to the target path.
- NodeUnpublishVolume() – This method is used to unmount the volume from the target path.
- NodeGetId() – This method returns a unique ID of the node on which this plugin is running.
- NodeGetCapabilities() – This method returns the capabilities of the node.
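The two-step staging and publishing described above can be modelled with a small Python sketch. No real mounts happen here: the class `NodeService` only tracks paths in dictionaries, and its name, its methods, and the example paths are hypothetical.

```python
class NodeService:
    """Toy model of node-side staging and publishing (hypothetical)."""

    def __init__(self, node_id: str):
        self.node_id = node_id
        self.staged = {}      # volume_id -> staging path (one per volume)
        self.published = {}   # volume_id -> set of target paths (one per pod)

    def node_get_id(self) -> str:
        return self.node_id

    def node_stage_volume(self, volume_id: str, staging_path: str) -> None:
        # Step 1: mount the volume once per node at a staging path.
        self.staged[volume_id] = staging_path

    def node_publish_volume(self, volume_id: str, target_path: str) -> None:
        # Step 2: expose the staged volume at a per-workload target path.
        if volume_id not in self.staged:
            raise RuntimeError("volume must be staged before it is published")
        self.published.setdefault(volume_id, set()).add(target_path)

    def node_unpublish_volume(self, volume_id: str, target_path: str) -> None:
        self.published.get(volume_id, set()).discard(target_path)

    def node_unstage_volume(self, volume_id: str) -> None:
        # Unstaging is only safe once no target path uses the volume.
        if self.published.get(volume_id):
            raise RuntimeError("volume is still published to a target path")
        self.staged.pop(volume_id, None)
```

Staging once and publishing many times is what lets several pods on the same node share a single volume.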
Volume Life Cycle in a Container Orchestrator
Initially, the volume does not exist. The CO (Container Orchestrator) dynamically provisions the volume by calling CreateVolume, after which the volume is in the created state. Before the volume can be used by a container on a node, the CO calls ControllerPublishVolume to make the volume available on the required node. Once ControllerPublishVolume succeeds, the CO calls NodeStageVolume, which is performed once for each volume on a node, and then calls NodePublishVolume for each workload that uses the volume.
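The call sequence above can be modelled as a small state machine. The Python sketch below is only an illustration: the state labels and the `apply_calls` helper are assumptions made for this example, and it ignores errors, retries, and optional plugin capabilities.

```python
# Allowed transitions: (current state, call) -> next state.
TRANSITIONS = {
    ("NOT_CREATED", "CreateVolume"): "CREATED",
    ("CREATED", "ControllerPublishVolume"): "NODE_READY",
    ("NODE_READY", "NodeStageVolume"): "VOL_READY",
    ("VOL_READY", "NodePublishVolume"): "PUBLISHED",
    # Teardown walks the same path in reverse.
    ("PUBLISHED", "NodeUnpublishVolume"): "VOL_READY",
    ("VOL_READY", "NodeUnstageVolume"): "NODE_READY",
    ("NODE_READY", "ControllerUnpublishVolume"): "CREATED",
    ("CREATED", "DeleteVolume"): "NOT_CREATED",
}


def apply_calls(calls, state="NOT_CREATED"):
    """Apply a sequence of calls and return the final state.

    Raises ValueError if a call is not allowed in the current state.
    """
    for call in calls:
        key = (state, call)
        if key not in TRANSITIONS:
            raise ValueError(f"{call} is not valid in state {state}")
        state = TRANSITIONS[key]
    return state
```

For example, calling `apply_calls` with CreateVolume, ControllerPublishVolume, NodeStageVolume, and NodePublishVolume in that order ends in the published state, while calling NodePublishVolume first is rejected.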