Motivation for volumes
Containers in a pod share the same network stack, but each has its own file system. It can be useful to share data between containers, for example by having an init container prepare some files that the main container depends on. A container’s file system is also limited to the lifetime of the container, which can have undesirable side effects. For example, if the data tier container we are using in our examples crashes or fails a liveness probe, it will be restarted and all of the data it had been storing will be lost forever.
In this post, I will cover the different ways Kubernetes handles non-ephemeral data, allowing us to separate data from containers. We will look at Kubernetes Volumes and Kubernetes PersistentVolumes. By the end of this post, our goal is to deploy the data-tier for our sample application using PersistentVolumes so that the data can outlive the data-tier pod. Again, this post builds on the code from the previous posts, specifically the deployments post. Let’s first discuss the options for storing persistent data and then apply them to our data tier.
Kubernetes includes two different data storage types. Both are used by mounting a directory in a container and can be shared by containers in the same pod. Pods can also use more than one Volume and PersistentVolume. Their differences are mainly in how their lifetime is managed. One type exists for the lifetime of a particular pod, and the other is independent from the lifetime of pods.
Volumes are tied to a pod and its lifecycle. Volumes are used to share data between containers in a pod and to tolerate container restarts. Although you can configure volumes to use durable storage types that survive pod deletion, you should generally treat volumes as non-durable storage that is deleted when the pod is deleted.
The simplest type of volume is called emptyDir. It creates an initially empty directory on the node running the pod to back the storage used by the volume. Any data written to the directory remains if a container in the pod is restarted. Once the pod is deleted, the data in the volume is permanently deleted.
It’s worth noting that since the data is stored on a specific node, the data will be lost if a pod is rescheduled to a different node. If the data is too valuable to lose when a pod is deleted or rescheduled, you should consider using PersistentVolumes.
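As a minimal sketch of the emptyDir behaviour described above, the following script writes a small Pod manifest in which an init container prepares a file that the main container then reads from the same volume. The pod name, container names, file name, and the busybox image are all assumptions made for illustration; they are not part of the sample application.

```shell
#!/bin/sh
# Write a hypothetical Pod manifest that shares an emptyDir volume
# between an init container and the main container.
manifest="emptydir-demo.yaml"   # hypothetical file name

cat > "$manifest" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: emptydir-demo          # hypothetical pod name
spec:
  initContainers:
  - name: prepare              # runs to completion before "main" starts
    image: busybox
    command: ["sh", "-c", "echo prepared > /work/ready.txt"]
    volumeMounts:
    - name: scratch
      mountPath: /work         # mount path inside the init container
  containers:
  - name: main
    image: busybox
    command: ["sh", "-c", "cat /data/ready.txt && sleep 3600"]
    volumeMounts:
    - name: scratch
      mountPath: /data         # same volume, different mount path
  volumes:
  - name: scratch
    emptyDir: {}               # removed together with the pod
EOF

echo "wrote $manifest"
```

You could then create the pod with kubectl create -f emptydir-demo.yaml. Deleting the pod also deletes whatever was written to the scratch volume.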
PersistentVolumes are independent of the lifetime of pods and are managed separately by Kubernetes. They work a little differently than volumes.
Pods may claim a persistent volume, and use it throughout their lifetime.
PersistentVolumes will continue to exist outside of their pods. Persistent volumes can even be mounted by multiple pods on different nodes if the underlying storage supports multiple readers or writers.
Persistent volumes can be provisioned statically in advance by a cluster admin or dynamically for more flexible self-serve use cases.
PersistentVolume Claims or PVC
Pods must make a request for storage before they can use a persistent volume. The request is made using a persistent volume claim, or PVC. A PVC declares how much storage the pod needs, the type of persistent volume, and the access mode. The access mode describes how the persistent volume is mounted: whether it is read-only or read-write, and whether it can be mounted by one node or many. The three long-standing access modes are ReadWriteOnce, ReadOnlyMany, and ReadWriteMany (newer Kubernetes releases also add ReadWriteOncePod). If there isn’t a persistent volume available to satisfy the claim and dynamic provisioning isn’t enabled, the claim will stay in a Pending state until such a persistent volume is available.
The persistent volume claim is connected to a pod by using a regular volume with the type set to persistentVolumeClaim.
Storage Volume types
Both volumes and PersistentVolumes may be backed by a wide variety of volume types. As we learned before, it is usually preferable to use persistent volumes for more durable types and volumes for more ephemeral storage needs. Durable volume types include the persistent disks of many cloud vendors, such as Google Compute Engine persistent disks, Azure Disks, and Amazon Elastic Block Store (EBS). There’s also support for more generic volume types such as Network File System (NFS) and iSCSI.
That is quite a lot to take in, but everything should solidify with an example. Our objective is to use a PersistentVolume for the sample application’s data-tier since we want the data to outlive its pod. In our example, the cluster has an Amazon EBS volume statically provisioned and ready for us to use.
I will cover dynamic provisioning in action in another post: “Deploy a Stateful Application in a Kubernetes Cluster”.
What is the issue we are trying to address?
Before we get into volumes, I want to cement the issue we are trying to solve. We can illustrate the issue of pod containers losing their data when they restart by forcing a restart of the data tier container. First of all, let’s look at the counter that will be running after we create the 3-tier application from the deployments post.
ubuntu@ip-10-0-128-5:~# kubectl create -f 5.1-namespace.yaml
namespace/deployments created
ubuntu@ip-10-0-128-5:~/src# kubectl create -f 5.2-data_tier.yaml -f 5.3-app_tier.yaml -f 5.4-support_tier.yaml -n deployments
service/data-tier created
deployment.apps/data-tier created
service/app-tier created
deployment.apps/app-tier created
deployment.apps/support-tier created
ubuntu@ip-10-0-128-5:~/src# kubectl get -n deployments deployments.
NAME           READY   UP-TO-DATE   AVAILABLE   AGE
app-tier       1/1     1            1           14s
data-tier      1/1     1            1           14s
support-tier   1/1     1            1           14s
ubuntu@ip-10-0-128-5:~/src#
Check the counter logs using
kubectl -n deployments logs support-tier-58d5d545b6-clltf poller --tail 1
ubuntu@ip-10-0-128-5:~/src# kubectl -n deployments logs support-tier-58d5d545b6-clltf poller --tail 1
Current counter: 1350
ubuntu@ip-10-0-128-5:~/src# kubectl -n deployments logs support-tier-58d5d545b6-clltf poller --tail 1
Current counter: 1370
ubuntu@ip-10-0-128-5:~/src# kubectl -n deployments logs support-tier-58d5d545b6-clltf poller --tail 1
Current counter: 1386
ubuntu@ip-10-0-128-5:~/src#
Sure enough the counter is getting incremented.
Kill the container to emulate pod restart
Now if I force a restart, we can observe the impact on the counter. One way to do that is to kill the redis process, which will cause the data-tier container to exit; Kubernetes will then restart it automatically. We can use the exec command, which allows us to run a command inside a container, the same way docker exec does. Let’s open a bash shell inside the container:
kubectl exec -n deployments data-tier-599bc4fcf8-p5d86 -it /bin/bash
ubuntu@ip-10-0-128-5:~/src# kubectl exec -n deployments data-tier-599bc4fcf8-p5d86 -it /bin/bash
root@data-tier-599bc4fcf8-p5d86:/data#
The change of command prompt tells us we are in the container now. We can now use the kill command to stop the main process of the container. But what is the ID of the process? The ID of the main process, which is redis in this case, is 1 since it is the first process that runs in the container.
root@data-tier-599bc4fcf8-p5d86:/data# kill 1
root@data-tier-599bc4fcf8-p5d86:/data# command terminated with exit code 137
ubuntu@ip-10-0-128-5:~/src# kubectl get -n deployments deployments.
NAME           READY   UP-TO-DATE   AVAILABLE   AGE
app-tier       1/1     1            1           10m
data-tier      1/1     1            1           10m
support-tier   1/1     1            1           10m
ubuntu@ip-10-0-128-5:~/src#
ubuntu@ip-10-0-128-5:~/src# kubectl -n deployments get pods
NAME                            READY   STATUS    RESTARTS   AGE
app-tier-748cdbdcc5-fjpcv       1/1     Running   1          16m
data-tier-599bc4fcf8-p5d86      1/1     Running   1          16m
support-tier-58d5d545b6-clltf   2/2     Running   0          16m
ubuntu@ip-10-0-128-5:~/src#
ubuntu@ip-10-0-128-5:~/src# kubectl -n deployments logs support-tier-58d5d545b6-clltf poller --tail 1
Current counter: 114
ubuntu@ip-10-0-128-5:~/src# kubectl -n deployments logs support-tier-58d5d545b6-clltf poller --tail 1
Current counter: 133
ubuntu@ip-10-0-128-5:~/src# kubectl -n deployments logs support-tier-58d5d545b6-clltf poller --tail 1
Current counter: 147
ubuntu@ip-10-0-128-5:~/src# kubectl -n deployments logs support-tier-58d5d545b6-clltf poller
...
Current counter: 3561
Current counter: 3571
Current counter: 3587
Current counter: 3604
Current counter:
Current counter:
Current counter: 6
Current counter: 11
Current counter: 23
Current counter: 35
Current counter: 53
Current counter: 61
Current counter: 75
Current counter: 87
Current counter: 92
Current counter: 98
The RESTARTS column tells us that the data-tier container has indeed been restarted. The poller logs also reveal that the counter was reset when the container restarted. This is what we want to avoid.
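A side note on the “exit code 137” message in the transcript: a likely reading is that our bash session was killed with SIGKILL (signal 9) when the container was torn down, because shells report 128 plus the signal number for a process ended by a signal. The sketch below reproduces that arithmetic locally without a cluster; it only assumes a POSIX sh.

```shell
#!/bin/sh
# Start a child shell that sends SIGKILL (signal 9) to itself, then
# capture the exit status that the parent shell reports for it.
status=0
sh -c 'kill -KILL $$' || status=$?

echo "exit status: $status"               # 128 + signal number
echo "signal number: $((status - 128))"
```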
Create a new namespace called volumes
Let’s start by creating a new volumes namespace
ubuntu@ip-10-0-128-5:~/src# kubectl create -f 9.1-namespace.yaml
namespace/volumes created
ubuntu@ip-10-0-128-5:~/src#
Now, on to the data tier. There are three additions to the manifest:
- a persistent volume,
- a persistent volume claim, and
- a volume to connect the claim to the pod.
apiVersion: v1
kind: Service
metadata:
  name: data-tier
  labels:
    app: microservices
spec:
  ports:
  - port: 6379
    protocol: TCP # default
    name: redis # optional when only 1 port
  selector:
    tier: data
  type: ClusterIP # default
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: data-tier-volume
spec:
  capacity:
    storage: 1Gi # 1 gibibyte
  accessModes:
    - ReadWriteOnce
  awsElasticBlockStore:
    volumeID: INSERT_VOLUME_ID # replace with actual ID
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-tier-volume-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 128Mi # 128 mebibytes
---
apiVersion: apps/v1 # apps API group
kind: Deployment
metadata:
  name: data-tier
  labels:
    app: microservices
    tier: data
spec:
  replicas: 1
  selector:
    matchLabels:
      tier: data
  template:
    metadata:
      labels:
        app: microservices
        tier: data
    spec: # Pod spec
      containers:
      - name: redis
        image: redis:latest
        imagePullPolicy: IfNotPresent
        ports:
          - containerPort: 6379
            name: redis
        livenessProbe:
          tcpSocket:
            port: redis # named port
          initialDelaySeconds: 15
        readinessProbe:
          exec:
            command:
            - redis-cli
            - ping
          initialDelaySeconds: 5
        volumeMounts:
          - mountPath: /data
            name: data-tier-volume
      volumes:
      - name: data-tier-volume
        persistentVolumeClaim:
          claimName: data-tier-volume-claim
First is the PersistentVolume. It is the raw storage where data is ultimately written by the pod’s container. It has a declared storage capacity and other attributes. Here, we’ve allocated 1 gibibyte. The access mode of ReadWriteOnce means this volume may be mounted for reading and writing by a single node at a time. Note that it is a limit on node attachment and not pod attachment. PersistentVolumes may list multiple access modes, and the claim specifies the mode it requires. The persistent volume can only be claimed in a single access mode at any time. Lastly, we have an awsElasticBlockStore mapping, which is specific to the type of storage backing the PV. You would use a different mapping if you were not using an EBS volume for storage (and on recent Kubernetes versions this in-tree EBS type is deprecated in favor of the EBS CSI driver). The only required key for awsElasticBlockStore is the volumeID, which uniquely identifies the EBS volume. It will be different in your environment than mine, so I’ve added an INSERT_VOLUME_ID placeholder that we will replace before we create the PV.
Persistent Volume Claim (PVC)
Next we have the persistent volume claim. The PVC spec outlines what it is looking for in a PV. For a PV to be bound to a PVC, it must satisfy all of the constraints in the claim. We are looking for a PV that provides the ReadWriteOnce access mode and has at least 128 mebibytes of storage. The claim request is less than or equal to the persistent volume’s capacity, and the access mode is among the access modes the PV offers. This means the PVC request is satisfied by our PV and can be bound to it.
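To make the two checks concrete, here is a small sketch of the comparison in plain shell. The function names to_bytes and pv_satisfies_claim are hypothetical helpers written for this post, not Kubernetes commands, and the real binding logic also takes other attributes into account (such as storage class and volume mode), so treat this as an illustration of the arithmetic only.

```shell
#!/bin/sh
# Convert a quantity with a Mi or Gi suffix to bytes.
# Only these two binary suffixes are handled in this sketch.
to_bytes() {
  case "$1" in
    *Gi) echo $(( ${1%Gi} * 1024 * 1024 * 1024 )) ;;
    *Mi) echo $(( ${1%Mi} * 1024 * 1024 )) ;;
    *)   echo "unsupported quantity: $1" >&2; return 1 ;;
  esac
}

# pv_satisfies_claim PV_CAPACITY "PV_MODES" CLAIM_REQUEST CLAIM_MODE
# Succeeds when the capacity covers the request and the PV lists the mode.
pv_satisfies_claim() {
  pv_bytes=$(to_bytes "$1") || return 2
  claim_bytes=$(to_bytes "$3") || return 2
  [ "$claim_bytes" -le "$pv_bytes" ] || return 1
  case " $2 " in
    *" $4 "*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Values from our manifest: a 1Gi ReadWriteOnce PV and a 128Mi claim.
if pv_satisfies_claim 1Gi "ReadWriteOnce" 128Mi ReadWriteOnce; then
  echo "claim can bind"
else
  echo "claim stays pending"
fi
```

With the values from our manifest, 128Mi (134217728 bytes) fits within 1Gi (1073741824 bytes) and the access mode matches, so the check succeeds.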
Volumes in Deployment’s pod template
Lastly, the deployment’s template now includes a volume which links the PVC to the deployment’s pod. This is accomplished by using the persistentVolumeClaim mapping and setting the claim name to the name of the PVC, which is data-tier-volume-claim. You will always use persistentVolumeClaim when working with PVs. If you wanted to use an ephemeral storage volume, you would replace it with an emptyDir mapping or another type that doesn’t connect to a PV.
Volumes can be used in the pod’s containers and init containers, but they must be mounted to be available in the containers. The volumeMounts list includes all the volume mounts for a given container. The mountPaths for different containers can be different even if the volume is the same. In our case we only have one container, and we are mounting the volume at /data, which is where redis is configured to store its data. This will cause all of the data to be written to the PV.
VolumeID placeholder for EBS
Now we are left with replacing the volume ID placeholder with the actual ID of the Amazon EBS volume the lab environment created for us. You could get it from the EC2 console in your browser, but we’ll use the AWS CLI for this example. The volume ID can be obtained with the aws ec2 describe-volumes command.
aws ec2 describe-volumes --region=us-west-2 --filters="Name=tag:Type,Values=PV" --query="Volumes[].VolumeId" --output=text
ubuntu@ip-10-0-128-5:~/src# aws ec2 describe-volumes --region=us-west-2 --filters="Name=tag:Type,Values=PV" --query="Volumes[].VolumeId" --output=text
vol-09bc5324eb947dcdb
ubuntu@ip-10-0-128-5:~/src# vol_id=$(aws ec2 describe-volumes --region=us-west-2 --filters="Name=tag:Type,Values=PV" --query="Volumes[].VolumeId" --output=text)
ubuntu@ip-10-0-128-5:~/src# sed -i "s/INSERT_VOLUME_ID/$vol_id/" 9.2-pv_data_tier.yaml
ubuntu@ip-10-0-128-5:~/src#
The filter selects only the volume that is tagged with Type = PV, and the query outputs only the volume ID property of the volume. I’ll store the ID in a variable named vol_id. Then we can use the stream editor, sed, to substitute the occurrence of INSERT_VOLUME_ID with the volume ID stored in vol_id.
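One thing worth guarding against is an empty lookup result: if the AWS CLI query returns nothing, sed would happily replace the placeholder with an empty string. The sketch below shows a slightly more defensive version of the substitution on a throwaway file. The volume ID in it is a made-up value for illustration, and the temporary file stands in for the real manifest.

```shell
#!/bin/sh
# Work on a throwaway file so the real manifest is untouched.
demo_dir=$(mktemp -d)
manifest="$demo_dir/pv.yaml"
printf '    volumeID: INSERT_VOLUME_ID # replace with actual ID\n' > "$manifest"

# Hypothetical ID for illustration; in the walkthrough it comes from the AWS CLI.
vol_id="vol-0123456789abcdef0"

# Refuse to substitute an empty value, which would silently blank the field.
if [ -z "$vol_id" ]; then
  echo "no volume ID found" >&2
  exit 1
fi

# Write the result to a new file instead of editing in place.
sed "s/INSERT_VOLUME_ID/$vol_id/" "$manifest" > "$manifest.new"

# Fail loudly if the placeholder is still present.
if grep -q 'INSERT_VOLUME_ID' "$manifest.new"; then
  echo "placeholder was not replaced" >&2
  exit 1
fi
cat "$manifest.new"
```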
Create the data-tier
And with that we are ready to create the data-tier using a persistent volume. We’ll also create the app and support tiers which don’t have anything new compared to previous versions.
ubuntu@ip-10-0-128-5:~/src# kubectl create -n volumes -f 9.2-pv_data_tier.yaml -f 9.3-app_tier.yaml -f 9.4-support_tier.yaml
service/data-tier created
persistentvolume/data-tier-volume created
persistentvolumeclaim/data-tier-volume-claim created
deployment.apps/data-tier created
service/app-tier created
deployment.apps/app-tier created
deployment.apps/support-tier created
ubuntu@ip-10-0-128-5:~/src#
Describe the PVC
Let’s describe the persistent volume claim, which has the short name of pvc in kubectl, to confirm that the claim’s request is satisfied:
ubuntu@ip-10-0-128-5:~/src# kubectl describe -n volumes pvc
Name:          data-tier-volume-claim
Namespace:     volumes
StorageClass:  gp2
Status:        Bound
Volume:        pvc-7eda5dd0-38e6-46a5-a8e5-f4e2a098f4d3
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/aws-ebs
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      1Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Mounted By:    data-tier-8689f7ffc-nk8h4
Events:
  Type    Reason                 Age   From                         Message
  ----    ------                 ----  ----                         -------
  Normal  ProvisioningSucceeded  40s   persistentvolume-controller  Successfully provisioned volume pvc-7eda5dd0-38e6-46a5-a8e5-f4e2a098f4d3 using kubernetes.io/aws-ebs
ubuntu@ip-10-0-128-5:~/src#
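The Status of Bound confirms that the PVC is bound to a persistent volume. Note, however, that in this particular output the Volume field names a volume that was dynamically provisioned with the gp2 storage class (pvc-7eda5dd0-38e6-46a5-a8e5-f4e2a098f4d3) rather than our data-tier-volume. This is what typically happens when the cluster has a default StorageClass and the claim does not set storageClassName. The data still lives on a persistent volume that is independent of the pod, so the rest of the walkthrough behaves the same way.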
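The describe output above shows StorageClass gp2 and a ProvisioningSucceeded event, which suggests the claim was served by dynamic provisioning. If you want a claim to bind to a statically created PV instead, one common approach is to set storageClassName to an empty string, and optionally name the PV with volumeName. The script below writes such a variant of our claim; the file name is hypothetical, and this is a sketch of one option rather than the manifest used in this post.

```shell
#!/bin/sh
# Write a variant of the claim that opts out of dynamic provisioning.
claim="static-claim.yaml"   # hypothetical file name

cat > "$claim" <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-tier-volume-claim
spec:
  storageClassName: ""          # empty string: only match PVs that have no class
  volumeName: data-tier-volume  # optional: pre-bind to this specific PV
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 128Mi
EOF

echo "wrote $claim"
```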
Describe the data-tier pod to check the event logs
Now let’s describe the data-tier pod:
ubuntu@ip-10-0-128-5:~/src# kubectl describe -n volumes pod data-tier-8689f7ffc-nk8h4
Name:           data-tier-8689f7ffc-nk8h4
Namespace:      volumes
Priority:       0
Node:           ip-10-0-27-39.us-west-2.compute.internal/10.0.27.39
Start Time:     Tue, 05 May 2020 23:37:44 +0000
Labels:         app=microservices
                pod-template-hash=8689f7ffc
                tier=data
Annotations:    <none>
Status:         Running
IP:             192.168.95.67
Controlled By:  ReplicaSet/data-tier-8689f7ffc
Containers:
  redis:
    Container ID:   docker://4f284571569b06d4ef337030f401856b30b74b83ad3b06449ec38514f3e6223a
    Image:          redis:latest
    Image ID:       docker-pullable://redis@sha256:f7ee67d8d9050357a6ea362e2a7e8b65a6823d9b612bc430d057416788ef6df9
    Port:           6379/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Tue, 05 May 2020 23:37:59 +0000
    Ready:          True
    Restart Count:  0
    Liveness:       tcp-socket :redis delay=15s timeout=1s period=10s #success=1 #failure=3
    Readiness:      exec [redis-cli ping] delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /data from data-tier-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-jpwt8 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  data-tier-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-tier-volume-claim
    ReadOnly:   false
  default-token-jpwt8:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-jpwt8
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age                    From                                                Message
  ----     ------                  ----                   ----                                                -------
  Warning  FailedScheduling        2m18s (x3 over 2m20s)  default-scheduler                                   pod has unbound immediate PersistentVolumeClaims (repeated 2 times)
  Normal   Scheduled               2m15s                  default-scheduler                                   Successfully assigned volumes/data-tier-8689f7ffc-nk8h4 to ip-10-0-27-39.us-west-2.compute.internal
  Normal   SuccessfulAttachVolume  2m13s                  attachdetach-controller                             AttachVolume.Attach succeeded for volume "pvc-7eda5dd0-38e6-46a5-a8e5-f4e2a098f4d3"
  Normal   Pulling                 2m5s                   kubelet, ip-10-0-27-39.us-west-2.compute.internal  Pulling image "redis:latest"
  Normal   Pulled                  2m                     kubelet, ip-10-0-27-39.us-west-2.compute.internal  Successfully pulled image "redis:latest"
  Normal   Created                 2m                     kubelet, ip-10-0-27-39.us-west-2.compute.internal  Created container redis
  Normal   Started                 2m                     kubelet, ip-10-0-27-39.us-west-2.compute.internal  Started container redis
ubuntu@ip-10-0-128-5:~/src#
We can see the pod initially failed to schedule because the claim needs to wait a while before it is bound to a persistent volume. Once it is bound, the pod is scheduled, and the events show the volume being attached successfully before the redis container starts.
Delete the data-tier pod
Not only can our new design tolerate a data tier container restart, but the data will persist even if we delete the entire data tier deployment, which deletes the data tier pod and prevents any new pods from being created. (Note that this time we are not just killing the redis process, which would only cause the container to be restarted within the same pod.)
If everything goes to plan we should be able to recover the redis data if we then replace the deployment. That is because the deployment template is configured to use the same PVC and the PVC is still bound to the PV storing the original redis data. Let’s verify all of this.
Before we delete the data-tier deployment, let’s get the last log line from the poller to see where our counter is at:
ubuntu@ip-10-0-128-5:~/src# kubectl logs -n volumes support-tier-687789db8-45d5b poller --tail 1
Current counter: 750
If we delete the deployment and then replace it, we should see a number higher than this if the data is persisted. Let’s do that. Delete the data tier deployment, and confirm that there are no data tier pods running.
ubuntu@ip-10-0-128-5:~/src# kubectl delete -n volumes deployments. data-tier
deployment.extensions "data-tier" deleted
ubuntu@ip-10-0-128-5:~/src# kubectl get -n volumes pods
NAME                           READY   STATUS    RESTARTS   AGE
app-tier-6bf4d544c-v7m4l       1/1     Running   0          8m1s
support-tier-687789db8-45d5b   2/2     Running   0          8m1s
ubuntu@ip-10-0-128-5:~/src#
Re-create the data-tier pod
Now recreate the data tier deployment
ubuntu@ip-10-0-128-5:~/src# kubectl create -f 9.2-pv_data_tier.yaml -n volumes
deployment.apps/data-tier created
Error from server (AlreadyExists): error when creating "9.2-pv_data_tier.yaml": services "data-tier" already exists
Error from server (AlreadyExists): error when creating "9.2-pv_data_tier.yaml": persistentvolumes "data-tier-volume" already exists
Error from server (AlreadyExists): error when creating "9.2-pv_data_tier.yaml": persistentvolumeclaims "data-tier-volume-claim" already exists
ubuntu@ip-10-0-128-5:~/src#
The create command tells us that everything except the deployment already exists, so only the deployment was created. It now takes a couple of minutes for all of the readiness checks to start passing again and for some old connections to time out. This is mainly a side effect of the example application not being particularly good at handling this situation, not because of delays intrinsic to Kubernetes. The fact that Kubernetes can self-heal the application is a testament to Kubernetes’ abilities.
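Rather than waiting by hand, you could poll until a check succeeds. The function below is a generic sketch of such a loop; retry_until is a hypothetical helper name, and the timeout and interval values are arbitrary choices for illustration.

```shell
#!/bin/sh
# retry_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Runs COMMAND repeatedly until it succeeds or the timeout elapses.
retry_until() {
  timeout=$1
  interval=$2
  shift 2
  waited=0
  until "$@"; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1            # gave up: the command never succeeded in time
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  return 0
}

# Trivial demonstration with a command that succeeds immediately.
retry_until 3 1 true && echo "ready"
```

In the walkthrough you might wrap a check such as sh -c 'kubectl logs -n volumes support-tier-687789db8-45d5b poller --tail 1 | grep -q "[0-9]"' so that the loop ends once the poller reports a numeric counter again.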
After a minute or two we can get the poller’s last log
ubuntu@ip-10-0-128-5:~/src# kubectl logs -n volumes support-tier-687789db8-45d5b poller --tail 1
Current counter: 1360
ubuntu@ip-10-0-128-5:~/src# kubectl logs -n volumes support-tier-687789db8-45d5b poller
Current counter: 1159
Current counter: 1176
Current counter: 1183
Current counter:
Current counter:
Current counter: 1208
Current counter: 1224
Current counter: 1239
Current counter: 1251
Current counter: 1259
Current counter: 1268
Current counter: 1280
Current counter: 1291
Current counter: 1296
Current counter: 1304
Current counter: 1315
Current counter: 1328
Current counter: 1341
Current counter: 1355
Current counter: 1360
Current counter: 1369
And voila, the counter has kept on ticking upward from where we left off before deleting the deployment. Our persistent volume has lived up to its name.
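If you would rather not eyeball the log, a short awk filter can check whether the counter ever went backwards. The count_resets function below is a hypothetical helper written for this post; it assumes the poller’s “Current counter: N” line format shown above and skips lines that have no number.

```shell
#!/bin/sh
# count_resets reads poller log lines on stdin and prints how many times
# the counter dropped below the previous value. Lines without a number
# (such as "Current counter:") are skipped.
count_resets() {
  awk '
    $1 == "Current" && $2 == "counter:" && $3 ~ /^[0-9]+$/ {
      if (seen && $3 + 0 < prev) resets++
      prev = $3 + 0
      seen = 1
    }
    END { print resets + 0 }
  '
}

# Sample lines taken from the restart experiment earlier in this post.
printf '%s\n' \
  "Current counter: 3587" \
  "Current counter: 3604" \
  "Current counter:" \
  "Current counter: 6" \
  "Current counter: 11" | count_resets
```

Against a live cluster you could pipe the real log into it, for example kubectl logs -n volumes support-tier-687789db8-45d5b poller | count_resets, and expect 0 when the data survived.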
This concludes our lesson on volumes. We’ve covered volumes, PersistentVolumes, and PersistentVolumeClaims. In our example, we’ve shown how to use a persistent volume to avoid data loss by keeping the data independent from the lifecycle of the pod or the pod’s volume. We also saw how kubectl exec allows us to run commands in existing containers when we demonstrated how container restarts cause data loss when volumes aren’t used. We now have a solid foundation for volumes and persistent volumes.