Persistent Storage: Volumes, PVs, and PVCs

Kubernetes applications often need storage that outlives the ephemeral lifecycle of a Pod. A container's filesystem is ephemeral by default: files written inside it are lost when the container restarts or its Pod is deleted. To solve this, Kubernetes provides Volumes, PersistentVolumes (PVs), and PersistentVolumeClaims (PVCs). Together they let data survive Pod restarts and rescheduling while keeping workload definitions independent of the underlying storage backend.

Volumes

A Volume in Kubernetes is a directory accessible to the containers in a Pod. Unlike a container's own filesystem, a volume keeps its data across container restarts within the same Pod; whether the data outlives the Pod itself depends on the volume type.

Types of Volumes

emptyDir: Temporary storage created when a Pod is assigned to a node and deleted when the Pod is removed from that node.
hostPath: Maps a file or directory from the host node into the Pod.
configMap/secret: Provide configuration and sensitive data as volumes.
persistentVolumeClaim: Mounts a PersistentVolume into the Pod through a claim, decoupling the Pod from the underlying infrastructure.
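
As a rough illustration of the configMap volume type listed above, the following sketch mounts a ConfigMap's keys as files inside a Pod. All names here (app-settings, config-demo, the /etc/app mount path, and the property key) are hypothetical and chosen only for this example.

```yaml
# Hypothetical ConfigMap holding one configuration file.
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-settings
data:
  app.properties: |
    cache.enabled=true
---
apiVersion: v1
kind: Pod
metadata:
  name: config-demo
spec:
  containers:
  - name: demo-container
    image: nginx
    volumeMounts:
    - name: settings
      mountPath: /etc/app   # each ConfigMap key appears as a file in this directory
      readOnly: true
  volumes:
  - name: settings
    configMap:
      name: app-settings    # must match the ConfigMap name above
```

A secret volume follows the same pattern, with a secret entry (and its secretName field) in place of the configMap entry.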

YAML Example: Volume

apiVersion: v1
kind: Pod
metadata:
  name: volume-demo
spec:
  containers:
  - name: demo-container
    image: nginx
    volumeMounts:
    - name: demo-volume
      mountPath: /usr/share/nginx/html
  volumes:
  - name: demo-volume
    emptyDir: {}

Explanation: This Pod uses an emptyDir volume to store temporary data. The data survives container restarts but is deleted together with the Pod.

PersistentVolumes (PVs)

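A PersistentVolume is a cluster-scoped resource that represents a piece of storage, provisioned either manually by an administrator or dynamically through a StorageClass. Its lifecycle is independent of any Pod that uses it. PVs abstract storage from the underlying infrastructure, whether it’s local disk, NFS, or cloud storage (AWS EBS, Azure Disk, GCP Persistent Disk).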

YAML Example: PersistentVolume

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-demo
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /mnt/data

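Explanation: This PV provides 1Gi of storage using the host’s /mnt/data directory. Because hostPath data lives on a single node's disk, this setup is suitable only for single-node test clusters.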

PersistentVolumeClaims (PVCs)

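A PersistentVolumeClaim is a namespaced request for storage by a user. A PVC binds to one available PV that satisfies its requirements (capacity, access modes, and storage class). The binding is exclusive, so a PV serves at most one claim at a time.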

YAML Example: PersistentVolumeClaim

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-demo
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

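Explanation: This PVC requests 1Gi of storage with ReadWriteOnce access. Kubernetes binds it to a PV that satisfies the request. On a cluster with a default StorageClass, a claim that omits storageClassName is typically provisioned dynamically instead of binding to a pre-created PV such as pv-demo.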
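
If the goal is to bind to the manually created pv-demo even on a cluster that has a default StorageClass, one common approach is to set storageClassName to an empty string and, optionally, name the target PV. This is a minimal sketch; the claim name pvc-static-demo is hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-static-demo    # hypothetical name
spec:
  storageClassName: ""     # empty string: do not use the default StorageClass
  volumeName: pv-demo      # optional: pin the claim to the PV defined earlier
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```

The empty class matches pv-demo because that PV declares no storageClassName of its own.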

Using PVCs in Pods

apiVersion: v1
kind: Pod
metadata:
  name: pvc-pod
spec:
  containers:
  - name: demo-container
    image: nginx
    volumeMounts:
    - name: pvc-storage
      mountPath: /usr/share/nginx/html
  volumes:
  - name: pvc-storage
    persistentVolumeClaim:
      claimName: pvc-demo

Explanation: The Pod mounts the PVC pvc-demo at /usr/share/nginx/html.

Flowchart: Storage Workflow


   Admin defines PV ---> Cluster stores PV ---> Available for claims
          |
          v
   Developer creates PVC ---> Control plane binds PVC to PV
          |
          v
   Pod references PVC ---> Application uses persistent storage

Real-World Example

In a blogging platform:

Volumes: Ephemeral volumes such as emptyDir hold temporary cache data.
PVs: Provide durable storage for blog posts and media files.
PVCs: Developers request storage for each microservice (e.g., image service, content service).
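
To make the blogging scenario concrete, here is a sketch of how an image service might combine an emptyDir cache with a claim for durable media storage. The names (image-service, image-service-media), mount paths, sizes, and the placeholder nginx image are assumptions for illustration, not part of any real platform.

```yaml
# Durable storage requested by the (hypothetical) image service.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: image-service-media
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: image-service
spec:
  containers:
  - name: image-service
    image: nginx               # placeholder image
    volumeMounts:
    - name: cache
      mountPath: /var/cache/app   # temporary data, lost when the Pod is deleted
    - name: media
      mountPath: /data/media      # durable data backed by the claim
  volumes:
  - name: cache
    emptyDir:
      sizeLimit: 500Mi         # cap the cache size
  - name: media
    persistentVolumeClaim:
      claimName: image-service-media
```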

Common Mistakes

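Not defining a StorageClass, so every PV must be created by hand and claims stay Pending when no matching PV exists.
Using hostPath in production, which ties data to a specific node; a Pod rescheduled elsewhere no longer sees its data.
Ignoring access modes: a ReadWriteOnce volume can be mounted by only one node at a time, so replicas scheduled on other nodes fail to start.
Not monitoring PV usage, leading to full volumes and failed writes.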

Interview Notes

Q1: Difference between PV and PVC?

Answer: A PV is the actual storage resource in the cluster, while a PVC is a user's request for storage that a Pod references in order to mount the volume.

Q2: How does Kubernetes bind PVs and PVCs?

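Answer: The control plane matches each PVC with an available PV that satisfies its requested size, access modes, and storage class. If no such PV exists and a StorageClass applies, a new PV is provisioned dynamically.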

Q3: What are access modes in PVs?

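Answer: Access modes define how storage can be mounted: ReadWriteOnce (read-write by a single node), ReadOnlyMany (read-only by many nodes), ReadWriteMany (read-write by many nodes), and, in recent Kubernetes versions, ReadWriteOncePod (read-write by a single Pod).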

Q4: Example interview task: define a PersistentVolume backed by NFS that multiple Pods can share.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-task
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteMany
  nfs:
    path: /nfs/data
    server: 10.0.0.10

Explanation: This PV uses NFS storage with ReadWriteMany access mode, allowing multiple Pods to share data.
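
A natural follow-up is how workloads would consume this PV. The sketch below assumes a claim named pvc-task and a Deployment named shared-web (both hypothetical) and pins the claim to pv-task so that two replicas can mount the same NFS export.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-task             # hypothetical name
spec:
  storageClassName: ""       # bind to a pre-created PV, skip dynamic provisioning
  volumeName: pv-task        # the NFS PV from the task above
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 2Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: shared-web           # hypothetical name
spec:
  replicas: 2                # both replicas mount the same volume
  selector:
    matchLabels:
      app: shared-web
  template:
    metadata:
      labels:
        app: shared-web
    spec:
      containers:
      - name: web
        image: nginx
        volumeMounts:
        - name: shared-data
          mountPath: /usr/share/nginx/html
      volumes:
      - name: shared-data
        persistentVolumeClaim:
          claimName: pvc-task
```

With ReadWriteOnce instead of ReadWriteMany, replicas landing on different nodes would be unable to mount the volume.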

Advanced Notes

StorageClasses: Automate dynamic provisioning of PVs.
Dynamic Provisioning: When a PVC references a StorageClass, the class's provisioner creates a matching PV on demand.
Cloud Integration: PVs can use AWS EBS, Azure Disks, or GCP Persistent Disks.
Best Practices: Use PVCs for portability, avoid hostPath in production, and monitor storage usage.
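
Dynamic provisioning can be sketched with a StorageClass and a claim that references it. This example assumes a cluster where the AWS EBS CSI driver is installed (provisioner ebs.csi.aws.com); the class name fast-ssd, the claim name pvc-dynamic-demo, and the 5Gi size are hypothetical.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd                  # hypothetical class name
provisioner: ebs.csi.aws.com      # assumes the AWS EBS CSI driver is installed
parameters:
  type: gp3                       # EBS volume type
reclaimPolicy: Delete             # delete the backing volume when the claim is deleted
volumeBindingMode: WaitForFirstConsumer   # provision once a Pod using the claim is scheduled
allowVolumeExpansion: true
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-dynamic-demo          # hypothetical name
spec:
  storageClassName: fast-ssd      # triggers dynamic provisioning through the class above
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```

No PV manifest is needed here: the provisioner creates one when the claim is consumed. On other platforms the provisioner and parameters would differ.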

Summary

Persistent storage in Kubernetes is managed through Volumes, PersistentVolumes, and PersistentVolumeClaims. Volumes provide storage within Pods, PVs abstract physical storage, and PVCs request storage resources. Together, they ensure applications can store and retrieve data reliably. Mastering these concepts is crucial for building stateful applications and preparing for Kubernetes interviews.