Persistent Storage in Kubernetes: Volumes, PersistentVolumes, and PersistentVolumeClaims with Real-Time Examples
In Kubernetes, Pods are temporary by design. A Pod can be restarted, deleted, rescheduled to another node, or replaced during a deployment. This design is excellent for scalability and self-healing, but it creates one serious problem: what happens to application data when a Pod disappears?
For stateless applications like simple APIs, losing the Pod is usually not a big issue because the application can restart and continue serving requests. But for stateful applications such as databases, file upload services, reporting systems, analytics tools, and banking transaction systems, losing data is unacceptable.
This is why Kubernetes provides Volumes, PersistentVolumes, and PersistentVolumeClaims. This article explains these core storage concepts with practical examples, common production mistakes, diagrams, and interview questions.
Why Is Persistent Storage Needed in Kubernetes?
Containers and Pods are disposable. Kubernetes can destroy and recreate them anytime to maintain the desired state of the application.
This is powerful for availability, but dangerous for data.
Problem Without Persistent Storage
[ MySQL Pod ]
|
v
Writes data inside container filesystem
|
v
Pod crashes or gets deleted
|
v
Data is lost
This can be a disaster in real applications.
Real-Time Banking Example
Imagine a banking platform where a database stores:
- Customer account balances
- UPI transaction records
- Loan EMI details
- Credit card payment history
- Audit logs
If the database Pod is deleted and data is stored only inside the container, the bank may lose critical financial records. This is not just a technical failure; it can become a legal, financial, and compliance disaster.
Persistent storage ensures that data survives even if Pods are recreated.
Simple Explanation of Kubernetes Storage
Kubernetes separates application execution from data storage.
[ Pod ]
|
v
[ Volume / PVC ]
|
v
[ Persistent Storage ]
The Pod can come and go, but the storage remains available.
What is a Kubernetes Volume?
A Volume in Kubernetes is a storage location attached to a Pod. Containers inside the Pod can read from and write to that volume.
A normal container filesystem is temporary, but a Kubernetes volume can preserve data depending on the volume type.
Volume Example
apiVersion: v1
kind: Pod
metadata:
  name: volume-demo
spec:
  containers:
    - name: nginx-container
      image: nginx
      volumeMounts:
        - name: demo-volume
          mountPath: /usr/share/nginx/html
  volumes:
    - name: demo-volume
      emptyDir: {}
How It Works
[ Nginx Container ]
|
v
/usr/share/nginx/html
|
v
[ Kubernetes Volume ]
Here, the container can write files into the mounted directory.
Important Volume Types
| Volume Type | Purpose | Best Use Case |
|---|---|---|
| emptyDir | Temporary Pod storage | Cache, temp files |
| hostPath | Mounts node filesystem path | Testing, node-level tools |
| configMap | Mounts configuration data | Application config files |
| secret | Mounts sensitive data | Passwords, certificates |
| persistentVolumeClaim | Durable storage through a PVC bound to a PersistentVolume | Databases, uploads, logs |
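As a minimal sketch of one of the configuration-oriented types in the table, the Pod below mounts a ConfigMap as read-only files. The Pod name, the ConfigMap name `app-config`, and the mount path `/etc/app` are hypothetical and would need to match your own resources.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: config-volume-demo      # hypothetical name
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: app-config-volume
          mountPath: /etc/app   # each ConfigMap key appears as a file here
          readOnly: true
  volumes:
    - name: app-config-volume
      configMap:
        name: app-config        # assumed to exist in the same namespace
```

A `secret` volume follows the same shape, with `secret.secretName` in place of `configMap.name`.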
emptyDir Volume
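emptyDir is created when a Pod is assigned to a node and deleted when the Pod is removed from that node. Its contents survive container restarts inside the Pod, but not deletion of the Pod itself.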
It is useful for temporary data.
Real-Time Example
In an image processing application, one container may download an image, another container may process it, and both containers share temporary files using emptyDir.
[ Download Container ]
|
v
[ emptyDir Volume ]
|
v
[ Image Processing Container ]
This is useful because the temporary file is not needed after processing completes.
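The sharing pattern described above can be sketched as a single Pod with two containers mounting the same emptyDir. The container names, the `busybox` image, and the shell loops are placeholders standing in for a real downloader and image processor.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: image-pipeline-demo     # hypothetical name
spec:
  containers:
    - name: downloader
      image: busybox:1.36
      # Stand-in for a real download: rewrites a placeholder file every 10 seconds.
      command: ["sh", "-c", "while true; do date > /work/image.tmp; sleep 10; done"]
      volumeMounts:
        - name: scratch
          mountPath: /work
    - name: processor
      image: busybox:1.36
      # Stand-in for real processing: prints whatever the downloader wrote.
      command: ["sh", "-c", "while true; do cat /work/image.tmp 2>/dev/null; sleep 10; done"]
      volumeMounts:
        - name: scratch
          mountPath: /work
  volumes:
    - name: scratch
      emptyDir: {}              # shared by both containers, removed with the Pod
```

Both containers see the same files under `/work` because they mount the same volume name.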
hostPath Volume
hostPath mounts a path from the worker node into the Pod.
Example:
volumes:
  - name: host-storage
    hostPath:
      path: /mnt/data
This is useful for testing, but it is risky in production because the Pod becomes dependent on a specific node path.
Why Is hostPath Risky in Production?
[ Pod scheduled on Node-1 ]
|
v
Uses /mnt/data
Pod rescheduled to Node-2
|
v
/mnt/data may not exist
This can cause application failure.
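One way to surface this problem early is the optional `type` field of hostPath. In the sketch below, `type: Directory` requires that `/mnt/data` already exists on the node, so a Pod scheduled onto a node without that directory fails with a mount error instead of starting against an unexpected path. The Pod name and mount path are hypothetical, and this does not remove the dependency on a specific node.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-demo           # hypothetical name
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: host-storage
          mountPath: /data
  volumes:
    - name: host-storage
      hostPath:
        path: /mnt/data
        type: Directory         # the directory must already exist on the node
```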
What is a PersistentVolume?
A PersistentVolume, also called PV, is a cluster-level storage resource managed by Kubernetes.
A PV represents actual storage such as:
- Local disk
- NFS storage
- AWS EBS volume
- Azure Disk
- Google Persistent Disk
- Storage from CSI drivers
The important point is that PV exists independently of Pods.
PersistentVolume Diagram
[ Kubernetes Cluster ]
|
v
[ PersistentVolume ]
|
v
[ Real Storage System ]
PersistentVolume YAML Example
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-demo
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /mnt/data
Explanation
| Field | Meaning |
|---|---|
| capacity | Total storage size |
| accessModes | How storage can be mounted |
| hostPath | Backing storage; here a directory on the node, suitable only for single-node testing |
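For comparison, here is a sketch of a PV backed by NFS, one of the storage types listed earlier, instead of a node-local path. The PV name, server address, and export path are placeholders, and the `Retain` reclaim policy is an illustrative choice that keeps the data when the claim is deleted.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nfs-demo                       # hypothetical name
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany                       # NFS can be mounted by many nodes
  persistentVolumeReclaimPolicy: Retain   # keep data after the PVC is deleted
  nfs:
    server: nfs.example.internal          # placeholder NFS server
    path: /exports/data                   # placeholder export path
```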
What is a PersistentVolumeClaim?
A PersistentVolumeClaim, also called PVC, is a storage request made by an application or developer.
A simple way to understand:
PV is the actual storage. PVC is the request for storage.
Real-World Analogy
Think of a company parking area.
- Parking slots already exist → PersistentVolumes
- Employee requests one parking slot → PersistentVolumeClaim
- System assigns matching slot → PV binds to PVC
PersistentVolumeClaim YAML Example
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
This PVC requests 5Gi storage.
PV and PVC Binding Flow
[ Admin Creates PV ]
|
v
PV Available in Cluster
|
v
[ Developer Creates PVC ]
|
v
Kubernetes Finds Matching PV
|
v
PVC Binds to PV
|
v
Pod Uses PVC
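Matching takes capacity, access modes, and storage class into account. In a cluster with a default StorageClass, a PVC that omits `storageClassName` may trigger dynamic provisioning instead of binding to a manually created PV. A common approach, sketched here under the assumption of a class name `manual` that has no provisioner behind it, is to set the same `storageClassName` on both objects. All resource names are hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-static-demo
spec:
  storageClassName: manual      # must match the PVC below
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /mnt/data             # test-only backing storage
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-static-demo
spec:
  storageClassName: manual      # must match the PV above
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi              # must not exceed the PV capacity
```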
Using PVC in a Pod
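apiVersion: v1
kind: Pod
metadata:
  name: mysql-pod
spec:
  containers:
    - name: mysql
      image: mysql:8.0
      env:
        # The mysql image does not start without a root password setting.
        # The Secret name and key are placeholders; the Secret must exist beforehand.
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret
              key: root-password
      volumeMounts:
        - name: mysql-storage
          mountPath: /var/lib/mysql
  volumes:
    - name: mysql-storage
      persistentVolumeClaim:
        claimName: mysql-pvc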
This Pod mounts persistent storage at:
/var/lib/mysql
This is where MySQL stores database files.
Real-Time MySQL Example
Suppose an e-commerce platform stores:
- Customer accounts
- Orders
- Payments
- Product inventory
- Invoices
MySQL must use persistent storage.
[ MySQL Pod ]
|
v
/var/lib/mysql
|
v
[ PVC ]
|
v
[ PersistentVolume ]
|
v
[ Cloud Disk / Storage ]
Even if MySQL Pod restarts, order data remains safe.
Access Modes in PersistentVolumes
| Access Mode | Meaning | Common Use Case |
|---|---|---|
| ReadWriteOnce | Mounted as read-write by one node (several Pods on that node can share it) | MySQL, PostgreSQL |
| ReadOnlyMany | Mounted read-only by many nodes | Shared static files |
| ReadWriteMany | Mounted read-write by many nodes | NFS, shared uploads |
| ReadWriteOncePod | Mounted as read-write by a single Pod in the whole cluster | Strict single-writer workloads |
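An access mode is requested in the PVC. The sketch below asks for `ReadWriteOncePod`, which is only supported for CSI-provisioned volumes. The claim name is hypothetical, and the example assumes the cluster's default StorageClass is backed by a CSI driver.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: single-writer-pvc       # hypothetical name
spec:
  accessModes:
    - ReadWriteOncePod          # only one Pod in the cluster may use this claim
  resources:
    requests:
      storage: 5Gi
```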
Real-Time Healthcare Example
In a healthcare platform, doctors may upload:
- Patient reports
- Scan images
- Prescription PDFs
- Insurance documents
These files cannot disappear when Pods restart.
A shared persistent volume with ReadWriteMany access can store these documents so that every replica of the upload service sees the same files.
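A minimal sketch of this setup is a ReadWriteMany claim mounted by every replica of a Deployment. The names, the `nginx` placeholder image, the mount path, and the `shared-files` StorageClass are assumptions; the class must be backed by storage that actually supports ReadWriteMany, such as NFS.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: patient-docs-pvc
spec:
  storageClassName: shared-files    # hypothetical RWX-capable class
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 50Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: upload-service
spec:
  replicas: 2                       # both replicas mount the same claim
  selector:
    matchLabels:
      app: upload-service
  template:
    metadata:
      labels:
        app: upload-service
    spec:
      containers:
        - name: upload-service
          image: nginx              # placeholder for the real upload service
          volumeMounts:
            - name: patient-docs
              mountPath: /var/uploads
      volumes:
        - name: patient-docs
          persistentVolumeClaim:
            claimName: patient-docs-pvc
```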
What is StorageClass?
A StorageClass defines how Kubernetes should dynamically provision storage.
Without StorageClass, administrators may need to manually create PVs.
With StorageClass, Kubernetes can automatically create storage when a PVC is created.
Dynamic Provisioning Flow
[ Developer Creates PVC ]
|
v
PVC References StorageClass
|
v
Kubernetes Requests Cloud Storage
|
v
Cloud Disk Created Automatically
|
v
PV Bound to PVC
StorageClass Example
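apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-storage
# Provisioner name of the AWS EBS CSI driver, which must be installed in the cluster.
# It replaces the legacy in-tree kubernetes.io/aws-ebs provisioner.
provisioner: ebs.csi.aws.com
parameters:
  type: gp3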
In cloud environments, StorageClasses simplify storage automation.
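To trigger dynamic provisioning, a PVC names the StorageClass it wants. The sketch below references the `fast-storage` class from the example above; the claim name and the requested size are illustrative.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: orders-db-pvc           # hypothetical name
spec:
  storageClassName: fast-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi             # size of the disk to be created
```

No PV has to be created by hand; the provisioner creates one and binds it to this claim.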
Static Provisioning vs Dynamic Provisioning
| Type | How It Works | Best For |
|---|---|---|
| Static Provisioning | Admin manually creates PV | Controlled environments |
| Dynamic Provisioning | Kubernetes creates PV automatically | Cloud-native systems |
Persistent Storage in Banking System
[ Banking APIs ]
|
v
[ Transaction Database Pod ]
|
v
[ PVC ]
|
v
[ Encrypted Cloud Storage ]
In banking systems, storage must support:
- Durability
- Encryption
- Backups
- Disaster recovery
- Compliance auditing
A simple Pod restart should never affect transaction data.
Persistent Storage in E-Commerce System
[ Product Service ]
|
v
[ Product Images PVC ]
[ Order Service ]
|
v
[ Order Database PVC ]
[ Invoice Service ]
|
v
[ Invoice PDF Storage ]
Different services may use different storage strategies.
- Databases use block storage
- Images may use object storage
- Shared files may use NFS
Common Mistakes Developers Make
1. Using hostPath in Production
hostPath ties Pods to a specific node and can break during rescheduling.
2. Ignoring Access Modes
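A ReadWriteOnce volume can be attached to only one node. When replicas that share it are scheduled on different nodes, the extra Pods typically stay stuck while waiting for the volume to attach.
3. No Backup Strategy
Persistent storage is not a backup. A PV protects data from Pod restarts, not from accidental deletion, corruption, or loss of the underlying disk.
4. Not Monitoring Disk Usage
A volume that fills up can make writes fail and crash the application, so track usage and alert before it is full.
5. Running Databases Without PVC
Without a PVC, the database writes to the container filesystem, and all data is lost when the Pod is deleted.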
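As one possible building block for mistake 3, a CSI volume snapshot captures a point-in-time copy of a claim. This sketch assumes the cluster has the snapshot CRDs and snapshot controller installed and a CSI driver that supports snapshots; the snapshot name and the `csi-snapclass` VolumeSnapshotClass are hypothetical. A snapshot usually lives in the same storage system as the disk, so it complements rather than replaces off-site backups.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: mysql-pvc-snapshot                   # hypothetical name
spec:
  volumeSnapshotClassName: csi-snapclass     # hypothetical snapshot class
  source:
    persistentVolumeClaimName: mysql-pvc     # the claim to snapshot
```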
Production Debugging Workflow
Step 1: Check PVC status
kubectl get pvc
Step 2: Check PV status
kubectl get pv
Step 3: Describe PVC
kubectl describe pvc mysql-pvc
Step 4: Check Pod events
kubectl describe pod mysql-pod
Step 5: Check storage class
kubectl get storageclass
Common PVC Status Values
| Status | Meaning |
|---|---|
| Pending | PVC is waiting for matching PV or dynamic provisioning |
| Bound | PVC successfully connected to PV |
| Lost | The PV that the PVC was bound to no longer exists |
Interview Questions
Q1: What is a Kubernetes Volume?
A Volume is storage attached to a Pod and accessible by containers inside that Pod.
Q2: What is the difference between a PV and a PVC?
A PV is the actual storage resource. A PVC is a request for that storage by an application.
Q3: What is StorageClass?
StorageClass defines how Kubernetes dynamically provisions storage.
Q4: What is ReadWriteOnce?
It means storage can be mounted as read-write by one node.
Q5: Why avoid hostPath in production?
Because it ties storage to a specific worker node and reduces portability.
Interview Trap Questions
Does a PVC automatically back up data?
No. A backup strategy must be configured separately.
Can multiple Pods write to a ReadWriteOnce volume?
Yes, but only if they run on the same node. ReadWriteOnce restricts the volume to one node, not to one Pod. To limit access to a single Pod, use ReadWriteOncePod.
Does deleting a Pod delete its PVC?
No. The PVC remains until it is explicitly deleted.
Can a PersistentVolume exist without a Pod?
Yes. A PV is independent of Pods.
Recommended Learning Path
- Kubernetes Pods
- Kubernetes Deployments
- Kubernetes Services
- ConfigMaps
- Kubernetes Secrets
- Persistent Storage
- Kubernetes StatefulSets
Summary
Persistent storage is one of the most important Kubernetes concepts for real-world applications.
Pods are temporary, but business data must survive Pod restarts, crashes, deployments, and node failures.
Volumes, PersistentVolumes, PersistentVolumeClaims, and StorageClasses work together to provide reliable storage for stateful workloads.
For production systems such as banking, e-commerce, healthcare, and SaaS platforms, persistent storage must be designed carefully with backups, monitoring, encryption, and disaster recovery.
Understanding Kubernetes storage deeply helps developers build reliable, scalable, and enterprise-ready applications.