GuideDevOps
Lesson 9 of 17

Volumes & Persistent Storage

Part of the Kubernetes tutorial series.

The Problem

By default, data inside a container is lost when the container restarts:

Pod starts → Container writes data
   ↓
Pod crashes → Container data lost
   ↓
Kubernetes restarts Pod → Data gone!

To persist data across Pod restarts, we use Volumes.


Volume Types

Ephemeral Volumes (Temporary)

emptyDir

A temporary directory shared by the containers in a Pod, deleted when the Pod is deleted.

apiVersion: v1
kind: Pod
metadata:
  name: cache-pod
spec:
  containers:
  - name: app
    image: myapp:latest
    volumeMounts:
    - name: cache
      mountPath: /var/cache
  - name: sidecar
    image: cache-sidecar:latest
    volumeMounts:
    - name: cache
      mountPath: /cache
  volumes:
  - name: cache
    emptyDir: {}

Use cases:

  • Temporary scratch space
  • Caching between containers
  • Shared state in multi-container Pods
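If the cache must not grow without bound, emptyDir also accepts a sizeLimit, and medium: Memory backs it with tmpfs. A sketch (the Pod name and sizes here are illustrative assumptions):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: memcache-pod          # hypothetical name
spec:
  containers:
  - name: app
    image: myapp:latest
    volumeMounts:
    - name: scratch
      mountPath: /var/cache
  volumes:
  - name: scratch
    emptyDir:
      medium: Memory          # tmpfs; usage counts against the container's memory limit
      sizeLimit: 256Mi        # the Pod is evicted if usage exceeds this
```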

Persistent Volumes (Long-term)

1. PersistentVolume (PV)

Represents actual storage in the cluster.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-database
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce        # Mountable read-write by a single node
  storageClassName: fast-ssd
  csi:                       # in-tree awsElasticBlockStore is deprecated; use the CSI driver
    driver: ebs.csi.aws.com
    volumeHandle: vol-12345678
    fsType: ext4

2. PersistentVolumeClaim (PVC)

A request for storage (like filling out a purchase order).

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 50Gi        # Request 50 GiB

Kubernetes binds the PVC to a PV whose capacity, access modes, and storageClassName all match (here, any fast-ssd PV of at least 50Gi).

3. Pod Using PVC

apiVersion: v1
kind: Pod
metadata:
  name: db-pod
spec:
  containers:
  - name: postgres
    image: postgres:14
    volumeMounts:
    - name: data
      mountPath: /var/lib/postgresql/data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: database-claim

StorageClass (Dynamic Provisioning)

Instead of manually creating PVs, use a StorageClass to provision storage automatically:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com    # AWS EBS provisioner
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
  fstype: ext4
 
---
 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-storage
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd      # Reference StorageClass
  resources:
    requests:
      storage: 100Gi
# Kubernetes automatically provisions an EBS volume!
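Two StorageClass fields are worth knowing: volumeBindingMode: WaitForFirstConsumer delays provisioning until a Pod using the claim is scheduled (so the volume is created in the right availability zone), and allowVolumeExpansion: true lets bound PVCs be resized later. A sketch, reusing the fast-ssd example above:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer   # provision only when a Pod is scheduled
allowVolumeExpansion: true                # permit growing bound PVCs later
parameters:
  type: gp3
```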

Built-in Storage Classes

kubectl get storageclass

Common provisioners:

  • AWS: ebs.csi.aws.com
  • Google Cloud: pd.csi.storage.gke.io
  • Azure: disk.csi.azure.com
  • NFS: nfs.csi.k8s.io

Access Modes

ReadWriteOnce (RWO)

The volume can be mounted read-write by a single node (Pods on other nodes cannot use it). Most commonly used. For a strict single-Pod guarantee there is also ReadWriteOncePod.

accessModes:
  - ReadWriteOnce

ReadOnlyMany (ROX)

The volume can be mounted read-only by many nodes; no Pod can write to it.

accessModes:
  - ReadOnlyMany

Use case: Shared read-only configuration

ReadWriteMany (RWX)

The volume can be mounted read-write by many nodes, so multiple Pods can read and write it simultaneously.

accessModes:
  - ReadWriteMany

Use case: Shared storage for multiple workers (NFS, EFS)
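An RWX claim looks the same as an RWO one apart from the access mode; it only binds if the backing storage supports multi-node mounts. A sketch (the efs-sc class name and claim name are assumptions, standing in for an EFS/NFS-backed StorageClass):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-workdir        # hypothetical name
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc    # assumed NFS/EFS-backed class; block storage like EBS cannot do RWX
  resources:
    requests:
      storage: 10Gi
```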


Practical Examples

Database with Persistent Storage

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard
  resources:
    requests:
      storage: 100Gi
 
---
 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
spec:
  replicas: 1
  selector:
    matchLabels:
      app: database
  template:
    metadata:
      labels:
        app: database
    spec:
      containers:
      - name: postgres
        image: postgres:14
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-secret
              key: password
        ports:
        - containerPort: 5432
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: postgres-pvc

Shared Config Volume (ConfigMap)

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  config.yaml: |
    server:
      port: 8080
      workers: 4
 
---
 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: app
        image: myapp:latest
        volumeMounts:
        - name: config
          mountPath: /etc/config
          readOnly: true
      volumes:
      - name: config
        configMap:
          name: app-config

Volume Lifecycle

PV States

Available
   ↓
(PVC created)
   ↓
Bound (attached to PVC)
   ↓
(PVC deleted)
   ↓
Released (no longer bound, data retained)
   ↓
Delete/Recycle (based on reclaim policy)

Reclaim Policies

What happens to a PV when a PVC is deleted:

spec:
  persistentVolumeReclaimPolicy: Delete    # Delete storage (default for dynamically provisioned PVs)
  # or
  persistentVolumeReclaimPolicy: Retain    # Keep storage (manual cleanup)
  # or
  persistentVolumeReclaimPolicy: Recycle   # Deprecated: scrub data for reuse (prefer dynamic provisioning)

Snapshots

Create point-in-time backups of volumes:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: database-snapshot
spec:
  volumeSnapshotClassName: csi-snapshotter
  source:
    persistentVolumeClaimName: postgres-pvc
 
---
 
# Restore from snapshot
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-restored
spec:
  dataSource:
    name: database-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
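Snapshots also require a VolumeSnapshotClass whose name matches the one referenced above (csi-snapshotter). A sketch for AWS EBS, where deletionPolicy plays the same role as a PV reclaim policy:

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-snapshotter
driver: ebs.csi.aws.com       # must match the CSI driver that provisioned the PVC
deletionPolicy: Delete        # or Retain, to keep the backing snapshot
```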

Best Practices

Use StorageClass for automatic provisioning

  • Easier than manual PV creation
  • Scales automatically

Set resource requests for storage

resources:
  requests:
    storage: 100Gi

Monitor storage usage

kubectl get pvc
kubectl describe pvc database-claim

Backup important data

Use volume snapshots or external backup solutions.

Use appropriate reclaim policy

  • Retain for important databases (manual cleanup)
  • Delete for temporary storage

Don't lose track of orphaned PVs

Regularly check:

kubectl get pv
kubectl get pvc -A

Don't exceed PV capacity

Monitor and resize as needed.
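If the StorageClass sets allowVolumeExpansion: true, a PVC can be grown by raising its request and re-applying (shrinking is not supported). A sketch, assuming the database-claim PVC from earlier:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 100Gi          # was 50Gi; applying this triggers expansion
```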