Resource Management: Requests and Limits

Efficient resource management is one of the most important aspects of running applications in Kubernetes. Pods consume CPU and memory resources from nodes, and without proper configuration, they can either starve other workloads or fail due to insufficient resources. Kubernetes provides requests and limits to control how resources are allocated and consumed.

Understanding Requests and Limits

Requests

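A request is the amount of CPU or memory that Kubernetes reserves for a container. The scheduler uses requests, not actual usage, to decide which node can run the Pod: a node qualifies only if its allocatable capacity, minus the requests of the Pods already placed there, covers the new Pod’s requests. If no node has enough unreserved capacity, the Pod is not scheduled and stays in the Pending state.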

Limits

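A limit is the maximum amount of CPU or memory a container can use. If a container tries to use more memory than its limit, it is terminated by the out-of-memory (OOM) killer. CPU is handled differently: a container that reaches its CPU limit is throttled rather than killed, so it runs more slowly but keeps running.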

YAML Example: Requests and Limits

apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: demo-container
    image: nginx
    resources:
      requests:
        memory: "256Mi"
        cpu: "250m"
      limits:
        memory: "512Mi"
        cpu: "500m"

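Explanation: This Pod requests 256Mi of memory and 250m of CPU (250 millicores, or 0.25 of a core), which the scheduler reserves on the chosen node. The container is throttled at 500m of CPU (0.5 of a core) and is terminated if it uses more than 512Mi of memory.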

Flowchart: Resource Allocation


   Pod created ---> Scheduler checks requests ---> Finds suitable node
          |
          v
   Pod runs ---> Container consumes resources ---> Limited by defined limits
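
The scheduler’s check in the flowchart comes down to simple arithmetic on requests. The fragment below is an illustrative sketch, not output from a real cluster: the node capacity and the totals in the comments are assumed values chosen to keep the numbers easy to follow.

```yaml
# Illustrative fragment of a Node object (all values are assumptions).
status:
  allocatable:
    cpu: "2"        # 2000m available to Pods
    memory: 4Gi     # 4096Mi available to Pods

# Suppose the Pods already on this node request 1800m CPU and 1Gi memory in total.
# A new Pod requesting cpu: 250m and memory: 256Mi does NOT fit:
#   CPU:    2000m  - 1800m  = 200m free    <  250m requested
#   Memory: 4096Mi - 1024Mi = 3072Mi free  >= 256Mi requested
# Every requested resource must fit, so the scheduler skips this node.
# If the existing CPU requests totalled 1500m instead, 500m would be free
# and the Pod could be placed here.
```

The comparison uses the sum of requests, not the CPU or memory the running containers are actually consuming at that moment.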

Why Requests and Limits Matter

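Fairness: Limits prevent one workload from consuming all of a node’s resources.
Stability: Requests reserve capacity, so Pods are less likely to fail from resource starvation.
Predictability: Reserved resources give each container a known baseline to run on.
Scalability: Autoscalers such as the Horizontal Pod Autoscaler measure CPU and memory utilization relative to requests, so accurate requests lead to sensible scaling decisions.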
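
As a sketch of the autoscaling point, a Horizontal Pod Autoscaler can target average CPU utilization, which is expressed as a percentage of each Pod’s CPU request. The manifest below assumes a hypothetical Deployment named demo-app; the replica bounds and the 70% target are illustrative values, not recommendations.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: demo-app-hpa            # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: demo-app              # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70  # percent of the CPU request, averaged across Pods
```

With a CPU request of 200m, a 70% target corresponds to an average of about 140m of CPU use per Pod. Utilization can only be computed for containers that declare a CPU request, which is one more reason to set requests.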

Real-World Example

In a payment microservice:

Requests: 200m CPU and 256Mi memory to ensure smooth operation.
Limits: 500m CPU and 512Mi memory to prevent runaway resource usage during traffic spikes.
Outcome: The service remains responsive without affecting other workloads.
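
Expressed as configuration, the payment service’s settings might look like the fragment below. The container name and image are hypothetical placeholders; only the resource values come from the example above.

```yaml
# Fragment of a Pod template; name and image are placeholders.
containers:
- name: payment-service
  image: example.com/payment-service:1.0
  resources:
    requests:
      cpu: "200m"        # reserved for steady-state operation
      memory: "256Mi"
    limits:
      cpu: "500m"        # throttled above this during traffic spikes
      memory: "512Mi"    # terminated if usage exceeds this
```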

Common Mistakes

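Not setting requests and limits, so the scheduler places Pods without knowing what they need and a single container can consume a whole node.
Setting requests too high, which wastes reserved capacity and can leave Pods stuck in Pending because no node can satisfy them.
Setting limits too low, which causes containers to be killed (memory) or throttled (CPU) under load.
Confusing requests with limits: requests are reservations used for scheduling, while limits are caps enforced at runtime.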

Interview Notes

Q1: What is the difference between requests and limits?

Answer: Requests guarantee minimum resources for scheduling, while limits cap the maximum resources a container can use.

Q2: What happens if a container exceeds its memory limit?

Answer: The container is terminated with the reason OOMKilled, and the kubelet restarts it according to the Pod’s restartPolicy.
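
A memory-limit kill is usually visible in the Pod’s status afterwards. The fragment below is an illustrative sketch of what `kubectl get pod <name> -o yaml` might show for the affected container; the container name and restart count are assumed values.

```yaml
# Illustrative fragment of a Pod's status (values are assumptions).
status:
  containerStatuses:
  - name: demo-container
    restartCount: 1          # incremented each time the container is restarted
    lastState:
      terminated:
        reason: OOMKilled    # killed for exceeding its memory limit
        exitCode: 137        # 128 + 9, i.e. ended by SIGKILL
```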

Q3: How does Kubernetes handle CPU limits?

Answer: A container that reaches its CPU limit is throttled: it receives no more CPU time than the limit allows, but unlike the memory case it is not terminated.
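
To make throttling concrete: on Linux nodes, CPU limits are typically enforced by giving the container a quota of CPU time per scheduling period. The comments below work through the arithmetic as a sketch, assuming the default period of 100ms; clusters that change the period will see different numbers.

```yaml
resources:
  limits:
    cpu: "500m"   # 0.5 core = 50ms of CPU time per 100ms period (default period assumed)

# Within each 100ms period, once the container's threads have used 50ms of CPU
# time in total, they are paused until the next period begins.
# A limit of "2" would allow 200ms of CPU time per 100ms period,
# which is two cores' worth of work.
```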

Q4: Example Interview Task: Write a Deployment with two replicas whose container sets CPU and memory requests and limits.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: resource-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      containers:
      - name: demo-container
        image: nginx
        resources:
          requests:
            cpu: "200m"
            memory: "256Mi"
          limits:
            cpu: "400m"
            memory: "512Mi"

Explanation: Each of the two replicas created by this Deployment reserves 200m of CPU and 256Mi of memory, and its container is capped at 400m of CPU and 512Mi of memory.

Advanced Notes

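ResourceQuotas: Limit the total resources that all Pods in a namespace can request or consume.
LimitRanges: Define default requests and limits for containers in a namespace, and can enforce minimum and maximum values.
Horizontal Pod Autoscaler (HPA): Scales the number of Pod replicas automatically based on resource metrics such as CPU utilization.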
Best Practices: Always set requests and limits, monitor usage, and adjust based on workload patterns.
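
The two namespace-level objects above can be sketched as follows. The namespace `demo`, the object names, and all of the quantities are hypothetical values chosen for illustration.

```yaml
# Caps the combined requests and limits of all Pods in the namespace.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: demo-quota
  namespace: demo
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 4Gi
    limits.cpu: "4"
    limits.memory: 8Gi
---
# Supplies defaults for containers that omit requests or limits.
apiVersion: v1
kind: LimitRange
metadata:
  name: demo-defaults
  namespace: demo
spec:
  limits:
  - type: Container
    defaultRequest:      # applied when a container sets no request
      cpu: 200m
      memory: 256Mi
    default:             # applied when a container sets no limit
      cpu: 400m
      memory: 512Mi
```

The two work well together: once a quota covers requests or limits for CPU or memory, Pods that omit those values are rejected, and a LimitRange fills them in so such Pods can still be admitted.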

Summary

Requests and limits are essential for Kubernetes resource management. Requests guarantee minimum resources for scheduling, while limits cap maximum usage. Proper configuration ensures fairness, stability, and scalability. By avoiding common mistakes and understanding advanced features like ResourceQuotas and LimitRanges, developers can build resilient applications and confidently answer interview questions on Kubernetes resource management.