NZroot.dev | Introduction to Kubernetes Scaling

Kubernetes is the de facto standard for container orchestration. One of its most powerful features is the ability to scale workloads automatically.

Horizontal Pod Autoscaler (HPA)

The HPA automatically scales the number of Pods in a replication controller, deployment, replica set, or stateful set based on observed CPU utilization.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

This configuration ensures that your application always has enough resources to handle the current load, without wasting money on idle infrastructure.

Why Scale?

Cost Efficiency: Pay only for what you use.
Reliability: Handle traffic spikes without crashing.
Performance: Maintain low latency for users.