Load balancers distribute incoming traffic across multiple servers, enabling horizontal scaling, high availability, and performance optimization. They're essential for any production infrastructure.
What is a Load Balancer?
Load Balancer = Device/service that distributes traffic:
Users (requests)
↓
[Load Balancer - Decides where to send traffic]
┌─────┬─────────┬─────┐
↓ ↓ ↓ ↓
Server1 Server2 Server3 Server4
↑ ↑ ↑ ↑
└─────┴─────────┴─────┘
↓
[Responses aggregated]
↓
Users
Benefits:
- Scalability — Add or remove servers as traffic changes (horizontal scaling)
- High Availability — If one server fails, traffic goes to the others
- Performance — Spread load so no single server is overloaded
- Session Persistence — Keep a user on the same server when needed
Load Balancer Levels
Layer 4 (Transport) - L4LB:
- Based on: TCP/UDP, source/destination IP, source/destination port
- Speed: Very fast (forwards connections without inspecting payloads)
- Intelligence: Low
- Examples: nginx (stream module), HAProxy (TCP mode), AWS Network Load Balancer
- Use when: simple TCP/UDP load balancing is enough
Layer 7 (Application) - L7LB:
- Based on: HTTP headers, URL paths, hostnames, cookies
- Speed: Slower (must parse full HTTP requests)
- Intelligence: High
- Examples: nginx, HAProxy (HTTP mode), AWS Application Load Balancer
- Use when: you need content-based routing (paths, hostnames, cookies)
Load Balancing Algorithms
| Algorithm | Behavior | When to Use |
|---|---|---|
| Round-Robin | Each server in turn | Equal capacity servers |
| Weighted | Assign weight to each | Different server sizes |
| Least Connections | Fewest active connections | Variable request duration |
| IP Hash | Same IP → same server | Session persistence |
| Random | Random selection | Simple, distributed |
| Least Response Time | Fastest responding server | Performance critical |
Round-Robin Example:
Request 1 → Server 1
Request 2 → Server 2
Request 3 → Server 3
Request 4 → Server 1 (cycle repeats)
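The rotation above can be sketched in a few lines of Python (server names are illustrative):

```python
from itertools import cycle

# Minimal round-robin sketch: each request goes to the next
# server in the list, wrapping around at the end.
servers = ["server1", "server2", "server3"]
rr = cycle(servers)

for request_id in range(1, 5):
    print(f"Request {request_id} -> {next(rr)}")
# Request 4 -> server1 (cycle repeats)
```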
Weighted Example:
Server 1 (weight 2): 50% of traffic
Server 2 (weight 1): 25% of traffic
Server 3 (weight 1): 25% of traffic
Request 1,2 → Server 1
Request 3 → Server 2
Request 4 → Server 3
Request 5,6 → Server 1
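A naive weighted round-robin can be sketched by repeating each server in the rotation according to its weight (the weights mirror the example above; production balancers use smoother interleaving):

```python
from itertools import cycle

# Expand each server into the rotation "weight" times.
weights = {"server1": 2, "server2": 1, "server3": 1}
rotation = [name for name, w in weights.items() for _ in range(w)]
rr = cycle(rotation)

schedule = [next(rr) for _ in range(8)]
# Over any 4-request window, server1 handles 2 requests (50%).
```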
Load Balancer Configuration
nginx - Layer 7 Load Balancer:
upstream backend {
# Round-robin by default
server 192.168.1.101:8080;
server 192.168.1.102:8080;
server 192.168.1.103:8080;
# Alternative: weighted
# server 192.168.1.101:8080 weight=2;
# server 192.168.1.102:8080 weight=1;
# Alternative: least connections
# least_conn;
}
server {
listen 80;
server_name api.example.com;
client_max_body_size 100M;
location / {
proxy_pass http://backend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# Connection optimization
proxy_http_version 1.1;
proxy_set_header Connection "";
}
# Health check endpoint
location /health {
access_log off;
return 200 "healthy\n";
}
}
HAProxy - L4/L7 Load Balancer:
frontend web_frontend
bind *:80
mode http
default_backend web_backend
backend web_backend
mode http
balance roundrobin # or: leastconn, random, source
# Define backend servers
server web1 192.168.1.101:8080 check
server web2 192.168.1.102:8080 check
server web3 192.168.1.103:8080 check
# Health checks
option httpchk GET /health
http-check expect status 200
Health Checks
How the load balancer reacts to server state:
- Healthy server: LB checks → HTTP 200 OK ✓ → Include in pool
- Failed server: LB checks → HTTP 500 Error ✗ → Remove from pool
- After recovery: LB checks → HTTP 200 OK ✓ → Add back to pool
Types of Health Checks:
HTTP GET /health
├─ Status 200 = healthy
└─ Status != 200 = unhealthy
TCP Connection
├─ Can connect = healthy
└─ Connection refused = unhealthy
Custom Script
├─ Exit code 0 = healthy
└─ Exit code != 0 = unhealthy
nginx Health Check:
upstream backend {
server 192.168.1.101:8080 max_fails=3 fail_timeout=30s;
server 192.168.1.102:8080 max_fails=3 fail_timeout=30s;
}
Meaning: if a server fails 3 times within the 30-second window, mark it unavailable for the next 30 seconds.
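An active HTTP health check can be sketched as a small polling helper (the /health path and 5-second timeout are illustrative assumptions):

```python
import urllib.request

def is_healthy(url: str, timeout: float = 5.0) -> bool:
    """Return True if GET <url> answers HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, timeout, DNS failure: treat as unhealthy.
        return False

def healthy_pool(backends):
    """Keep only backends whose /health endpoint responds with 200."""
    return [b for b in backends if is_healthy(f"http://{b}/health")]
```

A balancer would run this on an interval and route new requests only to the returned pool.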
Session Persistence
Session Persistence Problem:
Server 1: "Your cart: [item A]"
Server 2: "Your cart: [empty]" ← Lost session!
Solution: Sticky Sessions (IP Hash):
upstream backend {
hash $remote_addr consistent;
server 192.168.1.101:8080;
server 192.168.1.102:8080;
server 192.168.1.103:8080;
}
Now: same client IP → always the same server
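The core of IP hashing is a stable hash of the client address modulo the pool size, sketched below. Note that nginx's consistent flag additionally minimizes remapping when servers are added or removed (consistent hashing), which this naive sketch omits:

```python
import hashlib

servers = ["192.168.1.101:8080", "192.168.1.102:8080", "192.168.1.103:8080"]

def pick_server(client_ip: str) -> str:
    # A stable (non-randomized) hash of the IP picks the backend,
    # so the same client always maps to the same server as long
    # as the pool is unchanged.
    digest = hashlib.md5(client_ip.encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]
```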
Alternative: Shared Session Store:
Server 1 → Write session to Redis
Server 2 → Read session from Redis
Any server can handle user requests
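The shared-store idea can be sketched as follows; a dict stands in for Redis here, but a real Redis client exposes the same get/set shape:

```python
import json

class SessionStore:
    """Shared session store: any app server reads/writes through it."""

    def __init__(self):
        self._backend = {}  # stand-in for Redis

    def save(self, session_id: str, data: dict) -> None:
        self._backend[session_id] = json.dumps(data)

    def load(self, session_id: str) -> dict:
        raw = self._backend.get(session_id)
        return json.loads(raw) if raw else {}

store = SessionStore()
# "Server 1" writes the cart; any other server sees the same data.
store.save("sess-42", {"cart": ["item A"]})
```

Because no session state lives in server memory, the balancer is free to send each request to any backend.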
SSL/TLS Termination
Client-LB-Server Flow:
Client: HTTPS request (encrypted)
↓
[Load Balancer] (SSL termination point)
├─ Decrypt using certificate
├─ Route to backend
↓
Server: Receive unencrypted HTTP
nginx SSL Termination:
server {
listen 443 ssl http2;
ssl_certificate /etc/ssl/certs/server.crt;
ssl_certificate_key /etc/ssl/private/server.key;
location / {
# Connection to backend is HTTP (unencrypted)
proxy_pass http://backend;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
Benefit: the CPU-heavy encryption work is done at the LB; backends focus on app logic.
Cloud Load Balancers
AWS Elastic Load Balancer (Classic ELB; newer ALB/NLB use the aws elbv2 commands):
# Create load balancer
aws elb create-load-balancer \
--load-balancer-name my-lb \
--listeners Protocol=HTTP,LoadBalancerPort=80,InstancePort=8080
# Register instances
aws elb register-instances-with-load-balancer \
--load-balancer-name my-lb \
--instances i-1234567890abcdef0 i-0987654321fedcba0
# Configure health check
aws elb configure-health-check \
--load-balancer-name my-lb \
--health-check Target=HTTP:8080/health,Interval=30,Timeout=5,HealthyThreshold=2,UnhealthyThreshold=2
GCP Load Balancer:
# Create a managed instance group (the instance template name is illustrative)
gcloud compute instance-groups managed create web-servers \
--zone us-central1-a \
--size 3 \
--template web-template
# Create health check
gcloud compute health-checks create http web-health \
--port 8080 \
--request-path /health
# Create load balancer (global forwarding rule fronting an HTTP proxy)
gcloud compute forwarding-rules create web-lb \
--load-balancing-scheme EXTERNAL \
--target-http-proxy web-proxy \
--ports 80 \
--global
Kubernetes Service LoadBalancer:
apiVersion: v1
kind: Service
metadata:
name: my-service
spec:
type: LoadBalancer
selector:
app: myapp
ports:
- port: 80
targetPort: 8080
Kubernetes (through the cloud provider integration) automatically:
- Creates load balancer
- Registers pod IPs
- Manages traffic distribution
- Performs health checks
Global Load Balancing
Global Load Balancing Purpose:
US User → Route to US datacenter
EU User → Route to EU datacenter
Asia User → Route to Asia datacenter
Methods:
- GeoDNS — DNS resolves the same hostname to different IPs based on client location:
www.example.com → 203.0.113.1 (for US clients)
www.example.com → 198.51.100.1 (for EU clients)
- Anycast — Multiple datacenters announce the same IP;
BGP routes each user to the nearest datacenter
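The GeoDNS method above boils down to a region-to-record lookup on the authoritative DNS server; a toy sketch using the example IPs (which are documentation-range addresses):

```python
# Toy GeoDNS table: the authoritative server returns a different
# A record for the same hostname depending on the client's region.
GEO_RECORDS = {
    "US": "203.0.113.1",
    "EU": "198.51.100.1",
}

def resolve(hostname: str, client_region: str) -> str:
    # Fall back to the US datacenter for unknown regions
    # (fallback policy is an assumption for this sketch).
    return GEO_RECORDS.get(client_region, GEO_RECORDS["US"])
```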
Troubleshooting Load Balancer
"Traffic not distributed evenly"
Check algorithm:
nginx: hash $remote_addr too sticky?
Try: least_conn instead
Check server weights:
Are weights proportional to each server's actual capacity?
Check instance health
Check health checks:
A failing check removes that server from the pool, shifting its traffic to the rest
"Sessions lost after LB restart"
Using sticky sessions?
→ IP → Server1 mapping lost
→ Same user → Server2
→ Session lost
Solution:
Move to shared session store (Redis)
Key Concepts
- Load Balancer = Distributes traffic across servers
- L4 LB = TCP/UDP based, fast
- L7 LB = HTTP based, smart routing
- Round-Robin = Equal distribution
- Weighted = Proportional to server capacity
- Health Check = Detect failed servers
- Session Persistence = Keep user on same server
- SSL Termination = Decrypt at LB, not backend
- Affinity = Route same client to same server
- Use load balancers for high availability