Load balancers distribute incoming traffic across multiple servers, enabling horizontal scaling, high availability, and performance optimization. They're essential for any production infrastructure.
What is a Load Balancer?
Load Balancer = Device/service that distributes traffic:
Users (requests)
↓
[Load Balancer - Decides where to send traffic]
┌─────┬─────────┬─────┐
↓ ↓ ↓ ↓
Server1 Server2 Server3 Server4
↑ ↑ ↑ ↑
└─────┴─────────┴─────┘
↓
[Responses aggregated]
↓
Users
Benefits:
- Scalability — Add or remove servers as traffic changes (horizontal scaling)
- High Availability — If one server fails, traffic goes to the others
- Performance — Spread load so no single server is overloaded
- Session Persistence — Keep a user on the same server when needed
Load Balancer Levels
Layer 4 (Transport) - L4LB:
- Based on: TCP/UDP, source/destination IP, source/destination port
- Speed: Very fast (forwards connections without inspecting payloads)
- Intelligence: Low
- Examples: nginx (stream module), HAProxy (TCP mode), AWS Network Load Balancer
- Use when: simple TCP/UDP load balancing is enough
Layer 7 (Application) - L7LB:
- Based on: HTTP headers, URL paths, hostnames, cookies
- Speed: Slower (must parse full HTTP requests)
- Intelligence: High
- Examples: nginx, HAProxy (HTTP mode), AWS Application Load Balancer
- Use when: you need content-based routing (paths, hostnames, cookies)
Load Balancing Algorithms
| Algorithm | Behavior | When to Use |
|---|---|---|
| Round-Robin | Each server in turn | Equal capacity servers |
| Weighted | Assign weight to each | Different server sizes |
| Least Connections | Fewest active connections | Variable request duration |
| IP Hash | Same IP → same server | Session persistence |
| Random | Random selection | Simple, distributed |
| Least Response Time | Fastest responding server | Performance critical |
Round-Robin Example:
Request 1 → Server 1
Request 2 → Server 2
Request 3 → Server 3
Request 4 → Server 1 (cycle repeats)
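The rotation above can be sketched in a few lines of Python (server names are illustrative):

```python
from itertools import cycle

# Minimal round-robin sketch: each request goes to the next
# server in the list, wrapping around at the end.
servers = ["server1", "server2", "server3"]
rr = cycle(servers)

for request_id in range(1, 5):
    print(f"Request {request_id} -> {next(rr)}")
# Request 4 -> server1 (cycle repeats)
```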
Weighted Example:
Server 1 (weight 2): 50% of traffic
Server 2 (weight 1): 25% of traffic
Server 3 (weight 1): 25% of traffic
Request 1,2 → Server 1
Request 3 → Server 2
Request 4 → Server 3
Request 5,6 → Server 1
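A naive weighted round-robin can be sketched by repeating each server in the rotation according to its weight (the weights mirror the example above; production balancers use smoother interleaving):

```python
from itertools import cycle

# Expand each server into the rotation "weight" times.
weights = {"server1": 2, "server2": 1, "server3": 1}
rotation = [name for name, w in weights.items() for _ in range(w)]
rr = cycle(rotation)

schedule = [next(rr) for _ in range(8)]
# Over any 4-request window, server1 handles 2 requests (50%).
```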
Load Balancer Configuration
nginx - Layer 7 Load Balancer:
upstream backend {
# Round-robin by default
server 192.168.1.101:8080;
server 192.168.1.102:8080;
server 192.168.1.103:8080;
# Alternative: weighted
# server 192.168.1.101:8080 weight=2;
# server 192.168.1.102:8080 weight=1;
# Alternative: least connections
# least_conn;
}
server {
listen 80;
server_name api.example.com;
client_max_body_size 100M;
location / {
proxy_pass http://backend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# Connection optimization
proxy_http_version 1.1;
proxy_set_header Connection "";
}
# Health check endpoint
location /health {
access_log off;
return 200 "healthy\n";
}
}
HAProxy - L4/L7 Load Balancer:
frontend web_frontend
bind *:80
mode http
default_backend web_backend
backend web_backend
mode http
balance roundrobin # or: leastconn, random, source
# Define backend servers
server web1 192.168.1.101:8080 check
server web2 192.168.1.102:8080 check
server web3 192.168.1.103:8080 check
# Health checks
option httpchk GET /health
http-check expect status 200
Health Checks
How the load balancer reacts to server state:
- Healthy server: LB checks → HTTP 200 OK ✓ → Include in pool
- Failed server: LB checks → HTTP 500 Error ✗ → Remove from pool
- After recovery: LB checks → HTTP 200 OK ✓ → Add back to pool
Types of Health Checks:
HTTP GET /health
├─ Status 200 = healthy
└─ Status != 200 = unhealthy
TCP Connection
├─ Can connect = healthy
└─ Connection refused = unhealthy
Custom Script
├─ Exit code 0 = healthy
└─ Exit code != 0 = unhealthy
nginx Health Check:
upstream backend {
server 192.168.1.101:8080 max_fails=3 fail_timeout=30s;
server 192.168.1.102:8080 max_fails=3 fail_timeout=30s;
}
Meaning: if a server fails 3 times within the 30-second window, mark it unavailable for the next 30 seconds.
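An active HTTP health check can be sketched as a small polling helper (the /health path and 5-second timeout are illustrative assumptions):

```python
import urllib.request

def is_healthy(url: str, timeout: float = 5.0) -> bool:
    """Return True if GET <url> answers HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, timeout, DNS failure: treat as unhealthy.
        return False

def healthy_pool(backends):
    """Keep only backends whose /health endpoint responds with 200."""
    return [b for b in backends if is_healthy(f"http://{b}/health")]
```

A balancer would run this on an interval and route new requests only to the returned pool.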
Session Persistence
Session Persistence Problem:
Server 1: "Your cart: [item A]"
Server 2: "Your cart: [empty]" ← Lost session!
Solution: Sticky Sessions (IP Hash):
upstream backend {
hash $remote_addr consistent;
server 192.168.1.101:8080;
server 192.168.1.102:8080;
server 192.168.1.103:8080;
}
Now: same client IP → always the same server
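The core of IP hashing is a stable hash of the client address modulo the pool size, sketched below. Note that nginx's consistent flag additionally minimizes remapping when servers are added or removed (consistent hashing), which this naive sketch omits:

```python
import hashlib

servers = ["192.168.1.101:8080", "192.168.1.102:8080", "192.168.1.103:8080"]

def pick_server(client_ip: str) -> str:
    # A stable (non-randomized) hash of the IP picks the backend,
    # so the same client always maps to the same server as long
    # as the pool is unchanged.
    digest = hashlib.md5(client_ip.encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]
```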
Alternative: Shared Session Store:
Server 1 → Write session to Redis
Server 2 → Read session from Redis
Any server can handle user requests
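The shared-store idea can be sketched as follows; a dict stands in for Redis here, but a real Redis client exposes the same get/set shape:

```python
import json

class SessionStore:
    """Shared session store: any app server reads/writes through it."""

    def __init__(self):
        self._backend = {}  # stand-in for Redis

    def save(self, session_id: str, data: dict) -> None:
        self._backend[session_id] = json.dumps(data)

    def load(self, session_id: str) -> dict:
        raw = self._backend.get(session_id)
        return json.loads(raw) if raw else {}

store = SessionStore()
# "Server 1" writes the cart; any other server sees the same data.
store.save("sess-42", {"cart": ["item A"]})
```

Because no session state lives in server memory, the balancer is free to send each request to any backend.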
SSL/TLS Termination
Client-LB-Server Flow:
Client: HTTPS request (encrypted)
↓
[Load Balancer] (SSL termination point)
├─ Decrypt using certificate
├─ Route to backend
↓
Server: Receive unencrypted HTTP
nginx SSL Termination:
server {
listen 443 ssl http2;
ssl_certificate /etc/ssl/certs/server.crt;
ssl_certificate_key /etc/ssl/private/server.key;
location / {
# Connection to backend is HTTP (unencrypted)
proxy_pass http://backend;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
Benefit: the CPU-heavy encryption work is done at the LB; backends focus on app logic.
Cloud Load Balancers
AWS Elastic Load Balancer (Classic ELB; newer ALB/NLB use the aws elbv2 commands):
# Create load balancer
aws elb create-load-balancer \
--load-balancer-name my-lb \
--listeners Protocol=HTTP,LoadBalancerPort=80,InstancePort=8080
# Register instances
aws elb register-instances-with-load-balancer \
--load-balancer-name my-lb \
--instances i-1234567890abcdef0 i-0987654321fedcba0
# Configure health check
aws elb configure-health-check \
--load-balancer-name my-lb \
--health-check Target=HTTP:8080/health,Interval=30,Timeout=5,HealthyThreshold=2,UnhealthyThreshold=2
GCP Load Balancer:
# Create a managed instance group (the instance template name is illustrative)
gcloud compute instance-groups managed create web-servers \
--zone us-central1-a \
--size 3 \
--template web-template
# Create health check
gcloud compute health-checks create http web-health \
--port 8080 \
--request-path /health
# Create load balancer (global forwarding rule fronting an HTTP proxy)
gcloud compute forwarding-rules create web-lb \
--load-balancing-scheme EXTERNAL \
--target-http-proxy web-proxy \
--ports 80 \
--global
Kubernetes Service LoadBalancer:
apiVersion: v1
kind: Service
metadata:
name: my-service
spec:
type: LoadBalancer
selector:
app: myapp
ports:
- port: 80
targetPort: 8080
Kubernetes (through the cloud provider integration) automatically:
- Creates load balancer
- Registers pod IPs
- Manages traffic distribution
- Performs health checks
Global Load Balancing
Global Load Balancing Purpose:
US User → Route to US datacenter
EU User → Route to EU datacenter
Asia User → Route to Asia datacenter
Methods:
- GeoDNS — DNS resolves the same hostname to different IPs based on client location:
www.example.com → 203.0.113.1 (for US clients)
www.example.com → 198.51.100.1 (for EU clients)
- Anycast — Multiple datacenters announce the same IP;
BGP routes each user to the nearest datacenter
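The GeoDNS method above boils down to a region-to-record lookup on the authoritative DNS server; a toy sketch using the example IPs (which are documentation-range addresses):

```python
# Toy GeoDNS table: the authoritative server returns a different
# A record for the same hostname depending on the client's region.
GEO_RECORDS = {
    "US": "203.0.113.1",
    "EU": "198.51.100.1",
}

def resolve(hostname: str, client_region: str) -> str:
    # Fall back to the US datacenter for unknown regions
    # (fallback policy is an assumption for this sketch).
    return GEO_RECORDS.get(client_region, GEO_RECORDS["US"])
```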
Troubleshooting Load Balancer
"Traffic not distributed evenly"
Check algorithm:
nginx: hash $remote_addr too sticky?
Try: least_conn instead
Check server weights:
Are weights proportional to each server's actual capacity?
Check instance health
Check health checks:
A failing check removes that server from the pool, shifting its traffic to the rest
"Sessions lost after LB restart"
Using sticky sessions?
→ IP → Server1 mapping lost
→ Same user → Server2
→ Session lost
Solution:
Move to shared session store (Redis)
Key Concepts
- Load Balancer = Distributes traffic across servers
- L4 LB = TCP/UDP based, fast
- L7 LB = HTTP based, smart routing
- Round-Robin = Equal distribution
- Weighted = Proportional to server capacity
- Health Check = Detect failed servers
- Session Persistence = Keep user on same server
- SSL Termination = Decrypt at LB, not backend
- Affinity = Route same client to same server
- Use load balancers for high availability