Benjamin Shawki
Potters.Tools

Engineering Zero-Downtime Deployments: The Hemiron Rollback System

DevOps, Reverse Proxy, Blue-Green Deployment, Nginx, Traefik, Zero-Downtime, Docker, GitLab CI/CD

A Performance Study of Traefik vs Nginx in Blue-Green Deployment Strategies

In modern web applications, minimizing downtime during deployments is crucial. As part of the Hemiron project at Hogeschool Leiden, I conducted a comprehensive performance comparison between Traefik and Nginx for implementing instant rollback features in blue-green deployment strategies.

This article compares Traefik and Nginx in the context of blue-green deployments, covering the test setup, the architecture of each implementation, and key insights for building reliable rollback systems.

Understanding Blue-Green Deployments

In blue-green deployments, two identical environments (blue and green) exist simultaneously, with only one environment serving production traffic at any time. When a new version is deployed, it's first released to the inactive environment. Once verified, traffic is redirected to this environment, minimizing downtime and risk.

The key innovation was implementing a system where each deployment receives a unique subdomain (e.g., commit-sha.example.com), while the main domain (example.com) points to the current production environment. This setup allows for instant rollbacks by simply updating the DNS or proxy configuration to point to a previous, still-running version.
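
To make the subdomain scheme concrete, here is a minimal sketch of how a per-deployment hostname could be derived from a commit SHA. The helper name `deployment_host` and the 8-character short SHA are assumptions for illustration, not taken from the Hemiron pipeline; `CI_COMMIT_SHA` is the predefined GitLab CI variable holding the full commit SHA.

```shell
#!/bin/bash
# Hypothetical helper: build the per-deployment hostname from a commit SHA.
# The 8-character prefix and the function name are illustrative assumptions.
deployment_host() {
    local commit_sha="$1"
    local base_domain="$2"
    local short_sha
    # Lowercase and shorten the SHA so it forms a compact, valid DNS label
    short_sha=$(printf '%s' "$commit_sha" | tr 'A-F' 'a-f' | cut -c1-8)
    printf '%s.%s\n' "$short_sha" "$base_domain"
}

# Usage: fall back to a placeholder SHA when run outside GitLab CI
deployment_host "${CI_COMMIT_SHA:-3f2a9c1d7b6e4f5a}" "example.com"
```

With a wildcard DNS record for the base domain, every hostname produced this way resolves to the proxy, which then routes it to the matching container.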

Comparison

To accurately compare Traefik and Nginx, I designed a controlled experiment using:

  1. VPS Environments: Tests were conducted on both low-spec (1 vCPU, 1 GB RAM) and high-spec (4 vCPU, 8 GB RAM) VPS instances
  2. Docker Containers: Each test environment used containerized applications and reverse proxies
  3. GitLab CI/CD Pipeline: Automated deployment and rollback processes to ensure consistency
  4. DNS Wildcard Configuration: Setup that allowed access to each deployment via unique subdomains
  5. Measurement Scripts: Custom scripts that measured rollback execution time with millisecond precision

The test pipeline included these key steps:

  • Prepare environment and export scripts to the server
  • Build and deploy frontend and backend Docker containers
  • Switch the root domain to point to the new deployment
  • Initiate rollback and measure execution time
  • Collect and analyze performance data
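
The final step, analyzing the collected timings, could look roughly like the sketch below. It assumes the measurement scripts write one millisecond value per line; the function name `summarise_ms` is hypothetical, and the numbers in the usage line are made-up sample values, not results from the study.

```shell
#!/bin/bash
# Hypothetical helper: summarise rollback timings read from stdin,
# one millisecond value per line.
summarise_ms() {
    awk '
        {
            v = $1 + 0                      # force numeric interpretation
            sum += v
            if (NR == 1 || v < min) min = v
            if (NR == 1 || v > max) max = v
        }
        END {
            if (NR == 0) { print "no samples"; exit 1 }
            printf "n=%d min=%d max=%d mean=%.1f\n", NR, min, max, sum / NR
        }
    '
}

# Usage with invented sample values (not measurements from the experiment)
printf '%s\n' 120 95 143 | summarise_ms
```

Reporting the minimum and maximum next to the mean helps show whether a single slow run is skewing the average.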

Architecture Comparison

Traefik Implementation

Traefik is designed specifically for microservices architectures and offers container-aware routing through Docker labels. Its dynamic configuration makes it particularly easy to set up.

yaml
# Example docker-compose.yml for Traefik
version: '3'
services:
  traefik:
    image: traefik:v2.5
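    command:
      - '--providers.docker=true'
      - '--providers.docker.exposedByDefault=false'
      - '--entrypoints.web.address=:80'
      # Port 443 is published below, so it needs an entrypoint as well
      - '--entrypoints.websecure.address=:443'
    ports:
      - '80:80'
      - '443:443'
    volumes:
      # Read-only access to the Docker socket is enough for container discovery
      - /var/run/docker.sock:/var/run/docker.sock:ro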

  app:
    image: myapp:latest
    labels:
      - 'traefik.enable=true'
      - 'traefik.http.routers.app.rule=Host(`app.example.com`)'
      - 'traefik.http.services.app.loadbalancer.server.port=80'
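
Because the routing lives entirely in labels, a pipeline can generate them per deployment. The sketch below prints the same three labels used in the compose file above for an arbitrary router name and hostname. The function `traefik_labels` and the `app-<sha>` naming are assumptions for illustration, not part of the original setup.

```shell
#!/bin/bash
# Hypothetical helper: print the Traefik v2 labels for one deployment.
# Arguments: router/service name, hostname, container port (default 80).
traefik_labels() {
    local name="$1"
    local host="$2"
    local port="${3:-80}"
    printf 'traefik.enable=true\n'
    # Backticks are literal here because the format string is single-quoted
    printf 'traefik.http.routers.%s.rule=Host(`%s`)\n' "$name" "$host"
    printf 'traefik.http.services.%s.loadbalancer.server.port=%s\n' "$name" "$port"
}

# Usage: labels for a deployment reachable on its own subdomain
traefik_labels "app-3f2a9c1d" "3f2a9c1d.example.com"
```

Each router needs a unique name, so deriving it from the commit SHA keeps several versions routable side by side.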

Nginx Implementation

Nginx required a more manual configuration approach but offered greater performance and flexibility:

nginx
# Example Nginx configuration for blue-green deployment
upstream blue {
    server blue-app:80;
}

upstream green {
    server green-app:80;
}

server {
    listen 80;
    server_name example.com;

    # Production traffic goes to current active environment (green or blue)
    location / {
        proxy_pass http://green;
    }
}

# Individual version access
server {
    listen 80;
    server_name blue.example.com;

    location / {
        proxy_pass http://blue;
    }
}

server {
    listen 80;
    server_name green.example.com;

    location / {
        proxy_pass http://green;
    }
}
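
Switching production traffic in this layout means changing only the `proxy_pass` target of the root server block. A minimal sketch of rendering that block for a chosen environment follows; the function name `render_root_server` is hypothetical, and writing the output to a config file and reloading Nginx are left to the caller.

```shell
#!/bin/bash
# Hypothetical helper: print the root-domain server block for the
# environment that should receive production traffic.
render_root_server() {
    local active="$1"
    # Only the two upstreams defined in the configuration are valid targets
    case "$active" in
        blue|green) ;;
        *) echo "unknown environment: $active" >&2; return 1 ;;
    esac
    cat <<EOF
server {
    listen 80;
    server_name example.com;

    location / {
        proxy_pass http://$active;
    }
}
EOF
}

# Usage: print the block that routes production traffic to green
render_root_server green
```

Rejecting unknown names up front avoids writing a configuration that refers to an upstream Nginx does not know about.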

Key Implementation Differences

1. Configuration Approach

The most significant difference between Traefik and Nginx was in their configuration approaches:

Traefik:

  • Uses Docker labels for dynamic configuration
  • Automatically detects container changes
  • No need for manual configuration file updates
  • Built-in Let's Encrypt integration for SSL certificates

Nginx:

  • Requires separate configuration files
  • Configuration updates need explicit reloads
  • More precise control over routing logic
  • Manual SSL certificate management

2. Rollback Implementation

The rollback process differed significantly between the two solutions:

Traefik Rollback Process:

bash
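#!/bin/bash
# Simplified rollback script for Traefik
set -euo pipefail

# Nanosecond timestamps (%N) require GNU date
START_TIME=$(date +%s%N)

# Update root domain to point to previous version.
# `rm --stop` is used instead of `down`: older docker-compose releases do not
# accept a service name for `down`, and `down -v` would also delete named volumes.
docker-compose -f docker-compose-traefik.yml rm --stop --force app-frontend
docker-compose -f docker-compose-traefik.yml up -d app-frontend-previous

END_TIME=$(date +%s%N)
ELAPSED_TIME=$(( (END_TIME - START_TIME) / 1000000 ))
echo "Traefik rollback completed in $ELAPSED_TIME ms"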

Nginx Rollback Process:

bash
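#!/bin/bash
# Simplified rollback script for Nginx
set -euo pipefail

# Nanosecond timestamps (%N) require GNU date
START_TIME=$(date +%s%N)

# Update nginx configuration to route to previous version.
# Assumes this directory is bind-mounted into the nginx-proxy container.
# The quoted delimiter keeps the shell from expanding anything in the config.
cat > /etc/nginx/conf.d/app.conf << 'EOF'
upstream active {
  server app-previous:80;
}
EOF

# Validate the new configuration, then reload Nginx
docker exec nginx-proxy nginx -t
docker exec nginx-proxy nginx -s reload

END_TIME=$(date +%s%N)
ELAPSED_TIME=$(( (END_TIME - START_TIME) / 1000000 ))
echo "Nginx rollback completed in $ELAPSED_TIME ms"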

3. Performance Monitoring

Both implementations included health checks and performance monitoring:

  • Each deployment version had a /health endpoint
  • Automated health checks triggered rollbacks when needed
  • All rollback events were logged with timing information
  • Prometheus metrics were collected for long-term analysis
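
The health-check trigger described above can be sketched as a small retry loop. This is an illustration under assumptions, not the project's actual monitoring code: the function `check_or_rollback`, the URL, the script path `./rollback.sh`, and the retry counts are all hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: run a health check up to N times and invoke the
# rollback command only if every attempt fails.
# Arguments: check command, rollback command, attempts, delay in seconds.
check_or_rollback() {
    local check_cmd="$1"
    local rollback_cmd="$2"
    local attempts="${3:-3}"
    local delay="${4:-1}"
    local i=1
    while [ "$i" -le "$attempts" ]; do
        if "$check_cmd"; then
            return 0                      # healthy: nothing to do
        fi
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"                # wait before the next attempt
        fi
        i=$((i + 1))
    done
    echo "health check failed $attempts times, rolling back" >&2
    "$rollback_cmd"
}

# Example wiring (assumed URL and script path, defined but not executed here)
health_check() { curl --fail --silent --max-time 2 "https://example.com/health" > /dev/null; }
run_rollback() { ./rollback.sh; }
# check_or_rollback health_check run_rollback 3 1
```

Requiring several consecutive failures before rolling back keeps a single slow response from triggering an unnecessary switch.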

Conclusion

The architectural approach outlined in this article, which maintains multiple deployment versions on unique subdomains and performs instant rollbacks through proxy configuration updates, provides significant benefits regardless of which reverse proxy is chosen:

  • Zero-downtime deployments across all releases
  • Recovery times measured in milliseconds rather than minutes
  • Improved deployment confidence leading to more frequent releases
  • Enhanced user experience through elimination of service interruptions

If you're interested in learning more about this or have questions about implementing similar architectures, feel free to reach out through the contact details on my About page.
