Engineering Zero-Downtime Deployments: The Hemiron Rollback System

May 18, 2025

DevOps, Reverse Proxy, Blue-Green Deployment, Nginx, Traefik, Zero-Downtime, Docker, GitLab CI/CD

A Performance Study of Traefik vs Nginx in Blue-Green Deployment Strategies

In modern web applications, minimizing downtime during deployments is crucial. As part of the Hemiron project at Hogeschool Leiden, I conducted a comprehensive performance comparison between Traefik and Nginx for implementing instant rollback features in blue-green deployment strategies.

This article compares Traefik and Nginx in the context of blue-green deployments, covering the architecture of each setup and the key insights for implementing reliable rollback systems.

Understanding Blue-Green Deployments

In blue-green deployments, two identical environments (blue and green) exist simultaneously, with only one environment serving production traffic at any time. When a new version is deployed, it's first released to the inactive environment. Once verified, traffic is redirected to this environment, minimizing downtime and risk.

The key innovation was implementing a system where each deployment receives a unique subdomain (e.g., commit-sha.example.com), while the main domain (example.com) points to the current production environment. This setup allows for instant rollbacks by simply updating the DNS or proxy configuration to point to a previous, still-running version.
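
As a rough illustration of this naming scheme, the sketch below derives a per-deployment hostname from a commit SHA. The helper name `deployment_host`, the 8-character truncation, and the fallback SHA are assumptions made for this example, not details taken from the Hemiron pipeline. `CI_COMMIT_SHA` is GitLab's predefined variable holding the full commit hash.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: build the per-deployment hostname from a commit SHA.
# Usage: deployment_host <commit-sha> <base-domain>
deployment_host() {
    local sha=$1
    local base_domain=$2
    local short_sha
    # Keep the first 8 characters and lowercase them so the label is DNS-safe.
    short_sha=$(printf '%.8s' "$sha" | tr 'A-Z' 'a-z')
    printf '%s.%s\n' "$short_sha" "$base_domain"
}

# Outside GitLab CI the variable is unset, so a dummy SHA is used instead.
deployment_host "${CI_COMMIT_SHA:-0123abcd4567ef89}" "example.com"
```

With a wildcard DNS record for the base domain, every hostname produced this way resolves to the proxy, which then decides which container answers.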

Comparison

To accurately compare Traefik and Nginx, I designed a controlled experiment using:

VPS Environments: Tests were conducted on both low-spec (1 vCPU, 1 GB RAM) and high-spec (4 vCPU, 8 GB RAM) VPS instances
Docker Containers: Each test environment used containerized applications and reverse proxies
GitLab CI/CD Pipeline: Automated deployment and rollback processes to ensure consistency
DNS Wildcard Configuration: Setup that allowed access to each deployment via unique subdomains
Measurement Scripts: Custom scripts that measured rollback execution time with millisecond precision
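
A measurement script of this kind can be quite small. The following is a minimal sketch, assuming GNU `date` (its `%N` nanosecond field is not available in BSD or macOS `date`); the function name `time_ms` is hypothetical and `sleep` stands in for a real rollback command.

```shell
#!/usr/bin/env bash
set -eu

# Run a command and print how long it took, in whole milliseconds.
# The command's own stdout is discarded so only the number is printed.
time_ms() {
    local start_ns end_ns
    start_ns=$(date +%s%N)          # nanoseconds since the epoch (GNU date)
    "$@" > /dev/null
    end_ns=$(date +%s%N)
    echo $(( (end_ns - start_ns) / 1000000 ))
}

# Stand-in workload; a real run would pass the rollback script instead.
elapsed=$(time_ms sleep 0.2)
echo "command took ${elapsed} ms"
```

Wrapping the command this way keeps the timing logic identical for both proxies, so any difference in the numbers comes from the rollback itself.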

The test pipeline included these key steps:

Prepare environment and export scripts to the server
Build and deploy frontend and backend Docker containers
Switch the root domain to point to the new deployment
Initiate rollback and measure execution time
Collect and analyze performance data
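
For the final step, one possible way to condense a series of rollback timings into summary statistics is sketched below. The function name `summarize_ms` is hypothetical, and the sample values are placeholders rather than measurements from this study.

```shell
#!/usr/bin/env bash
set -eu

# Read one millisecond value per line on stdin and print count, minimum,
# median, mean, and maximum. Exits non-zero when no samples are given.
summarize_ms() {
    sort -n | awk '
        { v[NR] = $1; sum += $1 }
        END {
            if (NR == 0) { print "no samples"; exit 1 }
            median = (NR % 2) ? v[(NR + 1) / 2] : (v[NR / 2] + v[NR / 2 + 1]) / 2
            printf "n=%d min=%d median=%.1f mean=%.1f max=%d\n", NR, v[1], median, sum / NR, v[NR]
        }'
}

# Placeholder samples only; real input would be the logged rollback times.
printf '%s\n' 42 38 55 41 | summarize_ms
```

Reporting the median next to the mean helps when a single slow run, for example a cold container start, would otherwise skew the average.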

Architecture Comparison

Traefik Implementation

Traefik is designed specifically for microservices architectures and offers container-aware routing through Docker labels. Its dynamic configuration makes it particularly easy to set up.

yaml

# Example docker-compose.yml for Traefik
version: '3'
services:
  traefik:
    image: traefik:v2.5
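    command:
      - '--providers.docker=true'
      - '--providers.docker.exposedByDefault=false'
      - '--entrypoints.web.address=:80'
      # Port 443 is published below, so it needs an entrypoint as well
      - '--entrypoints.websecure.address=:443'
    ports:
      - '80:80'
      - '443:443'
    volumes:
      # Read-only access to the Docker socket is enough for label discovery
      - /var/run/docker.sock:/var/run/docker.sock:ro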

  app:
    image: myapp:latest
    labels:
      - 'traefik.enable=true'
      - 'traefik.http.routers.app.rule=Host(`app.example.com`)'
      - 'traefik.http.services.app.loadbalancer.server.port=80'

Nginx Implementation

Nginx required a more manual configuration approach but offered greater performance and flexibility:

nginx

# Example Nginx configuration for blue-green deployment
upstream blue {
    server blue-app:80;
}

upstream green {
    server green-app:80;
}

server {
    listen 80;
    server_name example.com;

    # Production traffic goes to current active environment (green or blue)
    location / {
        proxy_pass http://green;
    }
}

# Individual version access
server {
    listen 80;
    server_name blue.example.com;

    location / {
        proxy_pass http://blue;
    }
}

server {
    listen 80;
    server_name green.example.com;

    location / {
        proxy_pass http://green;
    }
}

Key Implementation Differences

1. Configuration Approach

The most significant difference between Traefik and Nginx was in their configuration approaches:

Traefik:

Uses Docker labels for dynamic configuration
Automatically detects container changes
No need for manual configuration file updates
Built-in Let's Encrypt integration for SSL certificates

Nginx:

Requires separate configuration files
Configuration updates need explicit reloads
More precise control over routing logic
Manual SSL certificate management

2. Rollback Implementation

The rollback process differed significantly between the two solutions:

Traefik Rollback Process:

bash

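#!/bin/bash
# Simplified rollback script for Traefik
set -euo pipefail

START_TIME=$(date +%s%N)   # nanoseconds since the epoch (GNU date)

# Update root domain to point to previous version.
# In docker-compose v1, `down` does not accept a service name and tears down
# the whole project (with -v it also deletes volumes), so stop and remove
# only the current frontend service instead.
docker-compose -f docker-compose-traefik.yml stop app-frontend
docker-compose -f docker-compose-traefik.yml rm -f app-frontend
docker-compose -f docker-compose-traefik.yml up -d app-frontend-previous

END_TIME=$(date +%s%N)
ELAPSED_TIME=$(( (END_TIME - START_TIME) / 1000000 ))   # ns -> ms
echo "Traefik rollback completed in $ELAPSED_TIME ms"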

Nginx Rollback Process:

bash

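#!/bin/bash
# Simplified rollback script for Nginx
set -euo pipefail

START_TIME=$(date +%s%N)   # nanoseconds since the epoch (GNU date)

# Update nginx configuration to route to previous version
# (assumes /etc/nginx/conf.d is bind-mounted into the nginx-proxy container)
cat > /etc/nginx/conf.d/app.conf << 'EOF'
upstream active {
  server app-previous:80;
}
EOF

# Reload Nginx configuration. `nginx -s reload` only signals the master
# process, so the measured time covers issuing the reload, not worker restart.
docker exec nginx-proxy nginx -s reload

END_TIME=$(date +%s%N)
ELAPSED_TIME=$(( (END_TIME - START_TIME) / 1000000 ))   # ns -> ms
echo "Nginx rollback completed in $ELAPSED_TIME ms"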
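
One detail the Nginx script above leaves open is what happens if a reload coincides with a half-written config file. A possible hardening, sketched here under the assumption that the config directory (not the single file) is mounted into the proxy container, is to write to a temporary file and rename it into place. The function name `write_upstream_conf` is hypothetical, and the reload step is left out so the sketch runs without Docker.

```shell
#!/usr/bin/env bash
set -eu

# Write an upstream block for the given backend to a temporary file in the
# same directory, then rename it over the target. A rename within one
# filesystem is atomic, so readers see either the old or the new file.
write_upstream_conf() {
    local backend=$1
    local conf_path=$2
    local tmp
    tmp=$(mktemp "${conf_path}.XXXXXX")
    printf 'upstream active {\n    server %s;\n}\n' "$backend" > "$tmp"
    chmod 644 "$tmp"            # mktemp creates the file with mode 600
    mv "$tmp" "$conf_path"
}

# Demonstration in a throwaway directory instead of /etc/nginx/conf.d.
conf_dir=$(mktemp -d)
write_upstream_conf "app-previous:80" "$conf_dir/app.conf"
cat "$conf_dir/app.conf"
rm -r "$conf_dir"
```

Because a rename replaces the file's inode, this pattern does not play well with a bind mount of the single file; mounting the directory avoids that problem.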

3. Performance Monitoring

Both implementations included health checks and performance monitoring:

Each deployment version had a /health endpoint
Automated health checks triggered rollbacks when needed
All rollback events were logged with timing information
Prometheus metrics were collected for long-term analysis
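
The health-check trigger could look roughly like the sketch below: a probe is retried a fixed number of times, and a rollback is requested only if every attempt fails. The function name `needs_rollback`, the retry count, and the delay are assumptions for illustration. The probe is passed in as a command so the sketch runs without a live server; here `false` simulates a failing endpoint.

```shell
#!/usr/bin/env bash
set -eu

# Usage: needs_rollback <max_failures> <delay_seconds> <probe command...>
# Returns 0 (rollback needed) if the probe fails max_failures times in a row,
# and 1 as soon as a single attempt succeeds.
needs_rollback() {
    local max_failures=$1
    local delay=$2
    local failures=0
    shift 2
    while [ "$failures" -lt "$max_failures" ]; do
        if "$@" > /dev/null 2>&1; then
            return 1
        fi
        failures=$((failures + 1))
        # Wait between attempts, but not after the last one.
        if [ "$failures" -lt "$max_failures" ]; then
            sleep "$delay"
        fi
    done
    return 0
}

# A real probe might be: curl -fsS --max-time 2 https://example.com/health
if needs_rollback 3 0 false; then
    echo "health check failed 3 times: rollback would run here"
fi
```

Requiring several consecutive failures keeps a single slow response from triggering an unnecessary rollback.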

Conclusion

The architectural approach outlined in this article (maintaining multiple deployment versions with unique subdomains and implementing instant rollbacks through proxy configuration updates) provides significant benefits regardless of which reverse proxy is chosen:

Zero-downtime deployments across all releases
Recovery times measured in milliseconds rather than minutes
Improved deployment confidence leading to more frequent releases
Enhanced user experience through elimination of service interruptions

If you're interested in learning more about this or have questions about implementing similar architectures, feel free to reach out through the contact details on my About page.