Argo Rollouts v1.2

Jesse Suen
Argo Project
Mar 22, 2022 · 4 min read

The Argo team is happy to announce the general availability of Argo Rollouts v1.2. Argo Rollouts is a Kubernetes progressive delivery operator providing advanced blue-green and canary deployment strategies with automated promotions, rollbacks, and metric analysis. Read on for some of the highlights of the release.

Analysis Dry Run

One of the more challenging aspects of adopting progressive delivery is building sufficient trust in your metrics to allow them to block an update. Every service is different and may have its own unique requirements. For example, the latency tolerances of one application may be very different from another's, or may even change over time. It typically takes practice and fine-tuning of your analysis queries before they can be trusted to abort an update.

To help ease this transition, an analysis “dry run” option has been introduced. During an update, an analysis metric marked to run in dry-run mode executes normally, but a failure of that metric will not cause the rollout to abort. Users can still see whether the analysis passed or failed, so they can tweak the query before the next update or switch to a wet run. To use the dry run feature, specify a list of metric names or a regular expression in the dryRun field:

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: dry-run-rollout
spec:
  strategy:
    canary:
      steps:
      - analysis:
          templates:
          - templateName: success-rate
          dryRun:
          - metricName: .*

This allows you to practice and fine-tune your analysis in a real-world environment, before actually letting it gate your updates.
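For finer control, the same field can list individual metric names rather than a pattern, so only the metrics you are still tuning run in dry-run mode while the rest remain enforcing. A minimal sketch (the metric names below are hypothetical):

```yaml
# Only the named metrics run in dry-run mode; any other metric in the
# analysis can still abort the rollout. Names here are illustrative.
dryRun:
- metricName: error-rate
- metricName: p99-latency
```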

Weighted Experiment Steps

Ever since the Experiment feature was first introduced, it has been possible to run Experiments as a step in the rollout. Experiments are helpful for some advanced canarying techniques where traffic splitting is not sufficient (e.g., A/B testing or baseline/canary analysis). However, this feature never leveraged the fine-grained traffic weight splitting capabilities of service meshes and ingress controllers, which limited its usefulness to just the “basic” canary strategy. Now with Rollouts v1.2, it is possible to specify a weight percentage in an experiment step:

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: weighted-experiment-rollout
spec:
  strategy:
    canary:
      steps:
      - experiment:
          duration: 1h
          templates:
          - name: baseline
            specRef: stable
            weight: 2
          - name: canary
            specRef: canary
            weight: 2

In the above example, during the update, the rollout will run a one hour experiment, where 2% of the traffic will go to a new canary pod, another 2% will go to a new baseline pod, and the remaining 96% will go to the existing stable pods. This feature makes it possible to run smaller, safer experiments, decreasing the blast radius of a potential bad update.
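The arithmetic behind that split is simple subtraction: whatever the experiment templates claim comes out of the stable pods' share. A quick sketch in plain Python (not part of the Rollouts API, just illustrating the math):

```python
def stable_weight(template_weights):
    """Return the percentage of traffic left on the stable pods after the
    experiment templates take their weights (all values are percentages)."""
    shifted = sum(template_weights)
    if not 0 <= shifted <= 100:
        raise ValueError("experiment weights must total between 0 and 100")
    return 100 - shifted

# Two templates at 2% each (baseline + canary) leave 96% on stable.
print(stable_weight([2, 2]))  # 96
```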

Ping-Pong Service Management

Rollout users using the AWS LoadBalancer controller and its Pod readiness gates feature may have noticed an issue with pod readiness gates not being injected as expected. This is due to the way the AWS LoadBalancer controller decides to inject readiness gates, based on Service selector labels at the time of pod creation. In short, modifying a Service selector label to point to a different pod that is already running does not allow the proper readiness gates to be injected. Since Rollouts works by modifying the “stable” selector after a successful update, this prevents readiness gates from being injected. Note that this was also the motivation for the target group verification feature introduced in the previous Rollouts v1.1 release.

With Rollouts v1.2, we are providing an alternative method of Service management, named “Ping-Pong” service management, which is compatible with the way the AWS LoadBalancer controller works. With this approach, it is no longer necessary to specify a “canary” and “stable” service; you specify a “ping” and “pong” service instead:

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: ping-pong-rollout
spec:
  strategy:
    canary:
      pingPong:
        pingService: ping-service
        pongService: pong-service
      trafficRouting:
        alb:
          ingress: alb-ingress
          servicePort: 80

During updates, the rollout controller alternates which service acts as the canary (ping vs. pong) and modifies service selectors before creating the new canary pods. On every update, traffic shifts back and forth between the ping and pong services (hence the name ping-pong), and the role of canary alternates between them. Modifying service selectors before pod creation ensures the proper pod readiness gates are injected by the AWS LoadBalancer controller and helps ensure zero-downtime updates with ALBs.
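Conceptually, ping and pong are two ordinary Services whose selectors the controller rewrites to target one ReplicaSet's pods or the other's. A minimal sketch, where the `rollouts-pod-template-hash` values are hypothetical placeholders for the hashes the controller actually manages:

```yaml
# ping-service currently selects the stable ReplicaSet's pods.
apiVersion: v1
kind: Service
metadata:
  name: ping-service
spec:
  selector:
    app: myapp
    rollouts-pod-template-hash: 6f65b9c  # hypothetical stable hash
  ports:
  - port: 80
---
# During an update, pong-service is pointed at the new (canary) hash
# before the canary pods exist, so readiness gates inject correctly.
apiVersion: v1
kind: Service
metadata:
  name: pong-service
spec:
  selector:
    app: myapp
    rollouts-pod-template-hash: 7d8a4f1  # hypothetical canary hash
  ports:
  - port: 80
```

On the next update the roles swap: pong becomes the stable side and ping receives the new hash.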

Other Notable Features

This release also includes a slew of other significant improvements, including:

  • AWS AppMesh Traffic Routing Support
  • High Availability (active-passive) using Kubernetes leader election
  • Support for multiple/simultaneous traffic providers
  • Customizable Metric Retention
  • Support for PUT/POST in web metric providers
  • Additional metadata from analysis providers (i.e., for debugging)
  • Scalability & performance improvements

Ready to get started with Rollouts v1.2? Download the installation manifests from our GitHub v1.2 release page.

