Argo Rollouts v1.2

Jesse Suen
Argo Project
Mar 22, 2022 · 4 min read

The Argo team is happy to announce the general availability of Argo Rollouts v1.2. Argo Rollouts is a Kubernetes progressive delivery operator providing advanced blue-green and canary deployment strategies with automated promotions, rollbacks, and metric analysis. Read on for some of the highlights of the release.

Analysis Dry Run

One of the more challenging aspects of adopting progressive delivery is building sufficient trust in your metrics to allow them to block an update. Every service is different and may have its own unique requirements. For example, the latency tolerances of one application may be very different from another's, or may even change over time. It typically takes practice and fine-tuning of your analysis queries before they can be trusted to abort an update.

To help ease this transition, an analysis “dry run” option has been introduced. During an update, an analysis metric marked to run in dry-run mode executes normally, but a failure of that metric will not cause the rollout to abort. Users can still see whether the analysis passed or failed, so they can tweak the query before the next update or switch to a wet run. To use the dry run feature, specify a list of metric names or a regular expression in the dryRun field:

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: dry-run-rollout
spec:
  strategy:
    canary:
      steps:
      - analysis:
          templates:
          - templateName: success-rate
          dryRun:
          - metricName: .*

This allows you to practice and fine-tune your analysis in a real-world environment, before actually letting it gate your updates.
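For finer control, the same field can list individual metric names rather than a pattern, so only the metrics you are still tuning run in dry-run mode while the rest remain enforcing. A minimal sketch (the metric names below are hypothetical):

```yaml
# Only the named metrics run in dry-run mode; any other metric in the
# analysis can still abort the rollout. Names here are illustrative.
dryRun:
- metricName: error-rate
- metricName: p99-latency
```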

Weighted Experiment Steps

Ever since the Experiment feature was first introduced, it has been possible to run Experiments as a step in the rollout. Experiments are helpful for some advanced canarying techniques where traffic splitting is not sufficient (e.g., A/B testing or baseline/canary analysis). However, this feature never leveraged the fine-grained traffic weight splitting capabilities of service meshes and ingress controllers, which limited its usefulness to just the “basic” canary strategy. Now with Rollouts v1.2, it is possible to specify a weight percentage in an experiment step:

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: weighted-experiment-rollout
spec:
  strategy:
    canary:
      steps:
      - experiment:
          duration: 1h
          templates:
          - name: baseline
            specRef: stable
            weight: 2
          - name: canary
            specRef: canary
            weight: 2

In the above example, during the update, the rollout will run a one hour experiment, where 2% of the traffic will go to a new canary pod, another 2% will go to a new baseline pod, and the remaining 96% will go to the existing stable pods. This feature makes it possible to run smaller, safer experiments, decreasing the blast radius of a potential bad update.
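The arithmetic behind that split is simple subtraction: whatever the experiment templates claim comes out of the stable pods' share. A quick sketch in plain Python (not part of the Rollouts API, just illustrating the math):

```python
def stable_weight(template_weights):
    """Return the percentage of traffic left on the stable pods after the
    experiment templates take their weights (all values are percentages)."""
    shifted = sum(template_weights)
    if not 0 <= shifted <= 100:
        raise ValueError("experiment weights must total between 0 and 100")
    return 100 - shifted

# Two templates at 2% each (baseline + canary) leave 96% on stable.
print(stable_weight([2, 2]))  # 96
```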

Ping-Pong Service Management

Rollout users using the AWS LoadBalancer controller and its Pod readiness gates feature may have noticed an issue with pod readiness gates not being injected as expected. This is due to the way the AWS LoadBalancer controller decides to inject readiness gates, based on Service selector labels at the time of pod creation. In short, modifying a Service selector label to point to a different pod that is already running does not allow the proper readiness gates to be injected. Since Rollouts works by modifying the “stable” selector after a successful update, this prevents readiness gates from being injected. Note that this was also the motivation for the target group verification feature introduced in the previous Rollouts v1.1 release.

With Rollouts v1.2, we are providing an alternative method of Service management, named “Ping-Pong” service management, which is compatible with the way the AWS LoadBalancer controller works. With this approach, it is no longer necessary to specify a “canary” and “stable” service; you specify a “ping” and “pong” service instead:

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: ping-pong-rollout
spec:
  strategy:
    canary:
      pingPong:
        pingService: ping-service
        pongService: pong-service
      trafficRouting:
        alb:
          ingress: alb-ingress
          servicePort: 80

During updates, the rollout controller alternates which service acts as the canary (ping vs. pong) and modifies service selectors before creating the new canary pods. On every update, traffic shifts back and forth between the ping and pong services (hence the name ping-pong), and the role of canary alternates between them. Modifying service selectors before pod creation ensures the proper pod readiness gates are injected by the AWS LoadBalancer controller and helps ensure zero-downtime updates with ALBs.
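Conceptually, ping and pong are two ordinary Services whose selectors the controller rewrites to target one ReplicaSet's pods or the other's. A minimal sketch, where the `rollouts-pod-template-hash` values are hypothetical placeholders for the hashes the controller actually manages:

```yaml
# ping-service currently selects the stable ReplicaSet's pods.
apiVersion: v1
kind: Service
metadata:
  name: ping-service
spec:
  selector:
    app: myapp
    rollouts-pod-template-hash: 6f65b9c  # hypothetical stable hash
  ports:
  - port: 80
---
# During an update, pong-service is pointed at the new (canary) hash
# before the canary pods exist, so readiness gates inject correctly.
apiVersion: v1
kind: Service
metadata:
  name: pong-service
spec:
  selector:
    app: myapp
    rollouts-pod-template-hash: 7d8a4f1  # hypothetical canary hash
  ports:
  - port: 80
```

On the next update the roles swap: pong becomes the stable side and ping receives the new hash.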

Other Notable Features

This release also includes a slew of other significant improvements, including:

  • AWS AppMesh Traffic Routing Support
  • High Availability (active-passive) using Kubernetes leader election
  • Support for multiple/simultaneous traffic providers
  • Customizable Metric Retention
  • Support for PUT/POST in web metric providers
  • Additional metadata from analysis providers (i.e., for debugging)
  • Scalability & performance improvements

Ready to get started with Rollouts v1.2? Download the installation manifests from our GitHub v1.2 release page.

