Argo Workflows v3.0

Alex Collins
Argo Project
6 min read · Jan 28, 2021


We’re incredibly proud of how far Argo Workflows has come since its inception three years ago!

  • 17th Oct 2017 — first commit
  • 6th Feb 2018 — v2.0 rewritten in Go
  • 2nd Sep 2019 — first 1,000 stars
  • 17th Apr 2020 — became a CNCF incubator project
  • 22nd Jan 2021 — 373 contributors, 2k commits, 7.3k stars, 1.3k forks, 5.2k Slack members

With all this behind us — we’re proud to announce Argo Workflows v3.0.

What is Argo Workflows?

Argo Workflows is a cloud-native workflow engine that can run 10,000s of concurrent workflows, each with 1,000s of steps.

What can I use it for?

  • Machine Learning
  • ETL, Data Analytics & Data Science
  • Data processing pipelines
  • Batch processing
  • Serverless
  • CI/CD

Who uses Argo?

Argo is used to “discover new physics” at CERN, for 3D rendering at CoreWeave (on a 1,000-node cluster with 6,000 GPUs), and in Intuit’s Machine Learning and Data Processing platforms. Argo Workflows is actively used in production by well over 100 organizations, including Adobe, Alibaba Cloud, BlackRock, Capital One, Datadog, DataStax, Google, GitHub, IBM, Intuit, NVIDIA, SAP, New Relic, and Red Hat.

Why would I use Argo?

When we asked our users who were using tools like Kubeflow, Apache Airflow, AWS Batch, AWS Lambda, Knative, TektonCD, and Jenkins why they also use Argo, they said they love that it is cloud-native, simple, fast, scalable, and cost-effective.

Big new features every release

In the last 12 months, every release has had major new features.

So what’s new in Argo Workflows v3.0?

  • Major upgrade (20k new lines of code) to the user interface, adding many new features and making it much more robust
  • Brand new APIs for Argo Events
  • Controller High-Availability
  • Key-only artifacts make it easier to perform map-reduce operations
  • Moving the repository
  • Go modules support

Watch the video!

Argo Events API and UI

Argo Workflows v3.0 comes with a new UI that now also supports Argo Events! The UI is also more robust and reliable.

  • New API endpoints for Argo Events
  • New event-flow page
  • Create and edit event sources and sensors, and view their logs, in the UI
  • Embeddable widgets
  • New workflow log viewer
  • Configurable “Get Help” button
  • More configurable link buttons (e.g. for linking into your logging facility)
  • Seamless reconnection on network errors
  • Refactored code to use more robust React functional components

The event-flow page allows users to understand how event sources and sensors are connected together, as well as linking in the workflows created by triggers, and displaying animations whenever a message is seen.

Event-flow

You can create and update event sources and sensors directly in the user interface using the same visual language we use for workflows:

Event Sources

We’ve added some simple widgets you can use to embed the status and progress of a workflow or the latest workflow created by a workflow template or cron workflow:

Widgets

Rather than editing your workflow by hand, you can also submit from a template:

Workflow Creator

The log viewer has been updated to let you view the init and wait containers more easily (helping debug artifact issues). It also allows you to tail the whole workflow:

Log Viewer

If you want to try it yourself, you can take a look around the test environment.

We have an extensive demo video you can watch online from January’s community meeting (starts at 41m):

Controller High-Availability

The v3.0 release introduces a hot-standby workflow controller feature for high availability and quick recovery by leveraging Kubernetes leader election. The default install enables leader election and runs a single pod, which is the leader. Whenever the controller pod crashes, Kubernetes restarts it. To reduce recovery time, you can now run two pods: the second pod is on hot standby and takes over immediately if the leader dies.

kubectl scale deployment/workflow-controller --replicas=2 

Key-Only Artifacts

Argo Workflows v3.0 introduces a default artifact repository reference and key-only artifacts, two new features that work together.

  • Users can configure a default artifact repository for their namespace rather than having to define it explicitly for each workflow.
  • Workflow specifications do not need to provide non-key fields (e.g. bucket, username/password secret key). They can use just the key (hence “key-only”), and the non-key fields will be inherited from the artifact repository.
  • Users can specify the key to reference artifacts globally without using parameterized inputs and outputs.
  • Easier to specify fan-in artifact patterns, simplifying map-reduce style workflows.

As a consequence, we no longer need to replicate non-key elements in manifests, reducing the disk-space needed for workflows.
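As a sketch of how these pieces fit together (the ConfigMap name `artifact-repositories`, the bucket, and the keys below are illustrative assumptions, not taken from the release notes): a namespace can declare a default artifact repository in a ConfigMap, and a workflow step can then declare an output artifact by key alone, inheriting the bucket and credentials:

```yaml
# Namespace-default artifact repository (names and bucket are illustrative).
apiVersion: v1
kind: ConfigMap
metadata:
  name: artifact-repositories
  annotations:
    # Marks which entry is the namespace default.
    workflows.argoproj.io/default-artifact-repository: my-s3-repo
data:
  my-s3-repo: |
    s3:
      bucket: my-bucket
      endpoint: s3.amazonaws.com
---
# A step can now declare a key-only artifact; the bucket and
# credentials are inherited from the repository above.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: key-only-
spec:
  entrypoint: main
  templates:
    - name: main
      container:
        image: alpine:3
        command: [sh, -c, "echo hello > /tmp/out.txt"]
      outputs:
        artifacts:
          - name: out
            path: /tmp/out.txt
            s3:
              key: "{{workflow.name}}/out.txt"  # key only: no bucket or secret here
```

Because every fan-out step can write to a predictable key under the same prefix, a fan-in step only needs that prefix to collect the results — which is what makes map-reduce style workflows simpler.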

New Repository Location

We’ll be renaming the Argo Workflows repository to argo-workflows rather than argo. The new name makes it clear that this is the repo for Argo Workflows and not the overall Argo Project.

GitHub automatically forwards when a repository is renamed, so users should not be significantly impacted.

Go Modules + Go Client v1.19

In 2020, we migrated to Go modules. Unfortunately, migrating to Go modules is a breaking change, and we never completed the work: it was still not possible to go get github.com/argoproj/argo without some hackery. Release v3.0 will fix this.
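Concretely, once the module path is fixed, fetching the library should be a plain go get. The path below assumes the renamed repository; the /v3 suffix follows Go’s semantic import versioning rule for major versions ≥ 2:

```shell
# Assumes the post-rename module path; /v3 is required by Go's
# semantic import versioning for major version 3.
go get github.com/argoproj/argo-workflows/v3
```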

v2.12 Long-term Support

We plan to provide long-term support for v2.12. There will be bug fixes, but no new features, for 6+ months.

What we expect to back-port:

  • Bug fixes.
  • Changes to complete features new in v2.12 (e.g. SSO+RBAC).

We don’t plan to back-port:

  • UI bug fixes that are based on refactoring that is unique to v3.0. But you can run the v3.0 UI with the v2.12 controller.
  • New features.

What’s next?

Argo Workflows v3.1 will contain enhancements to make it easier to write fan-out-fan-in workflows using artifacts, as well as conditional artifacts.

Nothing as big as this is the work of one person, so beyond the core team, we must recognize these major contributors:

  • Daisuke Taniwaki — Preferred Networks
  • Yuan Tang — Ant Group
  • Mark White
  • Daniel Herman
  • Sam Elder — Kebotix
  • Michael Crenshaw — Colaberry/CCRi
  • Xianlu Bird — Aliyun
  • Peter Salanik — CoreWeave
  • J.P. Zivalich — Pipekit
  • Niklas Hansson — Sandvik CODE
  • Antoine Dao — Pollination
  • Clemens Lange — CERN
  • Vaibhav Page — BlackRock
  • Sumit Nagal — Intuit
  • David Breitgand — IBM
