Practical Argo Workflows Hardening

In this post, we’ll cover:
- High-level best practices you should know to secure your workflows.
- The various components that make up Argo, and how to secure those components.
- How to operate and use Argo securely.
Understanding the security model of Argo is key both to building secure platforms on top of it and to running secure workflows.
High-Level Best Practices
Before we jump into the details, here are the high-level things you should be doing:
- Run the latest version so you have the latest security fixes.
- Make sure you have TLS and authentication enabled for the API.
- Limit ingress and egress using Kubernetes network policies.
- Apply the principle of least privilege, double-check you have not over-permissioned.
- Only use the `emissary` executor.
- Use `workflow-controller-configmap` to set the default service accounts and security context.
- Use Open Policy Agent to enforce them.

Argo Server
The Argo Server is the API and UI for Argo Workflows. Depending on configuration, it allows users to view and submit workflows. Misconfiguration can allow unauthenticated users to run workflows, e.g. to mine crypto.
It is the only component that exposes an API and is therefore the primary attack surface. The same recommendations that apply to any API apply here.
- Do you need this component? If not, remove it or scale it to zero. This is the best way to secure any app.
- Enable client and/or SSO authentication. This is the default in recent versions.
- Are you over-permissioning the server? Do you need to be able to submit or change workflows via the UI? If not, remove any write permissions from `role/argo-server-role`.
- Limit inbound and outbound requests from the server using network policies and security groups. Do not leave it open to the Internet.
- Make sure that you have enabled TLS v1.2 encryption, the default in recent versions. This prevents MITM attacks.
- Determine your multi-tenancy model. If you use SSO, use the SSO+RBAC feature to limit users by groups to just their own namespace.
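As a sketch of the network-policy point above (the namespace, labels, and ingress-controller namespace are illustrative; it assumes the default `app: argo-server` pod label and port 2746), ingress to the server could be restricted like this:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: argo-server
  namespace: argo
spec:
  podSelector:
    matchLabels:
      app: argo-server            # default label on the argo-server pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        # illustrative: only allow traffic from the ingress controller's namespace
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - port: 2746              # argo-server's default port
```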
The `argoproj/argocli` image is a scratch image that runs as non-root and has a secure security context out of the box.
Workflow Controller
The workflow controller is primarily responsible for scheduling workflows. It listens to workflows by connecting to the Kubernetes API, and then creates pods based on the workflow’s spec. It also creates workflows based on cron schedules, and it deletes pods on clean-up. Depending on your configuration, it may be set up to do this for an entire cluster, or just a single namespace.
Key point: the workflow controller may be able to create and delete pods in any namespace.
- If you only run workflows in one namespace, install using managed namespace configuration. This requires only namespace roles, not cluster roles.
- The controller only needs access to the Kubernetes API and any workflow archive you’ve configured. Use egress/security groups to limit other access.
- The only endpoint the workflow controller exposes is for Prometheus metrics. Limit ingress to your Prometheus collector.
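For example (assuming the controller's default metrics port 9090, the default `app: workflow-controller` pod label, and an illustrative `prometheus` namespace), a NetworkPolicy limiting metrics ingress might look like:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: workflow-controller-metrics
  namespace: argo
spec:
  podSelector:
    matchLabels:
      app: workflow-controller    # default label on the controller pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: prometheus   # illustrative
      ports:
        - port: 9090              # default metrics port
```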
The `argoproj/workflow-controller` image is a scratch image that runs as non-root and has a secure security context out of the box.
Configure A Secure Workflow Executor
The controller configuration determines the executor used to run the workflow. You should only use the `emissary` executor because:
- The `emissary` executor never speaks to any external service and can run as non-root.
- The `docker` executor is privileged and deprecated.
- The `kubelet` executor speaks to the local Kubelet, opening up a new network route to localhost.
- The `k8sapi` executor speaks to the Kubernetes API, so your pod cannot be network restricted.
- The `pns` executor must run as root, and processes within the pod are not isolated because it uses a shared process namespace.
Read more about sunsetting the Docker executor.
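In versions where the executor is still configurable (≤v3.3), it is selected via the `containerRuntimeExecutor` key in `workflow-controller-configmap`; from v3.4 onwards, emissary is the only executor. A minimal sketch:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-controller-configmap
data:
  containerRuntimeExecutor: emissary
```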
Workflow Pods
Workflow pods are created in the user namespace by the workflow controller and do the work requested by the workflow.
A workflow is like a small Kubernetes app, so you should secure it like an app.
The controller will only create pods in the namespace of the workflow, so the impact is somewhat contained to the workflow’s namespace.
Key point: any user that can create a workflow in a namespace, can also create pods in that namespace.
To secure workflow pods, we need to:
- Enforce an allowable list of images.
- Use a secure security context (run-as-non root).
- Make sure we are not over-permissioning the service account.
Argo does not provide the first two items, because it does not need to. You can achieve this using Open Policy Agent’s Kubernetes Admission Controller.
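As a sketch of how you might enforce an image allow-list with OPA Gatekeeper (the template and parameter names here are illustrative; the gatekeeper-library project has production-ready equivalents), a ConstraintTemplate could look like:

```yaml
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8sallowedrepos
spec:
  crd:
    spec:
      names:
        kind: K8sAllowedRepos
      validation:
        openAPIV3Schema:
          type: object
          properties:
            repos:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sallowedrepos

        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          # violate if the image matches none of the allowed repo prefixes
          satisfied := [ok | repo := input.parameters.repos[_]; ok := startswith(container.image, repo)]
          not any(satisfied)
          msg := sprintf("image %v is not from an allowed repo", [container.image])
        }
```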
Untrusted User Code
What if you don’t allow users to create workflows directly? Instead, you create workflows on their behalf, allowing the user to specify only the code that the workflow runs?
This is still a problem. To understand why, we need to understand that each workflow pod has two containers:
The `wait` container is responsible for:
- Uploading artifacts to your storage (which needs `get secrets`).
- Reporting the output of the pod back to the controller (which uses `pod patch` in ≤v3.2 or `create taskset` in ≥v3.3).
The `main` container runs the user code.
By default, Kubernetes auto-mounts the service account token to all containers in a pod:
- Both containers get the same service account token.
- The user code will be able to patch pods in the user’s namespace (if you have given them that permission).
- If you can patch a pod, you can change the image or command.
Key point: user code in ≤v3.2 can change the image (etc) of any pod in the workflow’s namespace.
This can be worse if you don’t specify a service account. Like any Kubernetes app, you will be using the default service account, and it is easy for that account to become over-permissioned.
Key point: always specify a service account to run your workflow with, and consider deleting the default service account.
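One way to blunt the default service account (a sketch; the namespace is illustrative) is to turn off token auto-mounting on it:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: default
  namespace: my-team    # illustrative user namespace
automountServiceAccountToken: false
```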
It might be fine for your user to edit pods, e.g. if the user can already create pods because you trust the user. If the code running is untrusted, you’ll want to prevent this:
- Disable auto-mount.
- Specify a service account to be used by the workflow.
```
kubectl create sa workflow
kubectl create rolebinding executor --serviceaccount=argo:workflow --role=workflow-role
```
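The `workflow-role` referenced by the role binding isn’t shown; as a sketch, a minimal executor role for ≥v3.3 (where the `wait` container reports outputs via WorkflowTaskResults) might look like this. In ≤v3.2 the executor needed `pods patch` instead:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: workflow-role
  namespace: argo
rules:
  # ≥v3.3: the wait container reports outputs via WorkflowTaskResults
  - apiGroups:
      - argoproj.io
    resources:
      - workflowtaskresults
    verbs:
      - create
      - patch
```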
Then you can run the workflow:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: zero-permissions-
spec:
  entrypoint: main
  automountServiceAccountToken: false
  serviceAccountName: workflow
  templates:
    - name: main
      container:
        image: argoproj/argosay:v2
```
Security Context
By default, the user code runs as root (around 75% of containers do).
Why is this bad? Imagine your container was compromised (e.g. by Log4Shell); the attacker could then do this:
```
$ docker run --rm -ti python sh -c 'apt-get update'
Get:1 http://deb.debian.org/debian bullseye InRelease [116 kB]
…
Fetched 8506 kB in 2s (4930 kB/s)
Reading package lists… Done
# Oh no!
```
But if they were non-root:
```
$ docker run --user=1000 --rm -ti python sh -c 'apt-get update'
...
E: List directory /var/lib/apt/lists/partial is missing. - Acquire (13: Permission denied)
```
You don’t need to run the `wait` container as root, unless you’re using the `pns` executor. You can improve security for both the `main` and `wait` containers as follows:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: security-context-
spec:
  entrypoint: main
  securityContext:
    runAsNonRoot: true
    runAsUser: 8737
  templates:
    - name: main
      container:
        image: argoproj/argosay:v2
```
If we want to drop capabilities and prevent privilege escalation, we can’t set that in the pod security context; we need to set it on the container’s security context:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: security-context-
spec:
  entrypoint: main
  templates:
    - name: main
      container:
        securityContext:
          runAsNonRoot: true
          runAsUser: 8737
          allowPrivilegeEscalation: false
          capabilities:
            drop:
              - ALL
        image: argoproj/argosay:v2
```
Setting Default Secure Configuration
Adding this configuration to every workflow is pretty verbose. In ≥v3.3, you can make it the default in `workflow-controller-configmap`:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-controller-configmap
data:
  workflowDefaults: |
    spec:
      executor:
        serviceAccountName: workflow
      securityContext:
        runAsNonRoot: true
        runAsUser: 8737
        runAsGroup: 8737
      serviceAccountName: workflow
      automountServiceAccountToken: false
  mainContainer: |
    resources:
      requests:
        cpu: 1m
        memory: 64M
      limits:
        cpu: 0.5
        memory: 128Mi
    securityContext:
      runAsNonRoot: true
      runAsUser: 8737
      runAsGroup: 8737
      allowPrivilegeEscalation: false
      capabilities:
        drop:
          - ALL
  executor: |
    imagePullPolicy: IfNotPresent
    resources:
      requests:
        cpu: 0.1
        memory: 64Mi
      limits:
        cpu: 0.5
        memory: 128Mi
    securityContext:
      runAsNonRoot: true
      runAsUser: 8737
      runAsGroup: 8737
      allowPrivilegeEscalation: false
      capabilities:
        drop:
          - ALL
```
Trust, but verify.
Run a workflow, then use `kube-linter` (or whatever Kubernetes security tool you like) on the pod manifests to verify it works:
```
kubectl get pods -l workflows.argoproj.io/workflow -o yaml | kube-linter lint -
```
Note that all these can be overridden by the user for their workflow. So you should still deploy Open Policy Agent.
Conclusion
- Securing a workflow is just like securing a Kubernetes application, so standard best practices, like minimising permissions or running as non-root, apply.
- User code creates a special case where you may need to take action to prevent users from editing pods within the namespace.