Practical Argo Workflows Hardening

In this post, we’ll cover:
- High-level best practices you should know to secure your workflows.
- The various components that make up Argo, and how to secure those components.
- How to operate and use Argo securely.
Understanding the security model of Argo is key both to building secure platforms on top of it and to running secure workflows.
High-Level Best Practices
Before we jump into the details, here are the high-level things you should be doing:
- Run the latest version so you have the latest security fixes.
- Make sure you have TLS and authentication enabled for the API.
- Limit ingress and egress using Kubernetes network policies.
- Apply the principle of least privilege, double-check you have not over-permissioned.
- Only use the `emissary` executor.
- Use `workflow-controller-configmap` to set the default service accounts and security context.
- Use Open Policy Agent to enforce them.

Argo Server
The Argo Server is the API and UI for Argo Workflows. Depending on configuration, it allows users to view and submit workflows. Misconfiguration can allow unauthenticated users to run workflows, e.g. to mine crypto.
It is the only component that exposes an API and is therefore the primary attack surface. The same recommendations that apply to any API apply here.
- Do you need this component? If not, remove it or scale it to zero. This is the best way to secure any app.
- Enable client and/or SSO authentication. This is the default in recent versions.
- Are you over-permissioning the server? Do you need to be able to submit or change workflows via the UI? If not, remove any write permissions from `role/argo-server-role`.
- Limit inbound and outbound requests from the server using network policies and security groups. Do not leave it open to the Internet.
- Make sure that you have enabled TLS v1.2 encryption, the default in recent versions. This prevents MITM attacks.
- Determine your multi-tenancy model. If you use SSO, use the SSO+RBAC feature to limit users by groups to just their own namespace.
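As a sketch of the network-policy point above (the namespace, labels, and ingress-controller namespace are illustrative; it assumes the default `app: argo-server` pod label and port 2746), ingress to the server could be restricted like this:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: argo-server
  namespace: argo
spec:
  podSelector:
    matchLabels:
      app: argo-server            # default label on the argo-server pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        # illustrative: only allow traffic from the ingress controller's namespace
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - port: 2746              # argo-server's default port
```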
The `argoproj/argocli` image is a scratch image that runs as non-root and has a secure security context out of the box.
Workflow Controller
The workflow controller is primarily responsible for scheduling workflows. It listens to workflows by connecting to the Kubernetes API, and then creates pods based on the workflow’s spec. It also creates workflows based on cron schedules, and it deletes pods on clean-up. Depending on your configuration, it may be set up to do this for an entire cluster, or just a single namespace.
Key point: the workflow controller may be able to create and delete pods in any namespace.
- If you only run workflows in one namespace, install using managed namespace configuration. This requires only namespace roles, not cluster roles.
- The controller only needs access to the Kubernetes API and any workflow archive you’ve configured. Use egress/security groups to limit other access.
- The only endpoint the workflow controller exposes is for Prometheus metrics. Limit ingress to your Prometheus collector.
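For example (assuming the controller's default metrics port 9090, the default `app: workflow-controller` pod label, and an illustrative `prometheus` namespace), a NetworkPolicy limiting metrics ingress might look like:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: workflow-controller-metrics
  namespace: argo
spec:
  podSelector:
    matchLabels:
      app: workflow-controller    # default label on the controller pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: prometheus   # illustrative
      ports:
        - port: 9090              # default metrics port
```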
The `argoproj/workflow-controller` image is a scratch image that runs as non-root and has a secure security context out of the box.
Configure A Secure Workflow Executor
The controller configuration determines the executor used to run the workflow. You should only use the `emissary` executor because:
- The `emissary` executor never speaks to any external service and can run as non-root.
- The `docker` executor is privileged and deprecated.
- The `kubelet` executor speaks to the local Kubelet, opening up a new network route to localhost.
- The `k8sapi` executor speaks to the Kubernetes API, so your pod cannot be network restricted.
- The `pns` executor must run as root, and processes within the pod are not isolated because it uses a shared process namespace.
Read more about sunsetting the Docker executor.
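In versions where the executor is still configurable (≤v3.3), it is selected via the `containerRuntimeExecutor` key in `workflow-controller-configmap`; from v3.4 onwards, emissary is the only executor. A minimal sketch:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-controller-configmap
data:
  containerRuntimeExecutor: emissary
```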
Workflow Pods
Workflow pods are created in the user namespace by the workflow controller and do the work requested by the workflow.
A workflow is like a small Kubernetes app, so you should secure it like an app.
The controller will only create pods in the namespace of the workflow, so the impact is somewhat contained to the workflow’s namespace.
Key point: any user that can create a workflow in a namespace, can also create pods in that namespace.
To secure workflow pods, we need to:
- Enforce an allowable list of images.
- Use a secure security context (run-as-non root).
- Make sure we are not over-permissioning the service account.
Argo does not provide the first two items, because it does not need to. You can achieve this using Open Policy Agent’s Kubernetes Admission Controller.
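As a sketch of how you might enforce an image allow-list with OPA Gatekeeper (the template and parameter names here are illustrative; the gatekeeper-library project has production-ready equivalents), a ConstraintTemplate could look like:

```yaml
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8sallowedrepos
spec:
  crd:
    spec:
      names:
        kind: K8sAllowedRepos
      validation:
        openAPIV3Schema:
          type: object
          properties:
            repos:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sallowedrepos

        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          # violate if the image matches none of the allowed repo prefixes
          satisfied := [ok | repo := input.parameters.repos[_]; ok := startswith(container.image, repo)]
          not any(satisfied)
          msg := sprintf("image %v is not from an allowed repo", [container.image])
        }
```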
Untrusted User Code
What if you don’t allow users to create workflows directly? Instead, you create workflows on their behalf, allowing the user to specify only the code that the workflow runs?
This is still a problem. To understand why, we need to understand that each workflow pod has two containers:
The `wait` container is responsible for:
- Uploading artifacts to your storage (which needs `get secrets`).
- Reporting the output of the pod back to the controller (which uses `pod patch` in ≤v3.2 or `create taskset` in ≥v3.3).
The `main` container runs the user code.
By default, Kubernetes auto-mounts the service account token to all containers in a pod:
- Both containers get the same service account token.
- The user code will be able to patch pods in the user’s namespace (if you have given them that permission).
- If you can patch a pod, you can change the image or command.
Key point: user code in ≤v3.2 can change the image (etc) of any pod in the workflow’s namespace.
This can be worse if you don’t specify a service account. Like any Kubernetes app, you will be using the default service account, and it is easy for that account to become over-permissioned.
Key point: always specify a service account to run your workflow with, and consider deleting the default service account.
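One way to blunt the default service account (a sketch; the namespace is illustrative) is to turn off token auto-mounting on it:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: default
  namespace: my-team    # illustrative user namespace
automountServiceAccountToken: false
```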
It might be fine for your user to edit pods, e.g. if the user can already create pods because you trust the user. If the code running is untrusted, you’ll want to prevent this:
- Disable auto-mount.
- Specify a service account to be used by the workflow.
```
kubectl create sa workflow
kubectl create rolebinding executor --serviceaccount=argo:workflow --role=workflow-role
```
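The `workflow-role` referenced by the role binding isn’t shown; as a sketch, a minimal executor role for ≥v3.3 (where the `wait` container reports outputs via WorkflowTaskResults) might look like this. In ≤v3.2 the executor needed `pods patch` instead:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: workflow-role
  namespace: argo
rules:
  # ≥v3.3: the wait container reports outputs via WorkflowTaskResults
  - apiGroups:
      - argoproj.io
    resources:
      - workflowtaskresults
    verbs:
      - create
      - patch
```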
Then you can run the workflow:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: zero-permissions-
spec:
  entrypoint: main
  automountServiceAccountToken: false
  serviceAccountName: workflow
  templates:
    - name: main
      container:
        image: argoproj/argosay:v2
```
Security Context
By default, the user code runs as root (around 75% of containers do).
Why is this bad? Imagine your container was compromised (e.g. by Log4Shell); the attacker could then do this:
```
$ docker run --rm -ti python sh -c 'apt-get update'
Get:1 http://deb.debian.org/debian bullseye InRelease [116 kB]
…
Fetched 8506 kB in 2s (4930 kB/s)
Reading package lists… Done
# Oh no!
```
But if they were non-root:
```
$ docker run --user=1000 --rm -ti python sh -c 'apt-get update'
...
E: List directory /var/lib/apt/lists/partial is missing. - Acquire (13: Permission denied)
```
You don’t need to run the `wait` container as root, unless you’re using the `pns` executor. You can improve security for both the `main` and `wait` containers as follows:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: security-context-
spec:
  entrypoint: main
  securityContext:
    runAsNonRoot: true
    runAsUser: 8737
  templates:
    - name: main
      container:
        image: argoproj/argosay:v2
```
If we want to drop capabilities and prevent privilege escalation, we can’t set that in the pod security context; we need to set it on the container’s security context:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: security-context-
spec:
  entrypoint: main
  templates:
    - name: main
      container:
        securityContext:
          runAsNonRoot: true
          runAsUser: 8737
          allowPrivilegeEscalation: false
          capabilities:
            drop:
              - ALL
        image: argoproj/argosay:v2
```
Setting Default Secure Configuration
Adding this configuration to every workflow is pretty verbose. In ≥v3.3, you can make it the default in `workflow-controller-configmap`:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-controller-configmap
data:
  workflowDefaults: |
    spec:
      executor:
        serviceAccountName: workflow
      securityContext:
        runAsNonRoot: true
        runAsUser: 8737
        runAsGroup: 8737
      serviceAccountName: workflow
      automountServiceAccountToken: false
  mainContainer: |
    resources:
      requests:
        cpu: 1m
        memory: 64M
      limits:
        cpu: 0.5
        memory: 128Mi
    securityContext:
      runAsNonRoot: true
      runAsUser: 8737
      runAsGroup: 8737
      allowPrivilegeEscalation: false
      capabilities:
        drop:
          - ALL
  executor: |
    imagePullPolicy: IfNotPresent
    resources:
      requests:
        cpu: 0.1
        memory: 64Mi
      limits:
        cpu: 0.5
        memory: 128Mi
    securityContext:
      runAsNonRoot: true
      runAsUser: 8737
      runAsGroup: 8737
      allowPrivilegeEscalation: false
      capabilities:
        drop:
          - ALL
```
Trust, but verify.
Run a workflow, then use `kube-linter` (or whatever Kubernetes security tool you like) on the pod manifests to verify it works:
```
kubectl get pods -l workflows.argoproj.io/workflow -o yaml | kube-linter lint -
```
Note that all these can be overridden by the user for their workflow. So you should still deploy Open Policy Agent.
Conclusion
- Securing a workflow is just like securing a Kubernetes application, so standard best practices, like minimising permissions or running as non-root, apply.
- User code creates a special case where you may need to take action to prevent users from editing pods within the namespace.