Using GitOps to Deploy Kubeflow with Argo CD

Danny Thomson
Argo Project
Oct 26, 2018

Kubeflow is an open source project for making machine learning (ML) on Kubernetes simple, portable, and scalable. Argo CD is a GitOps-based Continuous Delivery tool for Kubernetes. With Argo CD, you specify the desired state of your applications on Kubernetes using declarative specifications and Argo CD will reconcile the differences between the desired state and the actual live state in your Kubernetes cluster.

This blog post shows how you can get the benefits of GitOps for Kubeflow by using Argo CD as a deployment tool. In a future blog post, we will show you how you can extend these benefits by using Kubeflow with Argo CD to train and deploy your machine learning models. Essentially, GitOps for ML!

The rest of the article walks through the steps to bring up a GKE cluster and use Argo CD to deploy Kubeflow from a GitHub repo. It mostly follows the regular Kubeflow GKE Getting Started Guide, with slight variations to install Argo CD and set up your GitHub repo.

Please check out the demo at the recent Kubeflow Community Meeting.

Creating OAuth Credentials

The first step in setting up your GKE cluster with Kubeflow is creating your OAuth client credentials by following the steps in the GKE Getting Started Guide. You should now have two values to export as environment variables:

export CLIENT_ID=<CLIENT_ID from OAuth page>
export CLIENT_SECRET=<CLIENT_SECRET from OAuth page>
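Because later steps silently depend on these variables, it can help to fail early when one is missing. The following is a minimal sketch, not something from the Kubeflow guide: require_vars is a hypothetical helper name, and it only checks that each named variable is set and non-empty.

```shell
#!/bin/sh
# require_vars NAME...: hypothetical helper, not part of kfctl.sh.
# Reports every listed variable that is unset or empty on stderr and
# returns non-zero if at least one is missing.
require_vars() {
  missing=0
  for name in "$@"; do
    # Read the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example call (commented out so the sketch has no side effects):
# require_vars CLIENT_ID CLIENT_SECRET || exit 1
```

The same call could be extended with the other variables used later in this post, such as KUBEFLOW_REPO, KFAPP, and PROJECT.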

Creating the GKE Cluster

Kubeflow has a CLI tool called kfctl.sh to simplify the GKE cluster creation process. The CLI tool uses the Deployment Manager to declaratively manage all non-K8s resources. Since the Deployment Manager has a declarative specification, you can customize your GKE cluster to fit your requirements.

In order for the kfctl.sh tool to deploy the configs to the Deployment Manager, the Google Cloud SDK and the kubectl binary need to be installed. You can install the Google Cloud SDK with the instructions here and install kubectl by running gcloud components install kubectl. For this demo, I decided to deploy my GKE cluster to us-central1-c, and I ran gcloud config set compute/zone us-central1-c to set the zone.
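Before going further, you may want to confirm that the required binaries are actually on your PATH. This is a small sketch under that assumption only; missing_tools is a hypothetical helper name, and it does not check versions or authentication.

```shell
#!/bin/sh
# missing_tools NAME...: hypothetical helper.
# Prints, one per line, each named command that cannot be found on PATH.
# Prints nothing when every command is available.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
    fi
  done
}

# Example call (commented out so the sketch has no side effects):
# missing_tools gcloud kubectl
```

An empty result means both gcloud and kubectl were found; any name printed still needs to be installed.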

To download the kfctl.sh tool, you can run:

mkdir -p ${KUBEFLOW_REPO}
cd ${KUBEFLOW_REPO}
export KUBEFLOW_TAG=v0.3.1
curl https://raw.githubusercontent.com/kubeflow/kubeflow/${KUBEFLOW_TAG}/scripts/download.sh | bash
  • KUBEFLOW_REPO path to a directory where you want to download the source (e.g. /tmp/kubeflow/)
  • KUBEFLOW_TAG a tag corresponding to the version to check out, such as master for the latest code. You can find the latest version here.
  • Alternatively, you can also just clone the Kubeflow repository using git.

Now that you have kfctl.sh downloaded, you can use it to create the config for your GKE cluster by running:

./scripts/kfctl.sh init ${KFAPP} --platform gcp --project ${PROJECT}
cd ${KFAPP}
${KUBEFLOW_REPO}/scripts/kfctl.sh generate platform gcp
${KUBEFLOW_REPO}/scripts/kfctl.sh apply platform gcp
  • KFAPP The name of a directory to store your configs. This directory will be created when you run init and will become your GitHub repo.
  • PROJECT Your Google Cloud project name

Once the kfctl.sh apply platform gcp command finishes, you will have a GKE cluster provisioned for Kubeflow. If your deployment fails, I recommend checking this GKE troubleshooting guide.

Commit your Kubeflow ksonnet to GitHub

From the KFAPP directory inside your KUBEFLOW_REPO, run the following command to generate the ksonnet components that define your Kubeflow application.

${KUBEFLOW_REPO}/scripts/kfctl.sh generate k8s

If you would like to learn more about ksonnet components, I recommend checking out the Kubeflow documentation on ksonnet.

Next, we need to add a ksonnet environment to our directory. This command requires the ksonnet binary ks to be installed on your machine. You can follow the directions here to install ksonnet. Once you have ksonnet set up, you can run:

cd ${KUBEFLOW_REPO}/${KFAPP}/ks_app/
ks env add default --server https://kubernetes.default.svc

Since I am deploying Argo CD to the same Kubernetes cluster as Kubeflow, I added the --server https://kubernetes.default.svc flag to set the API server endpoint for this environment to the internal Kubernetes API server endpoint. Setting up the ksonnet environment this way has two nice side effects: Argo CD can deploy Kubeflow without going out over the external internet, and developers cannot run the ks apply command without adding extra flags to specify the true API endpoint.
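If you want to double-check the environment before committing, one option is to look for the in-cluster endpoint in the ksonnet app file. The sketch below assumes that ksonnet records each environment's destination server as a server: line in ks_app/app.yaml; that layout is my assumption, so confirm it against your generated file. env_uses_in_cluster_server is a hypothetical helper name.

```shell
#!/bin/sh
# env_uses_in_cluster_server FILE: hypothetical helper.
# Succeeds if FILE contains a line of the form
#   server: https://kubernetes.default.svc
# It does not check which environment the line belongs to.
env_uses_in_cluster_server() {
  grep -Eq '^[[:space:]]*server:[[:space:]]*https://kubernetes\.default\.svc[[:space:]]*$' "$1"
}

# Example call (commented out so the sketch has no side effects):
# env_uses_in_cluster_server "${KUBEFLOW_REPO}/${KFAPP}/ks_app/app.yaml" \
#   || echo "default environment does not point at the in-cluster API server" >&2
```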

At this point, your directory is ready to be committed to GitHub. Please follow your regular methods to push up your changes. For example, if you were using SSH-based access, you could run within the KFAPP directory:

cd ${KFAPP}
git init
git add --all
git commit -m "Initial Kubeflow ksonnet files"
git remote add origin git@github.com:argoproj/kubeflow-ks.git
git push --set-upstream origin master

NOTE: This will also store your GKE configs in the GitHub repo.

You can take a look at https://github.com/argoproj/kubeflow-ks.git to see an example of the ksonnet files that you should have in your repo with the caveat that the repo does not have the gcp_config folder.

Install Argo CD

With the Kubeflow ksonnet GitHub repo and GKE cluster ready, we now need to install Argo CD. For reference, most of the following steps follow the Argo CD getting started guide.

Install Argo CD to the Kubernetes Cluster

ARGO_CD_LATEST=$(curl --silent "https://api.github.com/repos/argoproj/argo-cd/releases/latest" | grep '"tag_name"' | sed -E 's/.*"([^"]+)".*/\1/')
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/${ARGO_CD_LATEST}/manifests/install.yaml

These three lines download the latest released Argo CD manifest files and install them into the argocd namespace.
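The grep and sed pipeline used to find the latest release is terse, so here is a sketch of what it does on its own. It assumes the GitHub API response is pretty-printed with one field per line; the sample JSON and its v0.0.0 version are made up for illustration, and extract_tag_name is a hypothetical wrapper name.

```shell
#!/bin/sh
# extract_tag_name: hypothetical wrapper around the pipeline from the install step.
# Reads release JSON on stdin, keeps the line containing "tag_name", and
# replaces that line with the last double-quoted string on it (the tag value).
extract_tag_name() {
  grep '"tag_name"' | sed -E 's/.*"([^"]+)".*/\1/'
}

# Made-up sample input; a real response has many more fields.
sample='{
  "url": "https://api.github.com/repos/argoproj/argo-cd/releases/1",
  "tag_name": "v0.0.0",
  "name": "v0.0.0"
}'

printf '%s\n' "$sample" | extract_tag_name   # prints: v0.0.0
```

Note that this approach depends on the line layout of the response. If you have jq installed, jq -r .tag_name is a more robust way to read the same field.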

Install Argo CD CLI

If you use macOS, you can run:

brew install argoproj/tap/argocd

On Linux:

curl -sSL -o /usr/local/bin/argocd https://github.com/argoproj/argo-cd/releases/download/${ARGO_CD_LATEST}/argocd-linux-amd64
chmod +x /usr/local/bin/argocd

Set Extra Argo CD Permissions

Since the GKE cluster has RBAC enabled, you will need to grant your account the ability to create new cluster roles by running:

kubectl create clusterrolebinding YOURNAME-cluster-admin-binding --clusterrole=cluster-admin --user=YOUREMAIL@gmail.com

Expose the Argo CD API server

By default, the manifest files for Argo CD do not expose a public endpoint. Here we will connect to Argo CD using kubectl port-forwarding and expose Argo CD on local port 8080:

kubectl port-forward service/argocd-server -n argocd 8080:443

You can read about other options to connect to your Argo CD instance here.

Login via the CLI as admin user

The initial password for the admin user is autogenerated to be the pod name of the Argo CD API server. This can be retrieved with the command:

kubectl get pods -n argocd -l app=argocd-server -o name | cut -d'/' -f 2
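To see why this command yields the password, note that -o name prints each pod as a type/name pair, and cut keeps only the part after the slash. The sketch below shows that step in isolation. The pod names are placeholders, first_pod_name is a hypothetical helper, and taking only the first line is my own addition for the case where more than one pod matches.

```shell
#!/bin/sh
# first_pod_name: hypothetical helper.
# Reads `kubectl get pods -o name` style output on stdin, keeps the first
# line, and strips the resource-type prefix before the slash.
first_pod_name() {
  head -n 1 | cut -d'/' -f 2
}

# Placeholder input standing in for real kubectl output.
printf 'pod/argocd-server-0000000000-aaaaa\n' | first_pod_name
# prints: argocd-server-0000000000-aaaaa

# Example with a live cluster (commented out so the sketch has no side effects):
# ARGOCD_INITIAL_PASSWORD=$(kubectl get pods -n argocd -l app=argocd-server -o name | first_pod_name)
```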

Using the above password, log in to Argo CD as admin over localhost:8080:

argocd login localhost:8080

After logging in, change the password using the commands:

argocd account update-password
argocd relogin

Deploying Kubeflow

We’re finally ready to deploy Kubeflow using Argo CD!

export KUBEFLOW_REPO_URL='Replace with a ssh or https git endpoint'
argocd app create kubeflow --name kubeflow --repo $KUBEFLOW_REPO_URL --path ks_app --env default
argocd app sync kubeflow

You can now view the Kubeflow application by running:

argocd app get kubeflow

or take a look at the application using the Argo CD GUI by pointing your browser to the IP address for the Argo CD service running on your cluster.

NOTE: There is a known issue with the IAP component that prevents the envoy service from becoming synced and causes all subsequent syncs to fail. As a workaround for this issue, we recommend that you sync individual resources by adding the --resource flag to your sync command.

You can then access your Kubeflow UI by going to https://<KFAPP>.endpoints.<PROJECT>.cloud.goog/

  • It can take 10–15 minutes for the endpoint to become available. Kubeflow needs to provision a signed SSL certificate and register a DNS name.
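Rather than refreshing the browser for 10 to 15 minutes, you could poll until the endpoint responds. This is a generic retry sketch, not part of Kubeflow or Argo CD: wait_until is a hypothetical helper, and KUBEFLOW_URL in the example is a hypothetical variable that you would set to your Kubeflow UI address.

```shell
#!/bin/sh
# wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]: hypothetical helper.
# Runs COMMAND up to ATTEMPTS times, sleeping DELAY_SECONDS between tries.
# Returns 0 as soon as COMMAND succeeds, or 1 if every attempt fails.
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example: try once a minute for up to 30 minutes (commented out so the
# sketch has no side effects). curl -f fails on TLS errors and on HTTP
# status codes of 400 or above, so a redirect to a login page counts as "up".
# wait_until 30 60 curl -sSf -o /dev/null "$KUBEFLOW_URL"
```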

Further Kubeflow Customizations

When you commit a change that modifies the ksonnet application directory of your Kubeflow repository (the ks_app directory if you used the kfctl.sh script), Argo CD will detect that your application is out of sync with your git repo. To sync the new resource, you can run:

argocd app sync kubeflow --resource GROUP:KIND:NAME

You can also trigger the sync from the Argo CD UI.
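When several resources changed, typing one sync command per resource gets tedious. The sketch below builds GROUP:KIND:NAME strings and loops over them. Both function names are hypothetical, the resource names in the example are placeholders, and the ARGOCD override variable exists only so the loop can be dry-run with echo. My understanding is that resources in the core API group are written with an empty group, such as :Service:NAME, but treat that as an assumption and check argocd app sync --help for your version.

```shell
#!/bin/sh
# resource_spec GROUP KIND NAME: hypothetical helper.
# Prints the GROUP:KIND:NAME string expected by the --resource flag.
resource_spec() {
  printf '%s:%s:%s\n' "$1" "$2" "$3"
}

# sync_resources APP SPEC...: hypothetical helper.
# Runs one `argocd app sync APP --resource SPEC` per SPEC and stops at the
# first failure. Set ARGOCD=echo to print the commands instead of running them.
sync_resources() {
  app=$1
  shift
  for spec in "$@"; do
    "${ARGOCD:-argocd}" app sync "$app" --resource "$spec" || return 1
  done
}

# Dry-run example with placeholder resource names (commented out so the
# sketch has no side effects):
# ARGOCD=echo sync_resources kubeflow \
#   "$(resource_spec apps Deployment my-deployment)" \
#   "$(resource_spec '' Service my-service)"
```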

Further Argo CD configurations

Check out the Argo CD documentation to read more about how to configure other features like auto-sync, SSO, RBAC, service tokens, and more.

Next Steps

While we can benefit from all the advantages GitOps offers by deploying Kubeflow with Argo CD, this integration still has a lot of room for improvement. As one example, the installation of Argo CD does not quite follow a GitOps model: the instructions tell users to install the Argo CD manifests from a remote URL instead of pulling them from a Git repo. To be more in line with a GitOps solution, users should have a more declarative installation of Argo CD (this issue is being tracked here). The Argo CD team is excited to work with the Kubeflow team to create a deeper integration that improves both tools.

Please check out the Argo CD code base, and send any feedback or questions to the #argo-cd channel in the argoproj Slack workspace!
