The State of Kubernetes Configuration Management: An Unsolved Problem

Jesse Suen
Argo Project
Feb 7, 2019


“Of all the problems we have confronted, the ones over which the most brainpower, ink, and code have been spilled are related to managing configurations.”
Brendan Burns, Brian Grant, David Oppenheimer, Eric Brewer, and John Wilkes, Google Inc.

Configuration management is a hard, unsolved problem. When we first started Argo CD, a GitOps deployment tool for Kubernetes, we knew we had to limit its scope to deployment and not go anywhere near config management. We understood that since there was no perfect config management solution, Argo CD should remain agnostic to how Kubernetes manifests are rendered, and let users decide for themselves which tool and workflow works best for them.

Good Kubernetes configuration tools have the following properties:

  • Declarative. The config is unambiguous, deterministic, and not system-dependent.
  • Readable. The config is written in a way that is easy to understand.
  • Flexible. The tool facilitates, and does not get in the way of, accomplishing what you are trying to do.
  • Maintainable. The tool should promote reuse and composability.

There are a couple of key reasons Kubernetes config management is so challenging: what sounds like the simple act of deploying an application can have wildly different, even opposing, requirements, and it’s difficult for a single tool to accommodate all of them. Imagine the following use cases:

  • A cluster operator who deploys 3rd-party, off-the-shelf applications, such as Wordpress, to their cluster with little to no customization of those apps. The most important criterion for this user is to easily receive updates from an upstream source and upgrade their application as seamlessly as possible (e.g. new versions, security patches, etc.).
  • A SaaS application developer who deploys their bespoke application to one or more environments (e.g. dev, staging, prod-west, prod-east). These environments may be spread across different accounts, clusters, and namespaces with subtle differences between them, so configuration re-use is paramount. For these users, it is important to go from a Git commit in their code base to deploying to each of their environments in a fully automated way, and to manage the configuration of those environments in a straightforward and maintainable manner. These developers have zero interest in semantic versioning of their releases, since they might be deploying multiple times a day, and the notion of major, minor, and patch versions ultimately has no meaning for their application.

As you can see, these are completely different use cases, and more often than not, a tool which excels at one doesn’t handle the other very well. After building first-class support in Argo CD for a few of the more popular config tools (Helm, kustomize, ksonnet, jsonnet), and after using these tools at Intuit to manage various applications in our clusters, we’ve accumulated some unique insights about the strengths and weaknesses of each.

Helm

Let’s start with the obvious one, Helm, which needs no introduction. Love it or hate it, Helm, being the first one on the scene, is an integral part of the Kubernetes ecosystem, and chances are that at one point or another you’ve installed something by running helm install.

The important thing to note about Helm is that it is a self-described package manager for Kubernetes, and doesn’t claim to be a configuration management tool. However, since many people use Helm templating for exactly this purpose, it belongs in the discussion. These users invariably end up maintaining several values.yaml files, one for each environment (e.g. values-base.yaml, values-prod.yaml, values-dev.yaml), then parameterize their chart in such a way that environment-specific values can be used in the chart. This method more or less works, but it makes the templates unwieldy, since golang templating is flat and needs to support every conceivable parameter for each environment, which ultimately litters the entire template with {{- if / else }} switches.
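To make this concrete, here is a minimal, hypothetical sketch of how those environment switches creep into a template (the chart, value, and image names are all invented for illustration):

# templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-guestbook
spec:
  replicas: {{ .Values.replicaCount }}
  template:
    spec:
      {{- if eq .Values.environment "prod" }}
      # prod-only scheduling knob, switched on via values-prod.yaml
      nodeSelector:
        tier: production
      {{- end }}
      containers:
      - name: guestbook
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        {{- if .Values.debug }}
        args: ["--verbose"]
        {{- end }}

Each environment then renders with its own stack of value files, e.g. helm template . -f values-base.yaml -f values-prod.yaml, and every new environment difference means yet another conditional in the template.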

The Good:

There’s a chart for that. Undoubtedly, Helm’s biggest strength is its excellent chart repository. Just recently, we had the need to run a highly available Redis, without a persistent volume, to be used as a throwaway cache. There’s something to be said for being able to throw the redis-ha chart into your namespace, set persistentVolume.enabled: false, point your service at it, and know that someone else has already done the hard work of figuring out how to run Redis reliably on a Kubernetes cluster.
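With Helm 2, that whole exercise is a one-liner (the release name here is illustrative):

helm install stable/redis-ha --name throwaway-cache --set persistentVolume.enabled=false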

The Bad:

Golang templating. “Look at that beautiful and elegant helm template!”, said no one ever. It is well known that Helm templates suffer from a readability problem. I don’t doubt that this will be addressed with Helm 3’s support for Lua, but until then, well, I hope you like curly braces.

Complicated SaaS CD pipelines. For SaaS CI/CD pipelines, assuming you are using Helm the way it is intended (i.e. using Tiller), an automated deploy in your pipeline might go several ways. In the best case, deploying from your pipeline will be as simple as:

docker push mycompany/guestbook:v2
helm upgrade guestbook --set guestbook.image.tag=v2

But in the worst case, where existing chart parameters cannot support your desired manifest changes, you go through a whole song and dance of bundling a new Helm chart, bumping its semver, publishing it to a chart repository, and redeploying with helm upgrade. In the Linux world, this is analogous to building a new RPM, publishing the RPM to a yum repository, then running yum install, all so you can get your shiny new CLI into /usr/bin. While this model works great for packaging and distribution, in the case of bespoke SaaS applications, it’s an unnecessarily complex and roundabout way to deploy. For this reason, many people choose to run helm template and pipe the output to kubectl apply, but at that point you are better off using some other tool that is specifically designed for this purpose.
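For reference, that escape hatch looks something like this under Helm 2 (chart path and values file are illustrative):

helm template ./guestbook -f values-prod.yaml | kubectl apply -f -

Tiller is bypassed entirely, which also means giving up Helm’s release tracking.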

Non-declarative by default. If you ever added --set param=value to any one of your Helm deploys, I’m sorry to tell you that your deployment process is not declarative. These values are only recorded in the Helm ConfigMap netherworld (and maybe your bash history), so hopefully you wrote them down somewhere. This is far from ideal if you ever need to recreate your cluster from scratch. A slightly better way is to record all parameters in a custom values.yaml which you can store in Git and deploy using -f my-values.yaml. However, this is annoying when you’re deploying an OTS chart from Helm stable and don’t have an obvious place to store that values.yaml side-by-side with the relevant chart. The best solution I’ve come up with is to compose a new dummy chart which has the upstream chart as a dependency. Still, I have yet to find a canonical way of updating a parameter in a values.yaml in a pipeline using a one-liner, short of running sed.
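A minimal sketch of that dummy-chart workaround, using Helm 2’s requirements.yaml (the chart name, version pin, and override are illustrative):

# Chart.yaml
apiVersion: v1
name: my-redis-ha
version: 0.1.0

# requirements.yaml: pin the upstream chart as a dependency
dependencies:
- name: redis-ha
  version: 3.3.1
  repository: https://kubernetes-charts.storage.googleapis.com

# values.yaml: overrides now live in Git, scoped by the subchart name
redis-ha:
  persistentVolume:
    enabled: false

Running helm dep update pulls the upstream chart into charts/, and your overrides are finally versioned right next to it.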

Kustomize

Kustomize was created around the design principles described in Brian Grant’s excellent dissertation regarding Declarative Application Management. Kustomize has seen a meteoric rise in popularity, and in the eight months since it started, has already been merged into kubectl. Whether or not you agree with the manner in which it was merged, it goes without saying that kustomize applications now have a permanent place in the Kubernetes ecosystem and will be the default choice that users gravitate towards for config management. Yes, it helps to be part of kubectl!

The Good:

No parameters & templates. Kustomize apps are extremely easy to reason about and, I dare say, a pleasure to look at. It’s about as close as you can get to Kubernetes YAML, since the overlays that you compose to perform customizations are basically subsets of Kubernetes YAML.
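A minimal sketch of a base with a prod overlay (file and app names are illustrative):

# base/kustomization.yaml
resources:
- deployment.yaml
- service.yaml

# overlays/prod/kustomization.yaml
namePrefix: prod-
bases:
- ../../base
patchesStrategicMerge:
- replica_count.yaml

# overlays/prod/replica_count.yaml: just a sparse Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: guestbook
spec:
  replicas: 5

Running kustomize build overlays/prod emits the base manifests with the prod customizations merged in.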

The Bad:

No parameters & templates. The same property that makes kustomize applications so readable can also make them very limiting. For example, I was recently trying to get the kustomize CLI to set an image tag for a custom resource instead of a Deployment, but was unable to. Kustomize does have a concept of “vars,” which look a lot like parameters but somehow aren’t, and can only be used in Kustomize’s sanctioned whitelist of field paths. I feel like this is one of those times when the solution, despite making the hard things easy, ends up making the easy things hard.
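For reference, a var is declared like this (names are illustrative) and referenced elsewhere as $(MY_SERVICE_NAME), but only in the handful of field paths kustomize permits:

# kustomization.yaml
vars:
- name: MY_SERVICE_NAME
  objref:
    kind: Service
    name: guestbook
    apiVersion: v1
  fieldref:
    fieldpath: metadata.name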

Jsonnet

Jsonnet is actually a language and not really a “tool.” Furthermore, its use is not specific to Kubernetes (although it’s been popularized by Kubernetes). The best way to think of jsonnet is as super-powered JSON combined with a sane way to do templating. Jsonnet combines all the things you wish you could do with JSON (comments, text blocks, parameters, variables, conditionals, file imports), without any of the things that you hate about golang/Jinja2 templating, and adds features that you didn’t even know you needed or wanted (functions, object orientation, mixins). It does all of this in a declarative and hermetic (code as data) way.
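For a small taste, the following sketch shows functions, conditionals, and mixin-style overrides in action (the image name is invented; runnable with jsonnet --ext-str env=prod guestbook.jsonnet):

// guestbook.jsonnet
local deployment(name, image, replicas=1) = {
  apiVersion: 'apps/v1',
  kind: 'Deployment',
  metadata: { name: name },
  spec: {
    replicas: replicas,
    template: {
      spec: {
        containers: [{ name: name, image: image }],
      },
    },
  },
};

// parameterized from the command line via an external variable
local env = std.extVar('env');

deployment('guestbook', 'mycompany/guestbook:v2') + {
  // object-oriented override: only prod runs extra replicas
  spec+: { replicas: if env == 'prod' then 5 else 1 },
}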

Jsonnet is not widely adopted in the Kubernetes community, which is unfortunate, because of all the tools described here, jsonnet is hands down the most powerful configuration tool available, which is why several offshoot tools are built on top of it. More on this later. Explaining what’s possible with jsonnet is a post in and of itself, which is why I encourage you to read how Databricks uses Jsonnet with Kubernetes, and Jsonnet’s excellent learning tutorial.

The Good:

Extremely powerful. It’s rare to hit a situation that can’t be expressed in some concise and elegant snippet of jsonnet. With jsonnet, you are constantly finding new ways to maximize re-use and avoid repeating yourself.

The Bad:

It’s not YAML. This might just be an issue of unfamiliarity, but most people will experience some level of cognitive load when staring at a non-trivial jsonnet file. In the same way that you would need to run helm template to verify your Helm chart is producing what you expect, you will similarly need to run jsonnet --yaml-stream guestbook.jsonnet to verify your jsonnet is correct. The good news is that, unlike golang templating, which can produce syntactically invalid YAML due to misplaced whitespace, with jsonnet these types of errors are caught at build time and the resulting output is guaranteed to be valid JSON/YAML.

Ksonnet (and other jsonnet derivatives)

Ksonnet (a play on the language upon which it is based) was supposed to be the “jsonnet for Kubernetes.” It provided an opinionated way to organize your jsonnet manifests into files & directories of “components” and “environments,” backed by a CLI to help facilitate management of these files. Ksonnet made a big splash nearly two years ago when it was jointly announced by Heptio and Bitnami, working in conjunction with Microsoft and Box, a veritable who’s who of the Kubernetes ecosystem. Fast forward to today, and Heptio (now VMware) has announced that the ksonnet project is being sunsetted.

So what happened? Simply put, it was too hard to use. When starting with ksonnet, you were actually learning three things at the same time: (1) the jsonnet language itself; (2) ksonnet’s over-engineered concepts (components, prototypes, environments, parts, registries, modules); and (3) ksonnet-lib, ksonnet’s Kubernetes jsonnet library. And if you were new to Kubernetes (as our dev teams were), make that four. Argo CD started with initial support for ksonnet, and as someone who pushed for ksonnet adoption at Intuit, I am sorry to see it go. Despite my own efforts to make ksonnet easier for teams, I witnessed first-hand the continued struggles users faced with the tool.

Aside from ksonnet, there are a fair number of other jsonnet-derived tools, including kubecfg, kapitan, kasane, and kr8. I don’t have first-hand experience with them, but they are definitely worth a look.

Replicated Ship

Ship, by Replicated, is relatively new to the scene and focuses primarily on the problem of “last-mile customization” of third-party applications. It works by using both Helm and Kustomize to generate manifests. Why would you need to do this, you ask? Sooner or later, you will encounter a situation where a Helm chart is almost what you want, but you still need to tweak it somehow (e.g. add Network Policies, set Pod Affinity, etc.). After reviewing the built-in chart parameters, you realize that the chart doesn’t provide an option for what you’re trying to do. At this point people usually do one of two things: submit a PR upstream to add yet another parameter to the chart, or dump the contents of helm template into a file and hand-edit the YAML with their desired changes. The problem with the latter is that it’s unmaintainable. When the time comes to get the latest from upstream, you can’t easily discern or reapply the modifications you made in your fork. Ship solves this by keeping the two separate: a tracking reference and staging directory hold the upstream Helm chart, alongside your last-mile modifications written as kustomization overlays. Using this technique, the base manifests which you receive from upstream can be updated independently of your local kustomizations.
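The workflow looks something like the following (the chart reference is illustrative; check Ship’s docs for exact usage):

ship init github.com/helm/charts/stable/redis-ha   # fetch the chart, record a tracking ref, author overlays
ship update                                        # later: pull upstream changes and re-apply your overlays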

The Good:

It fills a gap. Ship makes it dead simple for operators to custom-tailor a chart without having to push a change upstream. There’s even a wonderful UI to help guide you through this process.

The Bad:

Only works for OTS. Ship is only intended to be used with an off-the-shelf, upstream source, which means it does not handle the bespoke application config use case at all.

We shouldn’t need Ship. I find it unfortunate that we even need a tool like Ship. This is not a problem with Ship itself, but a commentary on the shortcomings of our current tools. Imagine, for example, if Helm provided a built-in way to apply simple overlays to upstream charts (as an alternative to parameters); we would end up with much simpler charts, without the mess of overly parameterized templates. Another scenario is to imagine a world where everyone provided kustomize apps for their projects. If this were prevalent, then users could use kustomize’s remote base feature to apply local changes against upstream kustomize apps. But unless either of those things happens, Ship is a tool to bridge that gap.
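For reference, that remote base feature already makes the second scenario possible today when upstream does ship a kustomize app (the repo URL below is hypothetical):

# kustomization.yaml: last-mile changes on top of an upstream app
bases:
- github.com/example-org/example-app//deploy?ref=v1.2.3
patchesStrategicMerge:
- network_policy.yaml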

Helm 3 and Lua Script

I’m quite optimistic about the proposal for Helm 3 and its use of Lua-based charts. Unfortunately, this aspect of Helm 3 is one of the least developed, and the maintainers have only just begun writing the underlying Lua VM which will power the rendering engine, so it will be quite some time before we have more readable charts. I do hope that the Helm 3 redesign can also address the last-mile customization gap and be friendlier to GitOps-style deployments.

Kubernetes configuration management is at an inflection point. With kustomize now readily available at users’ fingertips, it is easier than ever to guide users towards an extremely capable configuration management tool out of the box. But if there’s a single takeaway from this discussion, it is that there is no perfect configuration management tool, and they tend to come and go as the wind blows. Each tool has its strengths and weaknesses, so it’s important to understand when to use the right one for the job. At this moment, we have a mix of apps defined in kustomize, Helm, and ksonnet, all in the same cluster, for different reasons. With Argo CD, a guiding principle has been to give users the greatest possible flexibility in this regard, such as facilities to customize the repo server and support for executing custom commands to generate manifests.

The state of Kubernetes config management has never been more exciting (as exciting as writing text to generate more text can be :-). Kustomize’s merge into kubectl changes the game entirely. Ksonnet’s exit from the market is a sign of the space maturing. Jsonnet will always have a place for the power users. And while Helm has lost some community sentiment as of late, with Tiller security concerns and template complexity, it has a wildcard up its sleeve with Helm 3 and the upcoming Lua charts. At the very least, Kubernetes configuration management is a rapidly evolving space which affects all Kubernetes users in some shape or form, and everyone should have a vested interest in how things shape up in the years to come. Until we have a clear winner, please continue using Argo CD with the tool of your choice!
