The good, the bad and the ugly of templating YAML in Kubernetes

Alexander Block
Published in Level Up Coding
16 min read · Nov 15, 2023

In this blog post, I’d like to argue that templated YAML has a bad reputation in the Kubernetes community for the wrong reasons, and that it is actually not as evil as one might believe, even with the bad experiences that all of us have likely had in the past.

I believe that the Kubernetes community has at one point made a wrong turn, and that the consequences have had an enormous ripple effect that is still influencing everything in the Kubernetes world.

My argument is that if we went back a few steps and reconsidered what went wrong, we might realise that templating YAML could actually have been a very nice experience…without all the pain involved.

I am the developer and maintainer of the open source project Kluctl (GitHub) and I believe that templating with Kluctl is actually a very pleasant experience. This is however something you’d have to figure out for yourself, either by trying it out on your own or by watching this live demo on the Rawkode Academy YouTube channel.

This blog post will not concentrate too much on Kluctl but instead on templating YAML in general.

The good side of templating

Let’s start with the good side of templating. I believe that templating YAML is a very efficient and very explicit way of manifesting your intent:

Make this little part of configuration that I’m looking at right now dynamically configurable.

A simple example is the name and/or namespace of an object, which might need to be dynamic if I plan to deploy that object multiple times. Or the replicas of a Deployment, which may need to be higher on production environments than on test environments. With templating, the important part is right where it belongs: in the affected manifest itself.

Templating is also at its best when used as little as possible. “Less is more” applies here just as in many other fields. This means that the tooling that implements the templating solution should offer functionality that allows you to avoid templating as much as possible. This can be achieved through proper project structures, hierarchies and formats, or by offering other means of reaching the same objective without templating.

Another positive point on the side of templating is that non-developers are able to learn and apply it in practice without the need to learn a full-blown programming language.

The wrong turn

The wrong turn that I mentioned was the use of Go templates in Helm while Helm itself offered no means to reduce the amount of necessary templating, combined with established practices that encouraged the overuse of templating. Helm later grew into the de facto standard for the distribution of Kubernetes deployments/applications/charts and thus shaped the general perception of “how painful templating YAML feels”.

From the perspective of the initial developers, it made perfect sense to choose Go templates for Helm: they were and still are the standard in Go, as they are part of the Go standard library. Helm is written in Go, so what else to choose but this obvious choice?

However, I strongly believe that Go templates are not a good fit for “regular” Kubernetes users, meaning people who “just want to deploy to Kubernetes”. For many of them (if not most?), Go templates do not feel natural at first and it takes a lot of time and effort to get used to them. This would be fine if it paid off after some time, but it simply doesn’t.

I’ll try to give a few examples in the next chapters, but this is not meant to be complete at all. There are also other issues with Helm itself that I’ll try to cover as well.

Logical/Arithmetical operations without operators

Go templates are very different from what people knew up to that point. The most obvious difference from other templating languages is the lack of infix operators. Go templates only support functions, which you’re forced to use instead. This leads to the most unreadable, most unmaintainable “expressions” I have ever seen. A simple {{ if a and b }} becomes {{ if and a b }} and a slightly more complex {{ if a and (b or c) }} becomes {{ if and a (or b c) }}. Combine that with deeply nested variables (e.g. .Values.prometheus.ingress.enabled), and you get what you typically see in many Helm templates.

Even Bash’s if/fi blocks are more readable in that regard. And I hate Bash.
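To make the prefix style concrete, here is a minimal sketch that uses Go’s standard text/template package directly rather than Helm. The flags struct and the evaluate helper are hypothetical names picked for this illustration; and and or are the built-in template functions.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// flags is a hypothetical set of values, used only for this illustration.
type flags struct {
	A, B, C bool
}

// evaluate renders a condition written with the prefix-style "and"/"or"
// functions. With infix operators it would read "if A and (B or C)".
func evaluate(f flags) string {
	const src = `{{ if and .A (or .B .C) }}enabled{{ else }}disabled{{ end }}`
	tmpl := template.Must(template.New("expr").Parse(src))
	var out strings.Builder
	if err := tmpl.Execute(&out, f); err != nil {
		panic(err)
	}
	return out.String()
}

func main() {
	fmt.Println(evaluate(flags{A: true, B: false, C: true}))  // enabled
	fmt.Println(evaluate(flags{A: true, B: false, C: false})) // disabled
}
```

Even in this tiny case, the reader has to mentally move each function name between its arguments to recover the condition.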

The same issue also applies to accessing items of lists or dicts, which is also only possible via functions. This means that an easy-to-read {{ $myList[0] }} becomes {{ index $myList 0 }}. Combine this with some nesting, and the fun begins: {{ index $myMap (index $myList 0) }}.
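The nested lookup can be tried out with a short sketch, again using plain text/template instead of Helm. The lookup helper and the sample data are assumptions made for this example; index is the built-in template function.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// lookup renders a nested index expression. In a language with subscript
// syntax this would read roughly as myMap[myList[0]].
func lookup(list []string, m map[string]string) string {
	const src = `{{ $myList := .List }}{{ $myMap := .Map }}{{ index $myMap (index $myList 0) }}`
	tmpl := template.Must(template.New("index").Parse(src))
	data := struct {
		List []string
		Map  map[string]string
	}{List: list, Map: m}
	var out strings.Builder
	if err := tmpl.Execute(&out, data); err != nil {
		panic(err)
	}
	return out.String()
}

func main() {
	// The first list entry ("web") is used as the key into the map.
	fmt.Println(lookup([]string{"web"}, map[string]string{"web": "nginx"})) // nginx
}
```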

Whitespaces

Another example is the excessive use of whitespace control in Helm templates.

...
      {{- with .Values.startupapicheck.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.startupapicheck.affinity }}
      affinity:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.startupapicheck.tolerations }}
      tolerations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
...

This is a topic that I don’t fully understand, in the sense of why it is done so excessively. Technically, YAML is still valid, even if you have a lot of unnecessary whitespace in-between. It doesn’t make a difference in most cases, so there is no real need to avoid it. It only becomes an issue when dealing with concatenated or multiline text. Indentation is another story and I’ll cover this shortly in the next chapter.

I can only come up with two possible explanations for this, but I’m happy if someone could help me understand this better (I’ll update this post if necessary).

The first potential explanation is that the Helm documentation explicitly mentions that “YAML ascribes meaning to whitespace” while giving examples of YAML with extra whitespace that actually has no meaning at all (e.g. empty lines with unnecessary spaces…YAML doesn’t care about these). The best practices in the Helm documentation also suggest avoiding extra whitespace, without giving any explanation of why this is best practice.

Another explanation I can offer is the desire to have “nice looking” output when using helm template; this is something I have actually heard from Chart authors as a reason. I can however not imagine why this would be of any importance…if someone wants nice looking YAML, they can just run it through yq and let it pretty-print the output. Also, the Kubernetes API really doesn’t care how the YAML looks; it actually never sees your YAML, because kubectl (or any other client-go/controller-runtime based tool) internally converts it to JSON or even protobuf.

Of course, yq won’t work if you’re using helm template to debug why your templates result in invalid YAML. This is however not the default situation, so optimising for this rare situation is something that I don’t believe is worth it.

Also, I strongly believe that most bugs that people need to debug via helm template would not even happen with other template languages, or if the other Helm issues I describe in this blog post did not exist.
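The claim that the extra whitespace is mostly harmless can be checked with a small sketch. It renders the same fragment with and without the {{- trim markers using plain text/template; the render and dropBlankLines helpers and the Enabled value are hypothetical names used only here.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render executes src with a single boolean value named Enabled.
func render(src string) string {
	tmpl := template.Must(template.New("ws").Parse(src))
	var out strings.Builder
	if err := tmpl.Execute(&out, map[string]bool{"Enabled": true}); err != nil {
		panic(err)
	}
	return out.String()
}

// dropBlankLines removes lines that contain only whitespace.
func dropBlankLines(s string) string {
	var kept []string
	for _, line := range strings.Split(s, "\n") {
		if strings.TrimSpace(line) != "" {
			kept = append(kept, line)
		}
	}
	return strings.Join(kept, "\n")
}

func main() {
	// With trim markers: whitespace before each action is removed.
	trimmed := render("spec:\n{{- if .Enabled }}\n  replicas: 2\n{{- end }}\n")
	// Without trim markers: the newlines around the actions stay in the output.
	plain := render("spec:\n{{ if .Enabled }}\n  replicas: 2\n{{ end }}\n")

	fmt.Printf("%q\n", trimmed) // "spec:\n  replicas: 2\n"
	fmt.Printf("%q\n", plain)   // "spec:\n\n  replicas: 2\n\n"

	// Apart from blank lines, both renderings carry the same content.
	fmt.Println(dropBlankLines(trimmed) == dropBlankLines(plain)) // true
}
```

The untrimmed output only differs by blank lines, which a YAML parser ignores between mapping entries.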

Go template contexts

Go templates always have a current context, accessible via a dot, e.g. {{ .Some.Member }}. You can set the current context to something different via {{ with .Values.something }}{{ .MemberOfSomething }}{{ end }}. Setting the context however means that the old context is no longer available, with the only exception being {{ $.Some.Member }}, which always points to the initial context.

This concept feels very strange to me, and I assume others feel the same. Most other template and programming languages have the concept of “scopes”, meaning that you have a global scope and multiple layered local scopes for variables/functions. You only lose access to another variable if something with the same name overrides it.

For an example of this, check the whitespace example from above, which extensively uses with to switch the context without an apparent reason. This is a pattern that is repeated in many Charts, leading to a lot of additional and unnecessary templating boilerplate.
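The context switching described above can be reproduced with a few lines of plain text/template. This is only a sketch: the data layout loosely imitates Helm’s .Values and .Release objects, and the render helper and sample values are made up for the example.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render executes src against a small, hypothetical data tree that loosely
// imitates the root object of a Helm template.
func render(src string) string {
	data := map[string]any{
		"Release": map[string]any{"Name": "demo"},
		"Values": map[string]any{
			"ingress": map[string]any{"host": "example.com"},
		},
	}
	tmpl := template.Must(template.New("ctx").Parse(src))
	var out strings.Builder
	if err := tmpl.Execute(&out, data); err != nil {
		panic(err)
	}
	return out.String()
}

func main() {
	// Inside "with", the dot is the ingress map, so .host resolves directly.
	// The root object is only reachable through "$".
	src := `{{ with .Values.ingress }}{{ .host }} ({{ $.Release.Name }}){{ end }}`
	fmt.Println(render(src)) // example.com (demo)
}
```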

So much templating, so little YAML

Have you ever looked into popular Charts and wondered why in some of these every second line is Go templating? In other words, why the ratio of templating to actual lines of YAML is so high?

Just look at the Kafka Chart by Bitnami, and you’ll find manifests like in the above screenshot.

The reason is that a good Helm Chart must take into account all possible variations in which Chart consumers might want to use the Chart. It must take into account that people want custom labels, annotations, nodeSelectors, tolerations, resources, command args, and so on. A Chart author gets forced into overusing templating, because otherwise people won’t be able to use the Chart in their specific use-case.

In other words: To become a good and successful Chart, a chart MUST become a bad Chart!

Helm could have avoided this completely by allowing Chart consumers to natively patch resources in a convenient way. A simple patches.yaml (containing a set of JSON or merge patches, as known from Kustomize) alongside the values.yaml that you provide at helm install time would have been sufficient to alleviate a lot of the pain that Chart authors and maintainers suffer.

Instead, consumers are forced to use third-party tools for patching, either as frontends to Helm (e.g. Kustomize, Helmfile or Kluctl) or as a post-renderer when installing the Chart. This in turn means that Chart authors are practically forbidden from making any assumptions about how Helm is used by the consumer, forcing them to still template as much as possible, leading to many unmaintainable and hard to read Charts.
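To illustrate what consumer-side patching could look like, here is a rough sketch of a merge patch applied to an already rendered manifest. It is not a Helm feature and not Kustomize’s strategic merge; it implements the simpler JSON Merge Patch rules from RFC 7386 on JSON input. The mergePatch and applyPatch helpers and the sample manifest are assumptions made for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mergePatch applies a JSON Merge Patch (RFC 7386) to target and returns the
// result. Objects are merged key by key (the target map is modified in
// place), a null value deletes a key, and anything else replaces the target.
func mergePatch(target, patch any) any {
	patchObj, ok := patch.(map[string]any)
	if !ok {
		return patch
	}
	targetObj, ok := target.(map[string]any)
	if !ok {
		targetObj = map[string]any{}
	}
	for key, value := range patchObj {
		if value == nil {
			delete(targetObj, key)
			continue
		}
		targetObj[key] = mergePatch(targetObj[key], value)
	}
	return targetObj
}

// applyPatch parses a manifest and a patch (both JSON) and returns the
// patched manifest as JSON.
func applyPatch(manifest, patch string) (string, error) {
	var target, p any
	if err := json.Unmarshal([]byte(manifest), &target); err != nil {
		return "", err
	}
	if err := json.Unmarshal([]byte(patch), &p); err != nil {
		return "", err
	}
	out, err := json.Marshal(mergePatch(target, p))
	return string(out), err
}

func main() {
	// Hypothetical rendered Chart output and a consumer-provided patch.
	rendered := `{"kind":"Deployment","metadata":{"name":"app","labels":{"app":"demo"}},"spec":{"replicas":1}}`
	patch := `{"metadata":{"labels":{"team":"payments"}},"spec":{"replicas":3}}`

	patched, err := applyPatch(rendered, patch)
	if err != nil {
		panic(err)
	}
	fmt.Println(patched)
}
```

With something along these lines available to consumers, a Chart would not need a dedicated template hook for every label or replica count someone might want to change.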

YAML is a superset of JSON

Many Chart authors actually know this, but for some reason this fact is not leveraged when writing Helm templates. Just consider the snippet from the “Whitespaces” section and compare it to this modified snippet:

...
nodeSelector: {{ toJson .Values.startupapicheck.nodeSelector }}
affinity: {{ toJson .Values.startupapicheck.affinity }}
tolerations: {{ toJson .Values.startupapicheck.tolerations }}
...

This form lets you completely avoid any issues with whitespace and indentation. This works with lists, dicts, objects and plain values. It’s much shorter and the intent is immediately clear, as you don’t have to jump back and forth while reading it.

The only situation where this does not work is when you need to mix hard-coded dict or list entries with entries from template values.
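The inline JSON approach can be sketched with plain text/template. Helm ships its own toJson function; the version registered below is a stand-in built on encoding/json, and the renderFragment helper and sample values are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"text/template"
)

// renderFragment renders two fields as inline JSON, which is also valid YAML
// flow syntax, so no indentation handling is needed.
func renderFragment(values map[string]any) string {
	funcs := template.FuncMap{
		// Stand-in for Helm's toJson, implemented with encoding/json.
		"toJson": func(v any) (string, error) {
			b, err := json.Marshal(v)
			return string(b), err
		},
	}
	const src = "nodeSelector: {{ toJson .nodeSelector }}\ntolerations: {{ toJson .tolerations }}\n"
	tmpl := template.Must(template.New("fragment").Funcs(funcs).Parse(src))
	var out strings.Builder
	if err := tmpl.Execute(&out, values); err != nil {
		panic(err)
	}
	return out.String()
}

func main() {
	fmt.Print(renderFragment(map[string]any{
		"nodeSelector": map[string]string{"kubernetes.io/os": "linux"},
		"tolerations": []map[string]string{
			{"key": "dedicated", "operator": "Exists"},
		},
	}))
	// nodeSelector: {"kubernetes.io/os":"linux"}
	// tolerations: [{"key":"dedicated","operator":"Exists"}]
}
```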

Helm “templates/” is flat

In Helm Charts, there is no concept of a project hierarchy. What I mean is that all templates are located in the templates/ subfolder, without the possibility to group or structure these in any way. Yes, you can have subfolders in templates/, but these are flattened when rendered as if they were all located in the same folder.

This means that everything is global and flat. You can’t have different sets of variables/context in different folders. You can’t disable whole groups of resources. Instead, you’re forced to wrap all files with copy-pasted if blocks. Look at the Cilium Chart as an example of this.

I strongly believe that Helm could have done better in that regard, completely independent from the templating issues I listed so far.

Helm values.yaml are not templated

When rendering templates, Helm requires all values to be loaded and ready-for-use internally. This leads to a chicken-and-egg problem in regard to templated values.yaml files: You can’t render values.yaml because you don’t have values loaded.

Helm could have solved this by allowing layered loading of values.yaml files, meaning that the files are loaded one after another while making all previously loaded values available in the file currently being loaded. This would allow complex templating in values.yaml, making life a lot easier when it comes to handling default values in different scenarios. It would eliminate a lot of boilerplate templating that currently has to be repeated over and over in many places.
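A rough sketch of such layered loading, under simplifying assumptions: Helm does not work this way, the layers are written as JSON (which is also valid YAML) so that only the standard library is needed, and merging only happens at the top level. The loadLayers helper and the sample layers are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"text/template"
)

// loadLayers renders each layer as a template against the values loaded so
// far, parses the result, and merges its top-level keys into the values.
func loadLayers(layers []string) (map[string]any, error) {
	values := map[string]any{}
	for i, layer := range layers {
		tmpl, err := template.New(fmt.Sprintf("layer-%d", i)).Parse(layer)
		if err != nil {
			return nil, err
		}
		var rendered strings.Builder
		if err := tmpl.Execute(&rendered, values); err != nil {
			return nil, err
		}
		var parsed map[string]any
		if err := json.Unmarshal([]byte(rendered.String()), &parsed); err != nil {
			return nil, err
		}
		for key, value := range parsed {
			values[key] = value
		}
	}
	return values, nil
}

func main() {
	layers := []string{
		// First layer: plain values without any templating.
		`{"env": "prod", "domain": "example.com"}`,
		// Second layer: uses values from the first one, including a control structure.
		`{"host": "app.{{ .env }}.{{ .domain }}", "replicas": {{ if eq .env "prod" }}3{{ else }}1{{ end }}}`,
	}
	values, err := loadLayers(layers)
	if err != nil {
		panic(err)
	}
	fmt.Println(values["host"], values["replicas"]) // app.prod.example.com 3
}
```

The second layer can use both simple substitution and an if block because the first layer is already loaded when it gets rendered.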

Helm supports the tpl function to somewhat allow templating via values.yaml, but it’s in no way a full replacement. It requires the Chart author to decide which values need to be templated and to use tpl in the templates wherever these values are potentially going to be used. It’s however still impossible to use control structures in values.yaml.

Helm is a package manager…

… as clearly stated on the front page of https://helm.sh/. That’s what it was built for and that’s what it is best at.

But people started to use it for different purposes and use-cases and then wondered why it is so painful to work with. The pain that an author of a publicly distributed Chart goes through will however pay off over time, because broad adoption is what the author gets in exchange.

If your use-case however does not involve public distribution, and broad adoption is not your goal, it does not pay off to write a Helm Chart. Especially if it follows all best practices, allows all kinds of configuration, is distributed via a (hopefully secured) Helm Repository or OCI registry, and so on…

In other words, Helm is not a good fit if you want to manage your own deployments, which are usually quite static compared to a typical public Helm Chart. You usually don’t care about some external entity wanting to add arbitrary annotations to your Service, and you also usually don’t want every Deployment to get arbitrary command args or environment variables from some locally unknown source. In the end, you’re only interested in making the things dynamic that change between your environments/clusters.

So, templating YAML is bad, right?

I believe that many people have burned their hands at this point and unconsciously connected templating YAML with pain, with hard to debug bugs and errors, with verbose and unreadable manifests, with a lot of effort and boilerplate…and more pain, a lot of it.

I also believe that simply using another templating language and more advanced features in the tooling that leverage this templating language can make templating a lot more fun and enjoyable again.

In my opinion, Jinja2 as a templating engine is a very good candidate here. It’s currently hard to integrate into Go based tools, but it’s possible.

In regard to tooling, I believe that Kluctl is a good example of how it can be done better. It does not simply render template files in a flat manner, but instead introduces a flexible project structure/hierarchy with flexible and layered variable sources to introduce scoped configuration/values.

Generic Helm Charts

Very often, when people ask online about how to deal with the pain Helm brings, generic Helm Charts are proposed as a solution. These are Helm Charts that abstract away the complexity of the involved Kubernetes resources, e.g. by bringing a pre-configured Deployment and Service with a lot of templating to make them as customisable as possible, at least when it comes to publicly available Charts. People also develop their own versions of these common Charts for internal use only; these can of course get by with much less templating.

In my opinion, at least the public versions of these Charts are more a manifestation of the underlying problems than a real solution. They help people avoid dealing with Helm while still using Helm. I perfectly understand the motivation behind them, but it’s sad that they are even needed.

Kustomize and the promise of template-free YAML

When I first read about Kustomize, I was quite hyped, especially because at that time it also got included into the kubectl CLI. I had a lot of hope that a new standard was being established, making all external pre-processing (including templating via Helm) unnecessary due to kubectl having all batteries included from then on.

I started to build my first deployments that targeted multiple environments and clusters. At first, it was all fun, until I realised that Kustomize did not easily allow simple things like “only deploy a Prometheus ServiceMonitor when Prometheus is available on the target cluster”. Something that I could easily solve via some inline templating in the past now required me to artificially restructure my project and pull apart things into different overlays that clearly belonged together. I was forced to twist my head in ways that did not feel natural at all to me.

I managed to come up with a working solution, but when the next step required me to introduce preview environments, at a time when Kustomize had no vars, replacements or KRM support, I ditched everything and started working on a Python+Jinja2 based script that evolved into a full blown deployment orchestrator and then later became Kluctl (v2.0.0 was a complete rewrite in Go).

In my opinion, Kustomize tries too hard to fulfil its promise of being completely template-free. It leads to solutions that feel unnecessarily complex and artificial, especially if you consider that very often a simple one-liner of Jinja2 (or even Go template) could solve the same requirement in a much simpler way. There were times where some simple forms of templating were possible, via the now deprecated vars feature. This is now superseded by replacements, which are more like patches which can also perform replacements inside values.

This is probably a very nice feature if you need to customise manifests that are out of your own control…but usually you’d use Kustomize for your own manifests, and in that case templating would be the much easier and more straightforward way to manifest your intention in code.

I find it interesting that at the same time, the Flux project (one of the leading GitOps solutions) decided to introduce “Post build variable substitution” for Kustomizations, basically circumventing one of the main reasons Kustomize was brought to life and why it ultimately decided to deprecate the vars feature. From what I observe, Flux users are very actively using this feature, making it very clear how much demand there is for at least some form of templating.

Cue, Jsonnet, Pulumi, cdk8s, …

There are many voices in the community that advocate for the use of non-template based solutions. Currently, the most prominent solution that keeps coming up is Cue. Jsonnet is another alternative that however gets less and less attention.

Discussions around Cue very often go the same direction. People consider the idea of it to be interesting and refreshing, but hesitate to try it out because the language itself “feels alienating” and/or because they wait for someone else to implement it properly.

Regarding tooling that supports/uses Cue, Timoni seems to be the most promising at the moment. It’s developed by a prominent Flux developer and has thus already gained some traction.

My position on these efforts is still very critical. I fear that the community will run into a trap and end up in a situation that is as bad as it is today, just different. I believe that the “regular” Kubernetes users that I mentioned in the very beginning will have as many issues with Cue as they had with Go templates. They will now be forced to learn a new configuration and programming language, not just a templating language that integrates into well known and established YAML.

With YAML + templating, all existing documentation and examples in the Kubernetes ecosystem can be taken literally, to the point where copy+paste+customisation of examples into your own manifests works just perfectly. Using templating in-between is a no-brainer, as you just insert it where it’s needed. With Cue, you can’t do that, because Cue is not a superset of YAML, meaning that everything must be converted into Cue first. The same applies to all existing YAML manifests provided by Kubernetes native applications: everything needs to be converted (and maintained long-term!) into Cue by someone first.

I strongly believe that this will become very cumbersome for people and Reddit will be full of people asking if they are really on the right track, same as it happens with Helm all the time.

I put Pulumi, cdk8s and comparable solutions into the same corner to be honest. Writing code instead of YAML will break people, because it is done for the wrong reasons. Just imagine the friction you’ll get because now people will not just have to learn a templating/configuration language, but a full blown programming language just because someone else decided to use a language that you don’t speak fluently yet. Or you decide to use a language that your colleagues don’t speak yet…so much potential for chaos and pain.

Here is an opinion video from Michael Crilly that describes this very well (even though it’s not about templating).

Strictly typed configuration?

One argument for the use of non-YAML based languages is that these offer strong/strict typing. In my humble opinion, this is overvalued. Kubernetes manifests are by nature already bound to strict schemas, as defined by the Kubernetes APIs and Custom Resource Definitions.

This means that accidentally passing a boolean (check the “Norway problem”) where a string is expected will not be accepted by the Kubernetes API anyway. Missing fields, typos in field names, and so on, will also be caught by the API.

The real problem here is that such bugs are caught too late in the tooling and established workflows, meaning that the underlying schema definitions are not taken into account early enough. For example, conveniently dry-running your deployments against different target environments, without being forced to push to Git or release a Chart, would catch most (if not all) of these bugs.

Of course, one thing that will stay hard to fix with templating involved is the integration into IDEs with syntax highlighting that shows type related issues on-the-fly. However, this issue wouldn’t be that bad if templating weren’t so overused (which Chart authors are forced to do, as mentioned before).

YAML itself is already bad

There are many articles (e.g., this one) available online that list quite a few issues with YAML, for example the “Norway problem” mentioned earlier.

However, from my personal experience, this is really not an issue in daily business. And if it is an issue, it’s usually caught by the Kubernetes API.

If one still wants to argue that YAML is evil, then let’s fix YAML and not introduce completely new languages that are completely foreign to everyone.

Summing it up

As mentioned multiple times now, I believe templating YAML does not have to be that bad. Let’s go back a few steps and then give it another chance, with a proper templating language that is built for humans (e.g. Jinja2) and with tooling that lets you avoid as much templating as possible.

Thanks for reading this post, and I hope it can actually help the community in some way. And sorry if I turned to ranting from time to time :)
