Automating Backstage TechDocs with Argo Events

DevOps Workflow for Backstage TechDocs using Argo

TJ. Podobnik, @dorkamotorka
Level Up Coding


In this blog post, we’ll delve into automating the deployment of TechDocs, Spotify’s in-house documentation solution that empowers engineers to keep documentation alongside their code. While there are various ways to accomplish this, I opted to explore an integration with Argo Workflows. We’ll discuss the rationale behind this decision, and much more, in detail below.

The image above already gives you a taste of the architecture, but let’s talk about Argo Workflows first.

Argo Workflows

To be honest, we could use any of the CI/CD options out there, ranging from GitLab’s integrated CI/CD to GitHub Actions and Bitbucket Pipelines, but I went with Argo Workflows because it is Kubernetes-native and not bound to any VCS solution. Yes, it’s true that once you choose a VCS you’ll more than likely never change it, but the problem with VCS-specific pipelines is that they are usually limited to the features the VCS provides, while Argo Workflows is surprisingly flexible and bends to your own will. Not only that: if your code is scattered across multiple VCS providers, it’s one of the few solutions that can solve your problems uniformly. And as I mentioned already, I also like it because it’s transferable. Consider yourself working as a freelancer, where one project uses Bitbucket and another uses GitLab. If you’ve managed to set up an Argo Workflow for one VCS, it’s easily transferable to the other project, which would not be the case with, for example, Bitbucket Pipelines.

⚠️ Note: There’s a bit of a learning curve involved, particularly for those new to Kubernetes. However, if you’re passionate about Kubernetes and enjoy diving into its intricacies, you’re likely to find the process quite rewarding.
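To make the Kubernetes-native point concrete, here is a minimal, hypothetical Workflow. It is just another Kubernetes resource, so the pipeline logic lives in your cluster rather than in a VCS-specific configuration file (the names and image here are illustrative):

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-
spec:
  entrypoint: say-hello
  templates:
    - name: say-hello
      container:
        image: alpine:3.19
        command: [echo, "hello from a Kubernetes-native pipeline"]

Submitting it is as simple as argo submit hello.yaml or kubectl create -f hello.yaml, regardless of where your code is hosted.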

Okay, so how can this help us with Backstage TechDocs?

Backstage TechDocs

TechDocs is, as you can imagine, just normal documentation, which in the case of Backstage can live right next to the code. The only difference compared to what most people are used to, e.g. a README.md, is that TechDocs also needs to be built and stored in your preferred storage so it can be loaded into Backstage later on. There are many ways to achieve this, but what should be immediately obvious to DevOps people is that this calls for some automation. At the end of the day, you just want to write some docs and push them to your remote repository, and the rest should be taken care of on its own. This is where Argo Workflows comes into play.
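For context, a TechDocs-enabled repository typically carries an mkdocs.yml next to the code and an annotation in its catalog-info.yaml telling Backstage where the docs live. A minimal sketch, with illustrative names:

# mkdocs.yml at the repository root
site_name: terraform-infra
nav:
  - Home: index.md
plugins:
  - techdocs-core

# catalog-info.yaml (relevant part only)
metadata:
  annotations:
    backstage.io/techdocs-ref: dir:.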

Solution

The solution is pretty straightforward:

  1. You edit your docs
  2. Push them to the VCS repository
  3. Argo Events responds to the “push” event and triggers an Argo Workflow
  4. The Workflow checks out the docs repository and generates the final artifact
  5. The artifact is published to a GCS bucket
  6. The artifact is loaded into Backstage when requested

Steps 4 and 5 map directly to the WorkflowTemplate below:
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  # NOTE: Check bug (WorkflowTemplate requires both generateName and name): https://github.com/kubernetes-sigs/kustomize/issues/641
  name: backstage-docs-workflow
  generateName: backstage-docs-workflow-
spec:
  serviceAccountName: backstage-docs-workflow
  # Artifact Garbage Collector
  artifactGC:
    strategy: OnWorkflowDeletion
  entrypoint: prep-wf
  templates:
    - name: prep-wf
      steps:
        # Step 1: check out the docs repository as an artifact
        - - name: get-git
            templateRef:
              name: wftmpl-get-git
              template: generate-git-artifact
            arguments:
              parameters:
                - name: git-repository
                  value: <repository-with-docs>
                - name: git-branch
                  value: <branch>
        # Step 2: build the TechDocs site and publish it to GCS
        - - name: generate-backstage-docs
            template: backstage-docs-gen-n-publish
            arguments:
              artifacts:
                - name: docs-repo
                  from: "{{steps.get-git.outputs.artifacts.git-repo}}"

    - name: backstage-docs-gen-n-publish
      inputs:
        artifacts:
          - name: docs-repo
            path: /src
      script:
        image: nikolaik/python-nodejs:latest
        command: [bash]
        source: |
          cd /src &&
          npm install -g @techdocs/cli &&
          pip3 install "mkdocs-techdocs-core==1.3.5" &&
          techdocs-cli generate --no-docker --source-dir "." --output-dir "./system-terraform-infra" &&
          techdocs-cli publish --publisher-type googleGcs --storage-name backstage-docs --entity default/System/terraform-infra --directory "./system-terraform-infra"

---
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  labels:
    argocd.argoproj.io/instance: workflow-templates
  name: wftmpl-get-git
spec:
  entrypoint: generate-git-artifact
  templates:
    - name: generate-git-artifact
      inputs:
        parameters:
          - name: git-repository
          - name: git-branch
        artifacts:
          # shallow clone of the requested repository and branch, mounted at /src
          - name: argo-source
            path: /src
            git:
              depth: 1
              repo: '{{inputs.parameters.git-repository}}'
              revision: '{{inputs.parameters.git-branch}}'
              sshPrivateKeySecret:
                key: ssh_key
                name: workflows-ssh-key
      container:
        image: 'golang:1.21'
      outputs:
        # re-export the checkout so downstream steps can consume it
        artifacts:
          - name: git-repo
            path: /src
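With both templates applied, you can smoke-test the pipeline by hand before wiring up any events, e.g. with argo submit --from workflowtemplate/backstage-docs-workflow.

For step 6 of the list above, Backstage itself has to be configured to serve prebuilt docs from the same bucket. A sketch of the relevant app-config.yaml section, assuming the backstage-docs bucket name used by the workflow:

techdocs:
  # 'external' tells Backstage not to build docs itself,
  # but to fetch the artifacts our workflow already published
  builder: 'external'
  publisher:
    type: 'googleGcs'
    googleGcs:
      bucketName: 'backstage-docs'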

This is only half of the setup, because what we have right now is a workflow that does all the hard work of building and storing the documentation for us. What’s still missing is deciding what triggers it, and when. It might not seem applicable to your use case, but as we discussed at the beginning, it would be advantageous if this process occurred on every new commit to the VCS repository. This is where Argo Events can assist us.

Argo Events

As per the documentation:

Argo Events is an event-driven workflow automation framework for Kubernetes which helps you trigger K8s objects, Argo Workflows, Serverless workloads, etc. on events from a variety of sources like webhooks, S3, schedules, messaging queues, gcp pubsub, sns, sqs, etc.

In other words, it’s an automation tool that allows you to react to various events: commits to GitHub, product updates, LinkedIn posts, and essentially anything that can be programmatically detected as a change. There are multiple ways to utilize this functionality, so I’ll get straight to the point of my use case: we point an Argo Events EventSource at the push event on Bitbucket Cloud, and once that event fires, we trigger the workflow we defined earlier.

apiVersion: argoproj.io/v1alpha1
kind: EventBus
metadata:
  name: default-eventbus
spec:
  jetstream:
    version: 2.8.1
    replicas: 3
    persistence: # optional
      storageClassName: standard
      accessMode: ReadWriteOnce
      volumeSize: 10Gi
---
apiVersion: argoproj.io/v1alpha1
kind: EventSource
metadata:
  name: bitbucket-eventsource
spec:
  eventBusName: default-eventbus
  service:
    ports:
      - port: 12000
        targetPort: 12000
  bitbucket:
    # bitbucket eventsource example with basic auth strategy
    example:
      # Bitbucket repository list
      repositories:
        - owner: "example" # owner of the repository
          repositorySlug: "example" # repository slug
      # events to listen to
      # Visit https://support.atlassian.com/bitbucket-cloud/docs/manage-webhooks/
      events:
        - repo:push
      # Bitbucket will send webhook events to the following port and endpoint
      webhook:
        # endpoint to listen to events on
        endpoint: /push
        # port to run internal HTTP server on
        port: "12000"
        # HTTP request method to allow. In this case, only POST requests are accepted
        method: POST
        # url the event-source will use to register in Bitbucket.
        # This url must be reachable from outside the cluster.
        # The name for the service is in `<event-source-name>-eventsource-svc` format.
        # You will need to create an Ingress or OpenShift Route for the event-source service so that it can be reached from Bitbucket.
        url: <your-url>
      # Delete the webhook when the eventsource is deleted
      deleteHookOnFinish: true
      auth:
        # basic refers to Basic Auth strategy and can be used with App passwords
        # Visit https://support.atlassian.com/bitbucket-cloud/docs/app-passwords/
        basic:
          # username refers to K8s secret that stores the bitbucket username
          username:
            # Name of the K8s secret that contains the username
            name: bitbucket-access
            # Key within the K8s secret whose corresponding value (must be base64 encoded) is username
            key: username
          # password refers to K8s secret that stores the bitbucket password (including App passwords)
          password:
            # Name of the K8s secret that contains the password
            name: bitbucket-access
            # Key within the K8s secret whose corresponding value (must be base64 encoded) is password
            key: password
---
apiVersion: argoproj.io/v1alpha1
kind: Sensor
metadata:
  name: backstage-docs-sensor
spec:
  dependencies:
    - name: example-dep
      eventName: example
      eventSourceName: bitbucket-eventsource
  eventBusName: default-eventbus
  template:
    serviceAccountName: argo-events-sensor-executor
  # NOTE: Watch out - the fields under "- template" below are easy to mis-indent!
  triggers:
    - template:
        name: argo-workflow-trigger
        argoWorkflow:
          operation: submit
          source:
            resource:
              apiVersion: argoproj.io/v1alpha1
              kind: Workflow
              metadata:
                generateName: backstage-docs-workflow-
              spec:
                workflowTemplateRef:
                  name: backstage-docs-workflow
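One piece the manifests above only hint at: the <your-url> that the EventSource registers with Bitbucket must resolve to the event source service from outside the cluster. A hypothetical Ingress for it, using the <event-source-name>-eventsource-svc naming convention mentioned in the comments:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: bitbucket-eventsource
spec:
  rules:
    - host: <your-url-hostname>
      http:
        paths:
          - path: /push
            pathType: Prefix
            backend:
              service:
                # name follows the <event-source-name>-eventsource-svc convention
                name: bitbucket-eventsource-eventsource-svc
                port:
                  number: 12000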

The code provided above is only a partial example, as it relies heavily on how you’ve configured your Argo Workflows and Events deployments. Nonetheless, it offers a good overview of the specific use case we’re discussing. I recommend starting with some simple examples outlined in the documentation and then attempting to replicate the setup we’ve described here. This approach will help you better understand and adapt the solution to your specific needs.
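To give one concrete example of that deployment-specific glue: the argo-events-sensor-executor service account referenced in the Sensor needs permission to create Workflows, since the argoWorkflow trigger submits them on your behalf. A minimal sketch, assuming everything runs in a single namespace:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: argo-events-sensor-executor
rules:
  # allow the sensor to submit and inspect Workflows
  - apiGroups: ["argoproj.io"]
    resources: ["workflows"]
    verbs: ["create", "get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: argo-events-sensor-executor
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: argo-events-sensor-executor
subjects:
  - kind: ServiceAccount
    name: argo-events-sensor-executor

Similarly, the backstage-docs-workflow service account needs the usual Argo Workflows executor permissions, plus credentials for writing to the GCS bucket.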

Conclusion

In conclusion, our exploration into automating TechDocs deployment with Argo Workflows presents a practical solution for seamlessly integrating Backstage documentation with code. By selecting Argo Workflows for its Kubernetes-native approach and flexibility across various version control systems, we’ve established an efficient process from doc editing to publication. Coupled with Argo Events, our workflow becomes event-driven, ensuring timely updates and effortless integration with version control systems.

To stay current with the latest cloud technologies, make sure to subscribe to my weekly newsletter, Cloud Chirp. 🚀
