This project sets up my base infrastructure for the web: my Kubernetes cluster, a base set of software (CI/CD, git-server, etc.) and access-keys.

To achieve this, the setup is split into two dedicated steps:

Create static assets like machines for Kubernetes and access-keys via OpenTofu (or Terraform).
Install/Upgrade Kubernetes-cluster and other software via Ansible.

The infrastructure is deployed on Hetzner Cloud.

TL;DR

vim -o .envrc config.auto.tfvars # Get the contents from password-manager
direnv allow
tofu init
tofu apply
until ansible -m ping all; do sleep 10; done # Wait for the machines to start
ansible-galaxy install -r requirements.yml
ansible-playbook site.yml

Required software and packages

The setup will run on Debian, Ubuntu and macOS.

Make sure the following software is installed:

tofu or terraform (from package manager)
ansible (from package manager)
direnv (from package manager)
helm
helm-diff
python3-kubernetes (only on Debian/Ubuntu, from package manager)

Optional packages

These packages make maintenance easier.

k9s (from package manager)

Setup

Make sure .envrc and config.auto.tfvars are present. Then run direnv allow in the directory to apply the .envrc.

Since these files contain sensitive information they are stored outside of this project in my password-manager.

Templates for both files (.envrc and config.auto.tfvars) are provided in the code.

Infrastructure

I use OpenTofu to provide the required infrastructure to run a Kubernetes-cluster.

The infrastructure setup is fully idempotent and can be safely re-applied.

tofu init (1)
tofu apply (2)
until ansible -m ping all; do sleep 10; done (3)

1	Initialize the Tofu modules if necessary
2	Set up infrastructure and create/update inventory.ini
3	Wait until all machines are fully started (This might take up to 5 minutes)
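
The wait loop above runs forever if a machine never comes up. A minimal sketch of a bounded variant follows; the helper name retry_until and its arguments are hypothetical and not part of this project.

```shell
# retry_until MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds, at most MAX_TRIES times.
# Returns 0 on the first success, 1 if every attempt failed.
# (Hypothetical helper, not part of this repository.)
retry_until() {
  local max_tries delay attempt
  max_tries=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_tries" ]; do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed attempt
    if [ "$attempt" -lt "$max_tries" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Example: 30 attempts, 10 seconds apart (roughly the 5 minutes mentioned above)
# retry_until 30 10 ansible -m ping all
```

With a limit in place, a failed start ends with a non-zero exit code instead of hanging the shell.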

Software

I use Ansible to install and maintain the software of my cluster. This includes the Kubernetes cluster and the foundational services in it.

All Ansible playbooks are idempotent and can be safely re-run.

For the Kubernetes cluster I use k3s, simply because it’s very easy to maintain and still provides all common Kubernetes functionality.

The foundational services are:

cert-manager: This enables automatic issuance of TLS certificates. The certificates are issued via Let’s Encrypt, using either its staging or its production environment.
gitea: My personal favorite git-server.
concourse-ci: A powerful CI-service which I like to use to automate all kinds of workloads.
snappass: A secure and reliable tool for sharing passwords.

TODO: Not set up yet!

The k3s-setup requires an inventory.ini which is automatically created by Tofu. So make sure to apply the infrastructure at least once before running these playbooks.
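
This precondition can be checked before the playbook starts. The following is only a sketch: the function name require_inventory is hypothetical, and it assumes inventory.ini is located in the current directory.

```shell
# require_inventory [PATH]
# Succeeds if the inventory file exists and is non-empty;
# otherwise prints a hint to stderr and returns 1.
# (Hypothetical helper; the default path is an assumption.)
require_inventory() {
  local inventory
  inventory=${1:-inventory.ini}
  if [ ! -s "$inventory" ]; then
    echo "Missing or empty $inventory: run 'tofu apply' first." >&2
    return 1
  fi
}

# Usage: require_inventory && ansible-playbook site.yml
```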

ansible-galaxy install -r requirements.yml (1)
ansible-playbook site.yml (2)

1	Install required Ansible collections to create a k3s-cluster (can be omitted in subsequent runs)
2	Install k3s and download kube-config to `~/.kube/config`

The second step will overwrite ~/.kube/config. Back up your existing config if you manage multiple clusters!
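
One way to take that backup is a timestamped copy next to the original. This is a sketch under assumptions: the function name backup_kubeconfig and the .bak naming scheme are made up for illustration.

```shell
# backup_kubeconfig [PATH]
# Copies the kube-config to a timestamped sibling file and prints the backup path.
# Succeeds silently if there is no config to back up.
# (Hypothetical helper; the naming scheme is an assumption.)
backup_kubeconfig() {
  local config backup
  config=${1:-"$HOME/.kube/config"}
  [ -f "$config" ] || return 0
  backup="$config.$(date +%Y%m%d-%H%M%S).bak"
  cp -p "$config" "$backup" && printf '%s\n' "$backup"
}

# Usage: backup_kubeconfig && ansible-playbook site.yml
```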

The affected scope of the Ansible-playbook can be limited with tags (--tags tag1,tag2):

Configured tags

The playbook has a couple of tags configured which restrict the execution to certain tasks.

init: Everything needed for the initial setup (same as omitting tags altogether)
add-server: Everything needed to add a new server to the cluster
add-agent: Everything needed to add a new agent to the cluster
update: Everything needed to update the cluster
config: Everything needed to update the local kube-config
k8s: Everything needed to provide the foundational services

app-specific tags

To update specific services quickly, you can use the following tags. However, these require a functional Kubernetes cluster first.

cert-manager: Apply changes to the cert-manager including support for Let’s Encrypt
gitea: Apply changes to gitea
concourse: Apply changes to concourse
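
Several tags are passed to --tags as one comma-separated value. The sketch below only prints the resulting command so it can be reviewed before running; the helper name playbook_cmd is hypothetical.

```shell
# playbook_cmd [TAG...]
# Prints (does not run) the ansible-playbook command for the given tags.
# Without tags it prints the plain command, which runs everything.
# (Hypothetical helper, not part of this repository.)
playbook_cmd() {
  if [ "$#" -eq 0 ]; then
    printf 'ansible-playbook site.yml\n'
    return 0
  fi
  # "$*" joins all arguments with the first character of IFS
  local IFS=,
  printf 'ansible-playbook site.yml --tags %s\n' "$*"
}

# Example: playbook_cmd cert-manager gitea
```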

Enlarge / Reduce size of cluster

Increase

Simply adjust the number of agents/servers in your infra/config.auto.tfvars.
Then run the Ansible-playbook of k3s again.

Decrease

If you want to shrink the cluster, DO NOT reduce the number of agents directly! Instead, proceed as follows:

Open k9s and go to :nodes
Select the agent with the highest numerical index and press r to drain it
Once that succeeded delete it with Ctrl-d
Finally, reduce the number of agents in Tofu and apply the change
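
Without k9s, the same steps can be approximated with kubectl. This is a sketch, not the procedure used by this project: it assumes node names end in a numeric index (for example agent-3) and that agent names contain the word agent. The helper name highest_agent is hypothetical.

```shell
# highest_agent
# Reads node names from stdin and prints the one with the highest trailing number.
# Names without a trailing number are ignored.
# (Hypothetical helper; the naming scheme is an assumption.)
highest_agent() {
  # Prefix each name with its trailing number, sort numerically, keep the last name
  sed -n 's/^\(.*[^0-9]\)\([0-9][0-9]*\)$/\2 \1\2/p' | sort -n | tail -n 1 | cut -d' ' -f2
}

# Possible usage (the grep pattern is an assumption about node names):
# node=$(kubectl get nodes -o name | sed 's|^node/||' | grep agent | highest_agent)
# kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data
# kubectl delete node "$node"
```

Sorting on the extracted number rather than on the name keeps agent-10 ahead of agent-2, which a plain alphabetical sort would get wrong.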

Responsibilities

OpenTofu

Providing a network for the Kubernetes-cluster
- A public subnet exposed to the internet for the Kubernetes-servers
- A private subnet for the Kubernetes-agents
- Routing between subnets
Managing firewall rules to block everything from the servers except:
- ping (protocol: icmp)
- Kubernetes API (usually port 6443)
- ssh (I prefer to use a non-standard port, usually port 1022)
- public services, e.g. http and https (port 80 and 443) but also git-ssh (port 22)
Provisioning the machines for Kubernetes-servers in the public subnet
Provisioning the machines for Kubernetes-agents in the private subnet
Managing DNS-records

Ansible

Setting up SSH-connections
Setting up routing on all servers
Installing k3s
Keeping the software up-to-date
Adding foundational services to the cluster
