Base Infra
This project sets up my base infrastructure for the web: my Kubernetes cluster, a base set of software (CI/CD, git-server, etc.) and access-keys.
The setup is split into two dedicated steps:
- Create static assets like machines for Kubernetes and access-keys via OpenTofu (or Terraform).
- Install/Upgrade Kubernetes-cluster and other software via Ansible.
The infrastructure is deployed on Hetzner Cloud.
TL;DR
vim -o .envrc config.auto.tfvars # Get the contents from password-manager
direnv allow
tofu init
tofu apply
until ansible -m ping all; do sleep 10; done # Wait for the machines to start
ansible-galaxy install -r requirements.yml
ansible-playbook site.yml
Required software and packages
The setup will run on Debian, Ubuntu and macOS.
Make sure the following software is installed:
Optional packages
These packages make maintenance easier.
- k9s (from package manager)
Setup
Make sure .envrc and config.auto.tfvars are present.
Then run direnv allow in the directory to apply the .envrc.
Since these files contain sensitive information they are stored outside of this project in my password-manager.
Templates for both files are provided in the code:
- .envrc
- config.auto.tfvars
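A small preflight check can catch a missing file before direnv or tofu run. The following is only a sketch: the function name check_setup_files is hypothetical and not part of this project, and it assumes both files live in the same directory.

```shell
# Hypothetical helper (not part of this project): verify that the two
# files named in the Setup section exist in the given directory.
# Usage: check_setup_files [DIR]   (DIR defaults to the current directory)
check_setup_files() {
  dir=${1:-.}
  missing=0
  for f in .envrc config.auto.tfvars; do
    if [ ! -f "$dir/$f" ]; then
      echo "Missing: $dir/$f" >&2
      missing=1
    fi
  done
  # Exit status 0 when both files are present, 1 otherwise.
  return "$missing"
}
```

For example, `check_setup_files && direnv allow` would only apply the .envrc when both files are in place.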
Infrastructure
I use OpenTofu to provide the required infrastructure to run a Kubernetes-cluster.
Note: The infrastructure setup is fully idempotent and can be safely re-applied.
tofu init # Initialize the Tofu modules if necessary
tofu apply # Set up infrastructure and create/update inventory.ini
until ansible -m ping all; do sleep 10; done # Wait until all machines are fully started (this might take up to 5 minutes)
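The wait loop above never gives up if a machine fails to boot. A bounded variant could look like the sketch below. The function name wait_until is hypothetical, and the 300-second limit in the usage note is an assumption derived from the "up to 5 minutes" estimate.

```shell
# Hypothetical helper: retry a command until it succeeds or a timeout
# is reached.
# Usage: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
wait_until() {
  timeout_s=$1
  interval_s=$2
  shift 2
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      echo "Timed out after ${elapsed}s waiting for: $*" >&2
      return 1
    fi
    sleep "$interval_s"
    # Only the sleep time is counted, so the limit is approximate.
    elapsed=$((elapsed + interval_s))
  done
}
```

With this in place, `wait_until 300 10 ansible -m ping all` would poll every 10 seconds and fail after roughly 5 minutes instead of looping forever.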
Software
I use Ansible to install and maintain the software of my cluster. This includes the Kubernetes cluster and the foundational services in it.
Note: All Ansible playbooks are idempotent and can be safely re-run.
For the Kubernetes cluster I use k3s, simply because it’s very easy to maintain and still provides all common Kubernetes functionality.
The foundational services are:
- cert-manager
  This enables automatic issuance of TLS certificates. The certificates are issued via Let’s Encrypt, against either its staging or its production environment.
- gitea
  My personal favorite git-server.
- concourse-ci
  A powerful CI-service which I like to use to automate all kinds of workloads.
- snappass
  A secure and reliable tool for sharing passwords.
  TODO: Not set up yet!
Note: The k3s-setup requires an inventory.ini, which is automatically created by Tofu.
Make sure to apply the infra at least once before running these playbooks.
ansible-galaxy install -r requirements.yml # Install required Ansible collections to create a k3s-cluster (can be omitted in subsequent runs)
ansible-playbook site.yml # Install k3s and download kube-config to ~/.kube/config
Warning: The second step overwrites ~/.kube/config.
Back up your existing config if you manage multiple clusters!
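One way to take that backup is a timestamped copy next to the original file. This is a minimal sketch: the function name backup_kubeconfig and the backup naming scheme are assumptions, not something the playbook does for you.

```shell
# Hypothetical helper: copy a kube-config to a timestamped backup file
# and print the backup path.
# Usage: backup_kubeconfig [PATH]   (PATH defaults to ~/.kube/config)
backup_kubeconfig() {
  config=${1:-"$HOME/.kube/config"}
  # Nothing to back up if no config exists yet.
  [ -f "$config" ] || return 0
  backup="$config.$(date +%Y%m%d-%H%M%S).bak"
  cp -p "$config" "$backup" && echo "$backup"
}
```

Running `backup_kubeconfig && ansible-playbook site.yml` keeps the previous config around before the playbook replaces it.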
The scope of the Ansible-playbook can be limited with tags (--tags tag1,tag2).
Configured tags
The playbook has a couple of tags configured which restrict the execution to certain tasks.
- init
  Everything needed for the initial setup (same as omitting tags altogether)
- add-server
  Everything needed to add a new server to the cluster
- add-agent
  Everything needed to add a new agent to the cluster
- update
  Everything needed to update the cluster
- config
  Everything needed to update the local kube-config
- k8s
  Everything needed to provide the foundational services
App-specific tags
To update specific services quickly, you can use the following tags. These require a functional Kubernetes cluster.
- cert-manager
  Apply changes to the cert-manager including support for Let’s Encrypt
- gitea
  Apply changes to gitea
- concourse
  Apply changes to concourse
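To illustrate how the tags combine, here is a small wrapper that rejects tags not listed above before calling ansible-playbook. It is only a sketch: run_tagged and the DRY_RUN variable are hypothetical, and the list of known tags is copied from this README rather than read from the playbook.

```shell
# Hypothetical wrapper: validate tags against the ones documented in
# this README, then run the playbook with them.
# Usage: run_tagged TAG [TAG...]
# Set DRY_RUN=1 to print the command instead of running it.
run_tagged() {
  known="init add-server add-agent update config k8s cert-manager gitea concourse"
  tags=""
  for t in "$@"; do
    ok=0
    for k in $known; do
      if [ "$t" = "$k" ]; then ok=1; fi
    done
    if [ "$ok" -ne 1 ]; then
      echo "Unknown tag: $t" >&2
      return 2
    fi
    # Join the accepted tags with commas, as --tags expects.
    tags="${tags:+$tags,}$t"
  done
  if [ -z "$tags" ]; then
    echo "usage: run_tagged TAG [TAG...]" >&2
    return 2
  fi
  if [ -n "${DRY_RUN:-}" ]; then
    echo "ansible-playbook site.yml --tags $tags"
  else
    ansible-playbook site.yml --tags "$tags"
  fi
}
```

For example, `DRY_RUN=1 run_tagged cert-manager gitea` prints `ansible-playbook site.yml --tags cert-manager,gitea`. The tags a playbook actually defines can be checked with `ansible-playbook site.yml --list-tags`.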
Enlarge / Reduce size of cluster
- Increase
  - Simply adjust the number of agents/servers in your infra/config.auto.tfvars.
  - Then run the Ansible-playbook of k3s again.
- Decrease
  If you want to shrink the cluster, DO NOT reduce the number of agents directly! Instead, proceed as follows:
  - Open k9s and go to :nodes
  - Select the agent with the highest numerical index and press r to drain it
  - Once that has succeeded, delete it with Ctrl-d
  - Finally, reduce the number of agents in Tofu and apply the change
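The drain-and-delete steps can also be done with kubectl instead of k9s. The sketch below rests on assumptions: that agent node names contain "agent" and end in their numerical index, and that the two function names, which are hypothetical, suit your setup. Check the actual names with `kubectl get nodes` first.

```shell
# Hypothetical helper: read node names on stdin and print the one with
# the largest trailing number.
highest_index_node() {
  awk '
    match($0, /[0-9]+$/) {
      n = substr($0, RSTART, RLENGTH) + 0
      if (!seen || n > max) { max = n; name = $0; seen = 1 }
    }
    END { if (seen) print name }
  '
}

# Hypothetical helper: drain and delete the agent with the highest index.
# ASSUMPTION: agent node names contain the string "agent".
remove_highest_agent() {
  node=$(kubectl get nodes -o name | sed 's|^node/||' | grep agent | highest_index_node)
  if [ -z "$node" ]; then
    echo "No agent node found" >&2
    return 1
  fi
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data &&
    kubectl delete node "$node"
}
```

After `remove_highest_agent` succeeds, the last step stays the same: reduce the number of agents in Tofu and apply the change.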
Responsibilities
- OpenTofu
  - Provide a network for the Kubernetes-cluster
    - A public subnet exposed to the internet for the Kubernetes-servers
    - A private subnet for the Kubernetes-agents
    - Routing between subnets
  - Manage firewall rules to block everything from the servers except:
    - ping (protocol: icmp)
    - Kubernetes API (usually port 6443)
    - ssh (I prefer to use a non-standard port, usually port 1022)
    - public services, e.g. http and https (ports 80 and 443) but also git-ssh (port 22)
  - Provision the machines for Kubernetes-servers in the public subnet
  - Provision the machines for Kubernetes-agents in the private subnet
  - Manage DNS-records
- Ansible
  - Set up SSH-connections
  - Set up routing on all servers
  - Install k3s
  - Keep the software up-to-date
  - Add foundational services to the cluster