67 Commits

Author SHA1 Message Date
ed656189ea Replace deprecated datacenter with location for primary-ips
Hetzner has changed its API and removed the field `datacenter` from the
primary IPs in favor of `location`. This change reflects that and
adjusts the configuration accordingly. Note that this change didn't
require any manual state changes. Instead, I applied the former plan
with the newest provider once. Since the provider already treated the
fields correctly, I only had to adjust the configuration.
Chapeau, Hetzner, for this good transition!

See-also: 14da745f Update tofu-resources to their latest versions
Reference: https://docs.hetzner.cloud/changelog#2025-12-16-phasing-out-datacenters
2026-02-13 00:33:47 +01:00
14da745fcc Update tofu-resources to their latest versions
Hetzner's API has recently received some important changes which will
impact my configuration. So, this maintenance change is necessary for me
to address these changes and figure out all deprecations.

First and foremost, a new DNS-API was introduced in November 2025 to bind
the DNS-settings closer to their cloud console. In favor of this new
DNS-system they will phase out the old API at the beginning of May 2026!

Secondly, some API-fields have changed, e.g. the "datacenter" field of
primary IPs is going to be removed in favor of the "location" field.
This change will finally take effect on 1 July 2026.

Besides that, I simply updated all providers to their latest versions.

Reference: https://docs.hetzner.com/networking/dns/faq/beta
Reference: https://docs.hetzner.cloud/changelog#2025-11
Reference: https://docs.hetzner.cloud/changelog#2025-12
2026-02-13 00:01:02 +01:00
1f69c1578c Add longhorn distributed storage to the k3s-cluster
This change adds longhorn to the cluster, a Kubernetes add-on that
provides distributed storage across all nodes.

Note that I already tried this in December, but rolled _everything_ back
due to very high load on the machines. It turned out, though, that the
high load was caused not by longhorn but by a bad configuration of the
server, as described in the see-also commit.

Reference: https://longhorn.io/
Reference: https://longhorn.io/docs/1.10.1/deploy/install/install-with-helm/
See-also: 4b8a3d12c4 Use etcd instead of sqlite for k3s-server
2026-01-23 00:45:00 +01:00
4b8a3d12c4 Use etcd instead of sqlite for k3s-server
Oh damn, that was so annoying. My cluster ran at nearly 100% load all
the time! As it turns out, that's a known issue on k3s clusters.

The solution is to add the `--cluster-init` flag to the server, which
lets the server use etcd instead of sqlite. And voilà, the CPU usage
drops to a reasonably low level in the single-digit percent range.
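
A minimal sketch of that flag in use, assuming the server is started
through a small wrapper function. `start_k3s_server` is a hypothetical
name and not part of this repository; only `k3s server --cluster-init`
itself comes from the k3s documentation referenced below:

```shell
# Hypothetical wrapper; defining it does not start anything yet.
# `--cluster-init` makes the k3s server use embedded etcd as its datastore.
start_k3s_server() {
  k3s server --cluster-init "$@"
}
```

Calling `start_k3s_server` on the server node would then start k3s with
etcd; any additional arguments are passed through unchanged.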

Reference: https://github.com/k3s-io/k3s/issues/10396
Reference: https://docs.k3s.io/datastore/ha-embedded#existing-single-node-clusters
2026-01-22 22:47:58 +01:00
4ac786c5c5 Update gitea chart due to discontinued bitnami db-images
Bitnami has discontinued a lot of their container images. Among these
were also their images for a high-availability setup of postgresql.
This change fixes that by referencing the legacy bitnami images until a
"new" approach is found.
2025-12-14 20:29:32 +01:00
a3e67f9fce Disable public registration on my gitea server
My gitea-server is basically my safe harbor for private git-projects. It
is not meant to be public.

Even more importantly, opening it up would shift responsibilities a
lot; legal liabilities in particular may suddenly become important when
the server is open.

Furthermore, I can't guarantee any availability when I cannot make any
assumptions about the usage. And I cannot make such assumptions for an
open and public project which I maintain in my spare time.
2025-12-12 20:26:49 +01:00
bdf20799ce Move the toc in README below the preamble for better readability 2025-11-28 22:30:25 +01:00
3f40c424fa Move the table of contents below the mirror hint on GitHub in README 2025-11-28 22:25:35 +01:00
32383b5365 Add table of contents to README
The README for this project has grown a lot, so it makes sense to
include a table of contents to regain some control.
2025-11-28 22:21:40 +01:00
8923280d4c Remove note from README that snappass is not ready 2025-11-28 22:14:05 +01:00
20b0ac86f5 Add snappass to the cluster
This change was surprisingly tricky and needed some temporary
workarounds. First, there is no "official" snappass helm chart, but I
found one which does the job and looks good enough. The other problem
is the missing "official" image of snappass. The helm-chart used a
customized image which I didn't want to use, so I had to quickly build
a brand new image. This new image is unfortunately not bound to
any repository or pipeline yet, which means that this change needs some
trust for the moment until I've set up the needed repo and CI
structures.

Reference: https://github.com/lmacka/helm-snappass/tree/main
Reference: https://github.com/pinterest/snappass
2025-11-28 22:12:47 +01:00
f562241b5c Remove dangling text-fragment from README 2025-11-28 15:37:40 +01:00
6cef6bf868 Fix formatting of templates-tip in README 2025-11-28 15:36:23 +01:00
8afffdb2af Add emojis as admonition captions for GitHub in README 2025-11-28 15:30:31 +01:00
7c928ac8e3 Add note to the README that the github-repo is only a mirror 2025-11-28 14:30:18 +01:00
6824bd7802 Combine the sections about required software in the README
Stupidly, I had split up the "Supported Platforms" and the "Required
Software" sections without realising that these are actually entangled.
This change fixes that.
2025-11-28 00:28:25 +01:00
cc0e00f1af Another massive rewrite of the README
This change actually alters the readme significantly. The overall goal
was to turn it into an easier-to-read document, since the previous
version had generally outgrown its initial layout. This alone should
raise a flag, since it could indicate that the document is too long.
But I want to make sure I understand each detail even after some time
off.

This new approach targets that desire and improves the overall
structure so that the document reads from top to bottom, as I like it.
2025-11-28 00:28:25 +01:00
70462e1795 Mention the usage of Hetzner Cloud in the README 2025-11-28 00:28:25 +01:00
94d5cc60c0 Enhance the README a lot
This change is huge, so I only sum up the most important changes:
* Improve spelling
* Reduce ambiguity
* Use OpenTofu instead of Terraform
* Document missing tags for Ansible
* Provide example-configuration
* Fix confusion between dotenv and direnv, I use direnv!
* Add section about required software
* Fix many spelling mistakes
2025-11-28 00:28:25 +01:00
91f81b8726 Add concourse as the foundational CI tool to k8s-cluster
This change makes it possible to add a concourse-server to the
kubernetes cluster.
2025-11-28 00:28:25 +01:00
0eaa5d3b08 Add current IP automatically to whitelists for SSH and Kubernetes
After I removed the automatic IP addition to the firewalls for SSH and
Kubernetes, I ran into a problem only a few days later. My ISP changed
my IPs and I was too stupid to realize that immediately. So, this change
reintroduces the automatic addition of my current IPs to the whitelists
for Kubernetes and SSH. However, I adjusted the algorithm so that the
whitelist does not change every day or so, but only when my ISP
actually changes my IPs.
2025-11-28 00:28:25 +01:00
adfa2674c6 Migrate state to base-infra/terraform.tfstate in b2-bucket
I renamed the project from "hetzner-infra" to "base-infra", since that
better fits the purpose of this repository. So, this change migrates the
state name accordingly, to avoid confusion.
2025-11-28 00:28:25 +01:00
e22217f2ed Adjust formatting according to tf standards 2025-11-28 00:28:25 +01:00
9db5f749d3 Remove TODO to setup gitea, since it's already done 2025-11-28 00:28:25 +01:00
cb7d2712ff Fix typos in readme 2025-11-28 00:28:25 +01:00
5b97e5268d Remove plan to setup minio since I moved over to Backblaze 2025-11-28 00:28:25 +01:00
38bfc493b5 Add mandatory .envrc setup-instruction to readme 2025-11-28 00:28:25 +01:00
0cd390e9e5 Simplify abstract of README to better describe the purpose 2025-11-28 00:28:25 +01:00
f43ea3d324 Update readme to emphasize the focus on the base web infrastructure
I plan to move over more base tasks to this repository, like maintaining
the keys for Backblaze. Therefore I adjusted the readme accordingly.
Furthermore, I fixed the spelling in several places.
2025-11-28 00:28:25 +01:00
b33da3eca0 Simplify server and agent definition for kubernetes
The definition was split into multiple settings, which made it
unnecessarily complicated to set up the definition for my kubernetes
cluster. This new approach allows for granular definitions of servers
and agents and is also simpler for me to use.
2025-11-28 00:28:25 +01:00
58b0c0fcc7 Move declaration of primary IPs into kubernetes-module 2025-11-28 00:28:25 +01:00
cb97668b63 Define IPs which have access to the kubernetes-API and SSH as variables
I liked the idea of having these IPs dynamically detected at runtime,
but some research showed that my current provider nowadays only renews
them every 180 days. So, no need for such a hyper-dynamic solution.
Instead I use a variable now, which brings some other benefits, like
adding arbitrary IPs as well. This might come in handy for CI/CD.
2025-11-28 00:28:25 +01:00
6ca0a07522 Configure dns-zones via variables, instead of as static values
This change makes it a bit easier for me to manage specific domains.
Note that in the long run these settings should _not_ belong to this
repository. Instead, I'm going to maintain these in projects where the
domain is more meaningful.
2025-11-28 00:28:25 +01:00
4f9ea90f8e Add gitea as git-server to k8s-cluster
By applying this change the kubernetes cluster gets a gitea-server
setup. Note that I use a custom image whose build I have to automate in
the future. The customization is necessary since I use asciidoc very
often and the default gitea doesn't render these files, so it becomes a
bit cumbersome to read them on the web.
2025-11-28 00:28:25 +01:00
b16566e021 Move tasks to setup cert-manager into its own task-file
This change is the first step to set up further tools, like a git-server
or CI-servers, with this role.
2025-11-28 00:28:25 +01:00
9d32790c99 Move terraform-state to b2
The terraform-state can be stored in backblaze b2 with some
configuration. This change does exactly that. Note that this requires
the special env-variables `AWS_SECRET_ACCESS_KEY` and
`AWS_ACCESS_KEY_ID` which are normally part of the AWS-setup. To be able
to use AWS and this setup in parallel I use direnv to maintain the
variables in the special file `.envrc`.
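
A minimal sketch of such an `.envrc`, the file that direnv loads when
entering the project directory. Both values are placeholders and pure
assumptions; for B2's S3-compatible API they would be the key ID and the
application key of a B2 application key:

```shell
# .envrc (sketch): placeholder values, replace with a real B2 key pair.
# The variable names are the ones the commit message names.
export AWS_ACCESS_KEY_ID="your-b2-key-id"
export AWS_SECRET_ACCESS_KEY="your-b2-application-key"
```

Because the variables are only exported inside this directory, a regular
AWS setup elsewhere on the machine stays untouched.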

Reference: https://andrzejgor.ski/posts/backblaze_b2_tf_state
Reference: https://www.reddit.com/r/selfhosted/comments/1iv1qir
Reference: https://direnv.net/
2025-11-28 00:28:25 +01:00
18a5d1eae2 Switch from terraform to opentofu and update some providers accordingly 2025-11-28 00:28:25 +01:00
af72ec5cf9 Don't gather facts just to run k8s-setup since it's not needed 2025-11-28 00:28:25 +01:00
f19a1f61c9 Use port 1022 for all cluster nodes as SSH-port and fix some config-errors
I completely overlooked that I have to change the SSH-port for all nodes
in the cluster; otherwise I cannot provide a meaningful load-balancer
for the git-ssh port in it.

Additionally, this allowed me to fix some config errors which I had
simply overlooked.
2025-11-28 00:28:22 +01:00
f1856f59aa Fix tags to limit even the reference to roles/playbooks
The previous setting of tags still let ansible gather facts for the
roles in question, even though they're not executed. This fix prevents
that from happening.
2025-11-28 00:24:18 +01:00
af5feca667 Document possible tags for the ansible-playbook
The playbook itself is written to be idempotent, so it doesn't hurt to
run all tasks many times. But it doesn't need to run all tasks all the
time, so you can limit the execution scope with the documented tags to
only affect certain tasks. This improves the performance a lot!
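
A minimal sketch of such a limited run, assuming the playbook is called
`site.yml`. `run_tagged` and the tag `cert-manager` are hypothetical
names for illustration; the real tags are the ones documented in the
README:

```shell
# Hypothetical helper: run only the tasks carrying the given tag(s).
# Several tags can be passed as a comma-separated list.
run_tagged() {
  ansible-playbook site.yml --tags "$1"
}
```

For example, `run_tagged cert-manager` would skip every task that does
not carry that tag.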
2025-11-28 00:24:18 +01:00
7297892e18 Merge infra and k3 into one directory again
Since I don't have multiple terraform steps anymore it simply doesn't
make sense to me anymore to split all tasks into separate folders.
Instead I try to be as clear as possible in the README to make it easy
to follow the structure in the future without too much headache.
2025-11-28 00:24:18 +01:00
fef383fed4 Move setup of foundational service from k8s to k3s
It simply doesn't make sense to split the installation of the
kubernetes-cluster from the provisioning of foundational services.
Therefore I drop the idea of organising these services in another
terraform-setup and instead ensure their presence with ansible, as it's
already responsible for setting up the cluster and keeping it
up-to-date.
2025-11-28 00:23:36 +01:00
adec38e1cd Make ssh-port of servers initially configurable
It looked somewhat random that the SSH-port was simply defined in the
configuration of the k3s-setup. It looked "configurable" although it
isn't. Therefore I moved this setting to the correct place in the
terraform-setup.

An important side-note is that this change doesn't make it possible to
_change_ the ssh-port, though. Once I have decided on a port, I have to
stick to it until I tear down the cluster!
2025-09-19 18:03:04 +02:00
9c19a21273 Simplify configuration by moving all the vars into config.ini
Navigating through a bunch of config files, each with just a few lines
in it, is cumbersome. This change moves all the configuration into a
centralized `config.ini`; that way it's easier for me to get a quick
overview of the setup. The `config.ini` acts as another inventory and is
therefore referenced as such by the `ansible.cfg`. The `inventory.ini`
(which is generated by terraform in the provisioning-step) is not
affected by this change.
2025-09-19 16:02:27 +02:00
95cc115734 Move download of kube-config into dedicated role 2025-09-19 14:14:25 +02:00
d227c954a6 Rename main.yml to site.yml to match docs and follow common practices 2025-09-18 20:41:26 +02:00
4beb9e2844 Move configuration of servers completely to ansible
With this change we no longer use user-data scripts on the provisioned
machines. That makes it way easier for me to handle all the
configuration, since I only have to run ansible. Furthermore, this
reduces the burden of figuring out what may have gone wrong, since
ansible is easier to debug than some arbitrary scripts which run at
provisioning-time on the machines.

With this change I should also think about restructuring the code a
bit, since it's actually easier to provide the initial software-stack
for the cluster via ansible than via terraform, at least as far as I can
tell right now.
2025-09-18 20:41:26 +02:00
fda7cac5c0 Only make ssh-port free on k8s-servers since the agents don't need to
The only reason I even change the port is to make sure a git-client can
reach my upcoming git-servers on the standard ssh-port. To achieve
this, I only have to make sure that the port is reachable from the
internet; after that the port is routed through the kubernetes network.
This means that my agents can keep using the standard-port, which makes
everything easier for me :)
2025-09-18 16:42:21 +02:00
4a818d0c8a Add a short tl;dr section to the readme for quick setup 2025-09-18 16:00:57 +02:00