Hetzner has changed its API and removed the field `datacenter` from the
primary IPs in favor of `location`. This change reflects that and adjusts
the configuration accordingly. Note that this change didn't require any
manual state changes. Instead I applied the former plan with the newest
provider once. Since the provider already treated the fields correctly, I
only had to adjust the configuration.
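For illustration, the adjustment boils down to something like this; the
resource name and the location are just examples, and the `location`
argument is assumed to behave as the changelog describes:

```hcl
resource "hcloud_primary_ip" "ingress" {
  name          = "ingress"
  type          = "ipv4"
  assignee_type = "server"
  auto_delete   = false

  # was: datacenter = "nbg1-dc3"
  location = "nbg1"
}
```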
Chapeau Hetzner for this good transition!
See-also: 14da745f Update tofu-resources to their latest versions
Reference: https://docs.hetzner.cloud/changelog#2025-12-16-phasing-out-datacenters
Hetzner's API has received some important changes recently which will
impact my configuration. So, this maintenance change is necessary for me
to address these changes and work through all deprecations.
First and foremost, a new DNS API was introduced in November 2025 to bind
the DNS settings closer to their cloud console. In favor of this new
DNS system, the old API will be phased out at the beginning of May 2026!
Secondly, some API fields have changed, e.g. the "datacenter" field of
primary IPs is going to be removed in favor of the "location" field.
This removal will finally take effect on the 1st of July 2026.
Besides that, I simply updated all providers to their latest versions.
Reference: https://docs.hetzner.com/networking/dns/faq/beta
Reference: https://docs.hetzner.cloud/changelog#2025-11
Reference: https://docs.hetzner.cloud/changelog#2025-12
This change adds Longhorn, an addition to Kubernetes that brings the
ability to use distributed storage across all nodes of the cluster.
Note that I already tried this in December, but due to very high load on
the machines I rolled _everything_ back. It turned out, though, that the
high load was not caused by Longhorn but by a bad configuration of the
server, as described in the see-also commit.
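For reference, the documented Helm-based install boils down to roughly
the following; it is sketched here as an HCL `helm_release`, the actual
wiring in this repository may look different:

```hcl
resource "helm_release" "longhorn" {
  name             = "longhorn"
  repository       = "https://charts.longhorn.io"
  chart            = "longhorn"
  version          = "1.10.1"
  namespace        = "longhorn-system"
  create_namespace = true
}
```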
Reference: https://longhorn.io/
Reference: https://longhorn.io/docs/1.10.1/deploy/install/install-with-helm/
See-also: 4b8a3d12c4 Use etcd instead of sqlite for k3s-server
Oh damn, that was so annoying. My cluster ran at near 100% load all the
time! As it turns out, that's a known issue on k3s clusters.
The solution is to add the `--cluster-init` flag to the server, which
makes it use etcd instead of sqlite. And voilà, the CPU usage drops to a
reasonably low level in the single-digit percent range.
Reference: https://github.com/k3s-io/k3s/issues/10396
Reference: https://docs.k3s.io/datastore/ha-embedded#existing-single-node-clusters
Bitnami has discontinued a lot of their container images. Among them
were also the images for a high-availability setup of PostgreSQL. This
change fixes that by referencing the legacy Bitnami images until a "new"
approach is found.
My Gitea server is basically my safe harbor for private Git projects. It
is not meant to be public.
Even more important, opening it up would shift responsibilities a lot;
legal liabilities in particular may suddenly become relevant when the
server is open.
Furthermore, I can't guarantee proper availability when I cannot make
any assumptions about the usage. And I cannot make such assumptions for
an open and public project which I maintain in my spare time.
This change was surprisingly tricky and needed some temporary
workarounds. First, there is no "official" snappass Helm chart, but I
found one which does the job and looks good enough. The other problem is
the missing "official" image of snappass. The Helm chart used a
customized image which I didn't want to use, so I had to build a brand
new image quickly. This new image is unfortunately not bound to any
repository or pipeline yet, which means that this change needs some
trust for the moment, until I've set up the needed repo and CI
structures.
Reference: https://github.com/lmacka/helm-snappass/tree/main
Reference: https://github.com/pinterest/snappass
Out of pure stupidity I had split up "Supported Platforms" and
"Required Software" without realising that these are actually entangled.
This change fixes that.
This change alters the README significantly. The overall goal was to
turn it into an easier-to-read document, since the previous version had
generally outgrown its initial layout. That alone should raise a flag,
since it could indicate a document that is too long. But I want to make
sure I understand every detail even after some time off.
This new approach targets that desire and improves the overall structure
so the document can be read from top to bottom, as I like it.
This change is huge, therefore I only sum up the most important changes:
* Improve spelling
* Reduce ambiguity
* Use OpenTofu instead of Terraform
* Document missing tags for Ansible
* Provide example-configuration
* Fix confusion between dotenv and direnv, I use direnv!
* Add section about required software
After I removed the automatic IP addition to the firewalls for SSH and
Kubernetes I ran into a problem only a few days later. My ISP changed
my IPs and I was too stupid to realize that immediately. So, this change
reintroduces the automatic addition of my current IPs to the whitelists
for Kubernetes and SSH. However, I adjusted the algorithm so it will not
change every day or so, but really only when my ISP changes my IPs.
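A rough sketch of the idea; the resource names and the IP lookup
endpoint are only placeholders, and the logic that keeps the list stable
between ISP changes is not shown here:

```hcl
# Detect the current public IP at plan time (placeholder endpoint).
data "http" "current_ip" {
  url = "https://api.ipify.org"
}

locals {
  my_ips = ["${chomp(data.http.current_ip.response_body)}/32"]
}

resource "hcloud_firewall" "k8s_api" {
  name = "k8s-api"

  rule {
    direction  = "in"
    protocol   = "tcp"
    port       = "6443"
    source_ips = local.my_ips
  }
}
```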
I renamed the project from "hetzner-infra" to "base-infra", since that
better fits the purpose of this repository. So, this change migrates the
state name accordingly, to avoid confusion.
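With an S3-style backend this is essentially a rename of the state key
plus a state migration; the names below are placeholders:

```hcl
terraform {
  backend "s3" {
    bucket = "my-tofu-state"      # placeholder bucket name
    key    = "base-infra.tfstate" # was "hetzner-infra.tfstate"
    # ... the endpoint and skip_* settings stay as they are
  }
}
```

After changing the key, a single `tofu init -migrate-state` copies the
existing state over to its new name.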
I plan to move over more base tasks to this repository, like maintaining
the keys for Backblaze. Therefore I adjusted the readme accordingly.
Furthermore I fixed the spelling in several places.
The definition was split into multiple settings, which made it
unnecessarily complicated to set up the definition for my Kubernetes
cluster. This new approach allows for granular definitions of servers
and agents and is also simpler for me to use.
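A sketch of the shape I have in mind; the attribute names and defaults
are illustrative, not the exact ones used in the repository:

```hcl
variable "k3s_nodes" {
  description = "Granular definition of all k3s servers and agents."
  type = map(object({
    role        = string # "server" or "agent"
    server_type = string # e.g. "cax21"
    location    = string # e.g. "nbg1"
  }))
  default = {
    "server-1" = { role = "server", server_type = "cax21", location = "nbg1" }
    "agent-1"  = { role = "agent", server_type = "cax11", location = "nbg1" }
  }
}
```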
I liked the idea of having these IPs dynamically detected at runtime,
but some research showed that my current provider only renews them every
180 days nowadays. So, there is no need for such a hyper-dynamic
solution. Instead I use a variable now, which brings some other
benefits, like being able to add arbitrary IPs as well. This might come
in handy for CI/CD.
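Sketched, the variable is as simple as this; the name and default are
illustrative:

```hcl
variable "allowed_ips" {
  description = "IPs whitelisted for SSH and the Kubernetes API."
  type        = list(string)
  default     = [] # e.g. ["203.0.113.7/32"], or additional CI/CD runner IPs
}
```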
This change makes it a bit easier for me to manage specific domains.
Note that in the long run these settings should _not_ belong in this
repository. Instead I'm going to maintain them in the projects where the
domain is more meaningful.
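For illustration, such a domain definition looks roughly like the
following, assuming the community hetznerdns provider; zone and record
values are placeholders:

```hcl
resource "hetznerdns_zone" "example" {
  name = "example.org"
  ttl  = 3600
}

resource "hetznerdns_record" "www" {
  zone_id = hetznerdns_zone.example.id
  name    = "www"
  type    = "A"
  value   = "203.0.113.7"
  ttl     = 300
}
```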
By applying this change the Kubernetes cluster gets a Gitea server set
up. Note that I use a custom image whose build I still have to automate
in the future. The customization is necessary since I use AsciiDoc very
often and the default Gitea doesn't render these files, so it becomes a
bit cumbersome to read them on the web.
The terraform state can be stored in Backblaze B2 with some
configuration. This change does exactly that. Note that this requires
the special env variables `AWS_SECRET_ACCESS_KEY` and
`AWS_ACCESS_KEY_ID`, which are normally part of an AWS setup. To be able
to use AWS and this setup in parallel I use direnv to maintain the
variables in the special file `.envrc`.
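The backend configuration looks roughly like this; bucket, key and
region are placeholders, and the credentials come from the env variables
mentioned above:

```hcl
terraform {
  backend "s3" {
    bucket = "my-tofu-state"         # placeholder
    key    = "hetzner-infra.tfstate" # placeholder
    region = "eu-central-003"        # the B2 region of the bucket

    endpoints = {
      s3 = "https://s3.eu-central-003.backblazeb2.com"
    }

    skip_credentials_validation = true
    skip_region_validation      = true
    skip_requesting_account_id  = true
    skip_metadata_api_check     = true
    skip_s3_checksum            = true
  }
}
```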
Reference: https://andrzejgor.ski/posts/backblaze_b2_tf_state
Reference: https://www.reddit.com/r/selfhosted/comments/1iv1qir
Reference: https://direnv.net/
I completely overlooked that I have to change the SSH port for all nodes
in the cluster, otherwise I cannot provide a meaningful load balancer
for the git-ssh port in it.
Additionally this allowed me to fix some config errors which I had
simply overlooked.
The previous tag setup still let Ansible gather facts for the roles in
question, even though they're not executed. This fix prevents that from
happening.
The playbook itself is written to be idempotent, so it doesn't hurt to
run all tasks many times. But it doesn't need to run all tasks all the
time, therefore you can limit the execution scope with the documented
tags to only affect certain tasks. This improves the performance a lot!
Since I don't have multiple terraform steps anymore, it simply doesn't
make sense to me to split all tasks into separate folders.
Instead I try to be as clear as possible in the README to make it easy
to follow the structure in the future without too much headache.
It simply doesn't make sense to split the installation of the
Kubernetes cluster from the provisioning of foundational services.
Therefore I drop the idea of organising these services in another
terraform setup and instead ensure their presence with Ansible, as it's
already responsible for setting up the cluster and keeping it up to
date.
It looks somewhat random that the SSH port was simply defined in the
configuration of the k3s setup. It looks somehow "configurable" although
it isn't. Therefore I moved this setting to the correct place in the
terraform setup.
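In the terraform setup this is now just a plain variable that the rest
of the configuration consumes; the name and default below are
illustrative:

```hcl
variable "ssh_port" {
  description = "SSH port for all cluster nodes; fixed once the cluster exists."
  type        = number
  default     = 2222 # illustrative only
}
```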
An important side note is that this change doesn't make it possible to
_change_ the SSH port, though. Once I have decided on a port I have to
stick to it until I tear down the cluster!
Navigating through a bunch of config files, each with just a few lines
in it, is cumbersome. This change moves all the configuration into a
centralized `config.ini`; that way it's easier for me to get a quick
overview of the setup. The `config.ini` acts as another inventory and is
therefore referenced as such by the ansible.cfg. The `inventory.ini`
(which is generated by terraform in the provisioning step) is not
affected by this change.
With this change we no longer use user-data scripts on the provisioned
machines. That makes it way easier for me to handle all the
configuration, since I only have to run Ansible. Furthermore this lifts
the burden of figuring out what may have gone wrong, since Ansible is
easier to debug than some arbitrary scripts which run at provisioning
time on the machines.
With this change I should also think about restructuring the code a bit
as well, since it's actually easier to provide the initial software
stack for the cluster via Ansible than via terraform, at least as far as
I can tell right now.
The only reason I even change the port is to make sure a Git client can
reach my upcoming Git servers on the standard SSH port. To achieve this
I only have to make sure that the port is reachable from the internet;
after that the port is routed through the Kubernetes network.
This means that my agents can keep using the standard port, which makes
everything easier for me :)
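On the infrastructure side that boils down to nothing more than allowing
inbound traffic on port 22. One way to express this is a firewall rule
like the following; whether it ends up as a firewall rule or a
load-balancer service is an implementation detail, and the resource name
is illustrative:

```hcl
resource "hcloud_firewall" "git_ssh" {
  name = "git-ssh"

  rule {
    direction  = "in"
    protocol   = "tcp"
    port       = "22"
    source_ips = ["0.0.0.0/0", "::/0"]
  }
}
```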