= Web Infra
:icons: font

This project is meant to set up my base infrastructure for the web, in particular my Kubernetes cluster as well as a base set of software (CI/CD, git-server, etc.) and access-keys.

To achieve this, the setup is split into 2 dedicated steps:

. Create static assets like machines for Kubernetes and access-keys via Terraform.
. Install/upgrade the Kubernetes-cluster and other software via Ansible.

== TL;DR

[source,bash]
----
vim .envrc config.auto.tfvars # Get the contents from the password-manager
direnv allow
terraform init
terraform apply
sleep 300 # Wait 5 minutes, since the machines sometimes start slowly
ansible-galaxy install -r requirements.yml
ansible-playbook site.yml
----

== Preparation

. Ensure `terraform` is installed
. Ensure `ansible` is installed

== Setup

The project is split into different steps, each responsible for a different task.

=== Terraform

I use Terraform to provide the infrastructure required to run a Kubernetes-cluster.

[WARNING]
Make sure `.envrc` and `config.auto.tfvars` are present.
Then run `direnv allow` in the directory to apply the `.envrc`. +
The files are safely stored in the password-manager.

[source,bash]
----
terraform init # <1>
terraform apply # <2>
----
<1> Initialize the Terraform modules if necessary
<2> Set up the infrastructure and create/update `inventory.ini`

[WARNING]
The setup takes longer than the `terraform apply` itself, since Terraform returns as soon as the machines are provisioned.
At that point the machines have not finished starting yet.
As a rule of thumb, wait about 5 minutes after the apply before doing any further work.

=== Ansible

Use Ansible to set up a k3s installation and provide a set of foundational services in the cluster.
The provided services are:

https://cert-manager.io/docs/installation/helm[cert-manager]::
This allows issuing TLS certificates.
The certificates are issued via https://letsencrypt.org[Let's Encrypt] and can be issued for the staging and production environments of Let's Encrypt.

https://about.gitea.com[gitea]::
My personal favourite git-server.

https://concourse-ci.org[concourse-ci]::
A powerful CI-service which I like to use to automate all kinds of workloads. +
TODO: Not set up yet!

https://github.com/pinterest/snappass[snappass]::
A secure and reliable tool to share passwords. +
TODO: Not set up yet!

[NOTE]
The k3s-setup requires an `inventory.ini`, which is created automatically by Terraform.
So make sure to apply the infra at least once before running these playbooks.

[source,bash]
----
ansible-galaxy install -r requirements.yml # <1>
ansible-playbook site.yml # <2>
----
<1> Install the Ansible collections required to create a k3s-cluster (can be omitted in subsequent runs)
<2> Install k3s and download the kube-config to `.kube/config`

[IMPORTANT]
The second step will overwrite any existing kube-config; this might destroy existing settings!

[NOTE]
--
To apply the playbook you may need to install additional packages:

* https://helm.sh/docs/intro/install/[helm]
* https://github.com/databus23/helm-diff?tab=readme-ov-file#install[helm-diff]
* python3-kubernetes (Debian/Ubuntu)
--

==== Configured tags

init:: Everything needed for the initial setup
add-server:: Everything needed to add a new https://docs.k3s.io/cli/server[server] to the cluster
add-agent:: Everything needed to add a new https://docs.k3s.io/cli/agent[agent] to the cluster
update:: Everything needed to update the cluster
config:: Everything needed to update the local kube-config
k8s:: Everything needed to provide the foundational services

[TIP]
The scope of the Ansible-playbook can be limited to the tags above with `--tags tag1,tag2`.

== Enlarge / Reduce size of cluster

Increase::
--
. Simply adjust the number of agents/servers in your `infra/config.auto.tfvars`.
. Then run the Ansible-playbook of k3s again.
--

Decrease::
--
If you want to shrink the cluster, **DO NOT** reduce the number of agents directly!
Instead, proceed as follows:

. Open k9s and go to `:nodes`
. Select the highest-numbered agent and press `r` to drain it
. After that has succeeded, delete it with `Ctrl-d`
. Finally, reduce the number of agents in Terraform and apply the change
--

== Responsibilities

Terraform::
* Creation of the network for the Kubernetes-cluster
** A public subnet exposed to the internet for the Kubernetes-servers
** A private subnet for the Kubernetes-agents
* Routing between the networks
* Firewall rules that block all traffic to the servers except:
** ping (protocol: `icmp`)
** Kubernetes API (usually port `6443`)
** ssh (I prefer a non-standard port, usually port `1022`)
** public services, e.g. http and https (ports `80` and `443`) but also git-ssh (port `22`)
* Creating the machines for the Kubernetes-servers in the public subnet
* Creating the machines for the Kubernetes-agents in the private subnet
* Creating DNS-records in Hetzner Cloud

Ansible::
* Setting up SSH-connections
* Setting up routing on all servers
* Installing k3s
* Keeping the software up-to-date
* Adding foundational services to the cluster
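
The firewall rules listed under Terraform above can be spot-checked from a machine outside the cluster.
The following is only a minimal sketch and not part of this repository: it assumes `bash` (for its `/dev/tcp` pseudo-device) and the coreutils `timeout` command, and both helper names (`port_open`, `check_server_ports`) as well as the host name `k8s.example.org` are hypothetical.
The port numbers are the "usual" ones named in the list; adjust them if your configuration differs.

[source,bash]
----
#!/usr/bin/env bash
# Hypothetical helpers, not part of this repository.

# Returns 0 if a TCP connection to host:port succeeds within the timeout.
port_open() {
  local host="$1" port="$2" wait_secs="${3:-3}"
  # Inside the child shell, $0 is the host and $1 is the port.
  timeout "$wait_secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Probes the TCP ports that the firewall is expected to leave open
# and returns non-zero if at least one of them is not reachable.
check_server_ports() {
  local host="$1" port failed=0
  for port in 22 80 443 1022 6443; do
    if port_open "$host" "$port"; then
      echo "open   ${port}"
    else
      echo "closed ${port}"
      failed=1
    fi
  done
  return "$failed"
}

# Example call (the host name is an assumption):
# check_server_ports k8s.example.org
----

The sketch only confirms that the allowed TCP ports are reachable; it checks neither ping (`icmp`) nor that all other ports are actually blocked.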