On a single-node cluster, wireguard-native provides no encryption benefit (no node-to-node traffic to encrypt) but reduces pod MTU from 1500 to 1420, causing cascading MTU issues in DinD, wg-easy, and any nested tunnel. host-gw uses direct routing with no encapsulation, giving pods full 1500 MTU.
Hetzner K3s Cluster
This repository contains the full Kubernetes infrastructure deployment on the Hetzner Cloud.
First Time Setup
- Install hcloud, the Hetzner Cloud CLI.
- Restore the xarif.de AGE key from KeePass to "$HOME/.sops/key.txt" if not already present.
- Create a Hetzner Cloud project at https://console.hetzner.cloud/projects
- Store the Hetzner Cloud project's API token with Mozilla SOPS in group_vars/all.sops.yml
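The setup steps above can be checked before the first run with a small preflight script. This is only a sketch: the function names are made up for illustration, and SOPS_AGE_KEY_FILE is the variable sops itself reads for an age key file. Whether run-playbook.sh honors that variable is an assumption, so the default stays at the path named above.

```shell
#!/bin/sh
# Preflight sketch: report which first-time-setup prerequisites are missing.
# Function names are illustrative and not part of this repository.

# Resolve the AGE key path: the documented default, optionally overridden.
age_key_file() {
  printf '%s\n' "${SOPS_AGE_KEY_FILE:-$HOME/.sops/key.txt}"
}

# Print one line per missing prerequisite; return non-zero if any is missing.
check_prereqs() {
  missing=0
  if ! command -v hcloud >/dev/null 2>&1; then
    echo "missing: hcloud CLI"
    missing=1
  fi
  if [ ! -f "$(age_key_file)" ]; then
    echo "missing: AGE key at $(age_key_file)"
    missing=1
  fi
  return "$missing"
}

check_prereqs || echo "Fix the items above before running ./run-playbook.sh"
```

Once both checks pass, the encrypted variables file can be opened for editing with sops and the file path from the last step.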
Execution
./run-playbook.sh
- This runs the Ansible playbook in this repo, which performs the whole infrastructure deployment.
- If not already set up, the playbook may ask you to provide access to your Hetzner Storage Box by adding two SSH public keys to the storage box's ~/.ssh/authorized_keys file.
- You can do that as follows:
  - sftp <username>@<username>.your-storagebox.de to your storage box
  - cd .ssh to the ssh directory of your storage box
  - get authorized_keys to download the authorized_keys file
  - edit the file and add the keys mentioned by the playbook
  - put authorized_keys to upload the file back to the storage box
  - finally, remove the downloaded file
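The "edit the file" step above is easy to get wrong by pasting a key twice. A minimal sketch of an idempotent helper follows. The function name add_key_once and the example key strings are hypothetical, and the sftp steps stay interactive as described above.

```shell
#!/bin/sh
# add_key_once FILE KEY
# Append KEY to FILE unless an identical line is already present.
add_key_once() {
  file=$1
  key=$2
  touch "$file"
  # If the file does not end with a newline, add one so the key gets its own line.
  if [ -n "$(tail -c 1 "$file")" ]; then
    echo >> "$file"
  fi
  # -x: match whole lines, -F: treat the key as a fixed string, -q: quiet.
  if ! grep -qxF -- "$key" "$file"; then
    printf '%s\n' "$key" >> "$file"
  fi
}

# Usage, after "get authorized_keys" in the sftp session (keys are placeholders;
# use the two keys printed by the playbook):
#   add_key_once authorized_keys "ssh-ed25519 <first key from playbook>"
#   add_key_once authorized_keys "ssh-ed25519 <second key from playbook>"
```

Running the helper again with the same key leaves the file unchanged, so it is safe to repeat before the put step.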
Complete Server restore
- Deploy Argo CD: https://gitlab.com/xarif/kubernetes/argocd
- Deploy Applications: https://gitlab.com/xarif/kubernetes/argocd-apps
- Restore Application Data: https://gitlab.com/xarif/kubernetes/k8up
Troubleshooting
- The DNS record at the domain provider must point to a worker node IP, not to the master.
- Let's Encrypt certificates take some time to be issued after a server restart. The first attempt to open a service in the browser may show a self-signed certificate from Traefik.
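To tell whether a service is still serving the temporary certificate, the issuer can be inspected from the command line instead of the browser. The following is a sketch only: the function names and the host name service.example.org are placeholders, and it assumes openssl is installed locally.

```shell
#!/bin/sh
# fetch_issuer HOST
# Print the issuer line of the certificate served on HOST:443.
fetch_issuer() {
  openssl s_client -connect "$1:443" -servername "$1" </dev/null 2>/dev/null |
    openssl x509 -noout -issuer
}

# issuer_is_letsencrypt ISSUER_LINE
# Succeed if the issuer line names Let's Encrypt.
issuer_is_letsencrypt() {
  case $1 in
    *"Let's Encrypt"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (host name is a placeholder):
#   if issuer_is_letsencrypt "$(fetch_issuer service.example.org)"; then
#     echo "Let's Encrypt certificate is in place"
#   else
#     echo "still serving another certificate, retry in a few minutes"
#   fi
```

For the DNS point, dig +short with the domain name prints the address the record currently resolves to, which can be compared against the worker node IP.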
Under the hood
- run-playbook.sh first builds a Docker image from docker-ansible, which contains all requirements to run the playbook.
- It then runs the playbook in a container of this image, passing the local Kubernetes config, SSH config and SOPS config into the container.
- The SOPS config is required to decrypt group_vars/all.sops.yaml during the playbook run, which is done automatically by the installed community.sops collection.
- The hcloud tasks in the playbook don't require the installation of an additional collection. The hcloud collection is part of Ansible by default.
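The build-then-run flow described above can be pictured roughly as follows. This is not the content of run-playbook.sh. The image tag, the mount targets inside the container and the working directory are all assumptions chosen for illustration, and the sketch only prints the commands rather than executing them.

```shell
#!/bin/sh
# Dry-run sketch of the two steps: build the image, then run the playbook in it.
IMAGE=docker-ansible-sketch   # hypothetical image tag

# Print the docker commands instead of running them.
print_commands() {
  # Step 1: build the image from the docker-ansible directory.
  echo "docker build -t $IMAGE docker-ansible"
  # Step 2: run the playbook, passing Kubernetes, SSH and SOPS config
  # into the container (container paths are assumed, not taken from the repo).
  echo "docker run --rm -it" \
    "-v $HOME/.kube:/root/.kube" \
    "-v $HOME/.ssh:/root/.ssh:ro" \
    "-v $HOME/.sops:/root/.sops:ro" \
    "-v $PWD:/work -w /work" \
    "$IMAGE ansible-playbook site.yml"
}

print_commands
```

Mounting the SSH and SOPS directories read-only is a design choice of this sketch: the container only needs to read the keys, not change them.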
TODO
- mirror external resources
- hetzner dns collection
- k3s ansible role