HJK Talos Cluster Setup
Welcome to the documentation for the setup and operation of a single-node Kubernetes cluster based on:
- Talos Linux
- Cilium (CNI)
- Rook Ceph (single-node storage for test and development)
- OS2AI / AarhusAI
- Access from Ubuntu WSL2 on developer workstations
The purpose of this documentation is to provide a consistent, reproducible, and transparent approach to installing, operating, and troubleshooting the cluster, both for current and future team members.
This documentation describes the how and why of the platform.
The actual infrastructure-as-code (IaC) implementation lives in a separate repository.
Documentation structure
Below is an overview of the main documentation sections grouped by responsibility.
Cluster Foundation
Platform Management (GitOps & Control Plane)
Applications (AI Platform Components)
Operations
1. Environment preparation
Hardware requirements, networking, static IP addressing, WSL2 setup, and required tooling.
Chapter: 01 – Environment
2. Talos installation and cluster bootstrap
Booting from ISO, generating cluster configuration, applying patches, and bootstrapping the control plane.
Chapter: 02 – Bootstrap
3. Cilium installation
Installing Cilium as the Kubernetes CNI using Helm, validating networking, and troubleshooting datapath issues.
Chapter: 03 – Cilium
4. Rook Ceph installation
Deploying Rook Ceph in a single-node configuration, configuring StorageClasses, and understanding limitations.
Chapter: 04 – Rook Ceph
5. Cluster access from WSL2
Accessing Talos and Kubernetes from Windows via Ubuntu WSL2, including kubeconfig handling and networking considerations.
Chapter: 05 – WSL Access
6. NVIDIA GPU enablement
Enabling NVIDIA GPUs on Talos, including drivers, container runtime, RuntimeClass, and device plugin.
Chapter: 06 – NVIDIA GPU
7. Argo CD (GitOps)
Bootstrapping GitOps with Argo CD and the app-of-apps pattern (Argo CD Resources), including authentication to private Git repositories.
Chapter: 07 – Argo CD (GitOps)
8. Observability (Prometheus, Grafana, Loki, Tempo)
Deploying and operating the cluster observability stack, including Grafana authentication and datasource wiring for logs and traces.
Chapter: 08 – Observability
9. Sealed Secrets
Managing application secrets securely using Sealed Secrets in a GitOps workflow.
Chapter: 09 – Sealed Secrets
10. vLLM
Deploying GPU-accelerated inference and embedding workloads using vLLM.
Chapter: 10 – vLLM
11. LiteLLM
Deploying LiteLLM as an OpenAI-compatible proxy in front of model backends, including database persistence and guardrails.
Chapter: 11 – LiteLLM
12. Open WebUI
Deploying Open WebUI as the user-facing chat interface, integrated with LiteLLM, vLLM, RAG, and persistence services.
Chapter: 12 – Open WebUI
13. Upgrading applications
How to upgrade platform components and applications using the vendor submodule and local overrides.
Chapter: 13 – Upgrades
16. Talos & Kubernetes Upgrades
Procedure for upgrading Talos OS, talosctl client, and Kubernetes version including NVIDIA system extensions.
Chapter: 16 – Talos & Kubernetes Upgrades
15. Troubleshooting and FAQ
Common failure scenarios related to Talos, Kubernetes, Cilium, Rook Ceph, WSL2, and recovery procedures.
Chapter: 90 – Troubleshooting
Scope and intent
This documentation supports:
- Operation and maintenance of a local single-node Kubernetes cluster for AI workloads
- Internal OS2 projects including OS2AI
- Reproducible infrastructure based on scripted workflows and declarative configuration
- A clear separation between:
- Test / development environments (single-node)
- Future production-grade platforms (multi-node, HA)
Target audience
This documentation is intended for:
- IT operations staff
- System administrators
- Developers responsible for OS2AI and related platforms
- Future municipal technical operations teams
Prerequisites
Readers are expected to have basic knowledge of:
- Linux and WSL2
- Kubernetes fundamentals
- YAML configuration files
- Basic networking concepts
- Helm-based application deployment
Talos-specific concepts and workflows are explained where relevant throughout the documentation.