Multi-Node Installation
Overview
This guide describes the installation of the AgileTV CDN Manager across multiple nodes for production deployments. This configuration provides high availability and horizontal scaling capabilities.
Air-Gapped Deployment? This guide assumes internet connectivity. For air-gapped deployments, see the Air-Gapped Deployment Guide for additional requirements and procedures.
Prerequisites
Hardware Requirements
Refer to the System Requirements Guide for hardware specifications. Production deployments require:
- Minimum 3 Server nodes (Control Plane Only or Combined role)
- Optional Agent nodes for additional workload capacity
Operating System
Refer to the System Requirements Guide for supported operating systems.
Software Access
- Installation ISO: esb3027-acd-manager-X.Y.Z.iso (required on each node)
- Extras ISO (air-gapped only): esb3027-acd-manager-extras-X.Y.Z.iso
Network Configuration
Ensure that required firewall ports are configured between all nodes before installation. See the Networking Guide for complete firewall configuration requirements.
Multiple Network Interfaces
If your nodes have multiple network interfaces and you want to use a separate interface for cluster traffic (not the default route interface), configure the INSTALL_K3S_EXEC environment variable before installing the cluster or joining nodes.
For example, if bond0 has the default route but you want cluster traffic on bond1:
# For server nodes
export INSTALL_K3S_EXEC="server --node-ip 10.0.0.10 --flannel-iface=bond1"
# For agent nodes
export INSTALL_K3S_EXEC="agent --node-ip 10.0.0.20 --flannel-iface=bond1"
Where:
- Mode: Use server for the primary node establishing the cluster or for additional server nodes; use agent for agent nodes joining the cluster.
- --node-ip: The IP address of the interface to use for cluster traffic.
- --flannel-iface: The network interface name for Flannel VXLAN overlay traffic.
Set this variable on each node before running the install or join scripts.
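Before exporting the variable, it can help to confirm that the chosen interface actually exists and carries the intended address. A minimal sketch, reusing the bond1/10.0.0.10 example above (substitute your own values):

```shell
# Hypothetical values from the example above; substitute your own.
iface="bond1"
node_ip="10.0.0.10"

# Only export INSTALL_K3S_EXEC if the interface exists and carries the IP.
if ip -br addr show "$iface" 2>/dev/null | grep -q "$node_ip"; then
  export INSTALL_K3S_EXEC="server --node-ip $node_ip --flannel-iface=$iface"
  echo "INSTALL_K3S_EXEC set for $iface"
else
  echo "Interface $iface does not carry $node_ip; check the network setup" >&2
fi
```

This avoids a common failure mode where a typo in the interface name only surfaces later as Flannel connectivity errors between nodes.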
SELinux
If SELinux is to be used, it must be set to “Enforcing” mode before running the installer script. The installer will configure appropriate SELinux policies automatically. SELinux cannot be enabled after installation.
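Since SELinux cannot be enabled after installation, it is worth checking the mode before running the installer. A minimal sketch; getenforce is part of the SELinux userland and may be absent on hosts without SELinux:

```shell
# Report the current SELinux mode; expected value before install: Enforcing.
if command -v getenforce >/dev/null 2>&1; then
  mode=$(getenforce)
  echo "SELinux mode: $mode"
else
  mode="unavailable"
  echo "getenforce not found; SELinux tooling is not installed" >&2
fi
```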
Installation Steps
Step 1: Prepare the Primary Server Node
Mount the installation ISO on the primary server node:
mkdir -p /mnt/esb3027
mount -o loop,ro esb3027-acd-manager-X.Y.Z.iso /mnt/esb3027
Replace X.Y.Z with the actual version number.
Step 2: Install the Base Cluster on Primary Server
If your node has multiple network interfaces and you need to specify a separate interface for cluster traffic, set the INSTALL_K3S_EXEC environment variable before running the installer (see Multiple Network Interfaces):
export INSTALL_K3S_EXEC="server --node-ip <node-ip> --flannel-iface=<interface>"
Run the installer to set up the K3s Kubernetes cluster:
/mnt/esb3027/install
This installs:
- K3s Kubernetes distribution
- Longhorn distributed storage
- CloudNativePG operator for PostgreSQL
- Base system dependencies
Important: After the installer completes, verify that all system pods in both namespaces are in the Running state before proceeding:
# Check kube-system namespace (Kubernetes core components)
kubectl get pods -n kube-system
# Check longhorn-system namespace (distributed storage)
kubectl get pods -n longhorn-system
All pods should show Running status. If any pods are still Pending or ContainerCreating, wait until they are ready. Proceeding with incomplete system pods can cause subsequent steps to fail in unpredictable ways.
This verification confirms:
- K3s cluster is operational
- Longhorn distributed storage is running
- CloudNativePG operator is deployed
- All core components are healthy before continuing
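The readiness check above can be scripted with kubectl wait instead of polling kubectl get pods by hand. A sketch, assuming kubectl on the node is configured against the new cluster and is recent enough to support --field-selector on wait (v1.23+); Completed helper pods (such as k3s helm-install jobs) never report Ready, so they are excluded:

```shell
# Namespaces whose pods must all be Ready before continuing.
namespaces="kube-system longhorn-system"

# Guard so the sketch is a no-op where kubectl or a cluster is unavailable.
if command -v kubectl >/dev/null 2>&1 && kubectl cluster-info >/dev/null 2>&1; then
  for ns in $namespaces; do
    # Wait up to 10 minutes; skip Succeeded (Completed) pods, which never
    # reach the Ready condition.
    kubectl wait --for=condition=Ready pods --all -n "$ns" \
      --field-selector=status.phase!=Succeeded --timeout=600s
  done
else
  echo "kubectl/cluster unavailable; run this on a cluster node" >&2
fi
```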
Step 3: Retrieve the Node Token
Retrieve the node token for joining additional nodes:
cat /var/lib/rancher/k3s/server/node-token
Save this token for use on additional nodes. Also note the IP address of the primary server node.
Step 4: Server vs Agent Node Roles
Before joining additional nodes, determine which nodes will serve as Server nodes vs Agent nodes:
| Role | Control Plane | Workloads | HA Quorum | Use Case |
|---|---|---|---|---|
| Server Node (Combined) | Yes (etcd, API server) | Yes | Participates | Default production role; minimum 3 nodes |
| Server Node (Control Plane Only) | Yes (etcd, API server) | No | Participates | Dedicated control plane; requires separate Agent nodes |
| Agent Node | No | Yes | No | Additional workload capacity only |
Guidance:
- Combined role (default): Server nodes run both control plane and workloads; minimum 3 nodes required for HA
- Control Plane Only: Dedicate nodes to control plane functions; requires at least 3 Server nodes plus 3+ Agent nodes for workloads
- Agent nodes are required if using Control Plane Only servers; optional if using Combined role servers
- For most deployments, 3 Server nodes (Combined role) with no Agent nodes is sufficient
- Add Agent nodes to scale workload capacity without affecting control plane quorum
Proceed to Step 5 to join Server nodes. Agent nodes are joined after all Server nodes are ready.
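The 3-node minimum follows from majority quorum in the embedded etcd datastore: with n server nodes, quorum is floor(n/2) + 1, so the cluster tolerates n minus quorum server failures. A quick illustration:

```shell
# Quorum arithmetic for odd server-node counts.
for n in 1 3 5; do
  quorum=$(( n / 2 + 1 ))
  tolerated=$(( n - quorum ))
  echo "servers=$n quorum=$quorum tolerated_failures=$tolerated"
done
# servers=1 quorum=1 tolerated_failures=0
# servers=3 quorum=2 tolerated_failures=1
# servers=5 quorum=3 tolerated_failures=2
```

Even server counts add no extra tolerance (4 servers still tolerate only 1 failure), which is why odd counts are used.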
Step 5: Join Additional Server Nodes
On each additional server node:
Mount the ISO:
mkdir -p /mnt/esb3027
mount -o loop,ro esb3027-acd-manager-X.Y.Z.iso /mnt/esb3027
Join the cluster:
If your node has multiple network interfaces, set the INSTALL_K3S_EXEC environment variable with the server mode before running the join script (see Multiple Network Interfaces):
export INSTALL_K3S_EXEC="server --node-ip <node-ip> --flannel-iface=<interface>"
Run the join script:
/mnt/esb3027/join-server https://<primary-server-ip>:6443 <node-token>
Replace <primary-server-ip> with the IP address of the primary server and <node-token> with the token retrieved in Step 3.
- Verify the node joined successfully:
kubectl get nodes
Repeat for each server node. A minimum of 3 server nodes is required for high availability.
Step 5b: Taint Control Plane Only Nodes (Optional)
If you are using dedicated Control Plane Only nodes (not Combined role), apply taints to prevent workload scheduling:
kubectl taint nodes <node-name> CriticalAddonsOnly=true:NoSchedule
Apply this taint to each Control Plane Only node. Verify taints are applied:
kubectl describe nodes | grep -A 5 "Taints"
Note: This step is only required if you want dedicated control plane nodes. For Combined role deployments, do not apply taints.
Important: Control Plane Only Server nodes can be deployed with lower hardware specifications (2 cores, 4 GiB RAM, 64 GiB disk) than the installer's default minimum requirements. If your Control Plane Only Server nodes do not meet the Single-Node Lab configuration minimums (8 cores, 16 GiB RAM, 128 GiB disk), you must set the SKIP_REQUIREMENTS_CHECK environment variable before running the installer or join command:
# For the primary server node
export SKIP_REQUIREMENTS_CHECK=1
/mnt/esb3027/install
# For additional Control Plane Only Server nodes
export SKIP_REQUIREMENTS_CHECK=1
/mnt/esb3027/join-server https://<primary-server-ip>:6443 <node-token>
Note: This applies to Server nodes only. Agent nodes have separate minimum requirements.
Step 6: Join Agent Nodes (Optional)
On each agent node:
Mount the ISO:
mkdir -p /mnt/esb3027
mount -o loop,ro esb3027-acd-manager-X.Y.Z.iso /mnt/esb3027
Join the cluster as an agent:
If your node has multiple network interfaces, set the INSTALL_K3S_EXEC environment variable with the agent mode before running the join script (see Multiple Network Interfaces):
export INSTALL_K3S_EXEC="agent --node-ip <node-ip> --flannel-iface=<interface>"
Run the join script:
/mnt/esb3027/join-agent https://<primary-server-ip>:6443 <node-token>
- Verify the node joined successfully:
kubectl get nodes
Agent nodes provide additional workload capacity but do not participate in the control plane quorum.
Step 7: Verify Cluster Status
After all nodes are joined, verify the cluster is operational:
1. Verify all nodes are ready:
kubectl get nodes
Expected output:
NAME STATUS ROLES AGE VERSION
k3s-server-0 Ready control-plane,etcd,master 5m v1.33.4+k3s1
k3s-server-1 Ready control-plane,etcd,master 3m v1.33.4+k3s1
k3s-server-2 Ready control-plane,etcd,master 2m v1.33.4+k3s1
k3s-agent-1 Ready <none> 1m v1.33.4+k3s1
k3s-agent-2 Ready <none> 1m v1.33.4+k3s1
2. Verify system pods in both namespaces are running:
# Check kube-system namespace (Kubernetes core components)
kubectl get pods -n kube-system
# Check longhorn-system namespace (distributed storage)
kubectl get pods -n longhorn-system
All pods should show Running status. If any pods are still Pending or ContainerCreating, wait until they are ready.
This verification confirms:
- K3s cluster is operational across all nodes
- Longhorn distributed storage is running
- CloudNativePG operator is deployed
- All core components are healthy before proceeding to application deployment
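The node check in step 1 can be reduced to a one-line count. The sketch below runs the filter against a copy of the sample output so the expected count is visible; in practice, pipe kubectl get nodes --no-headers straight into it:

```shell
# Sample lines in the shape of `kubectl get nodes --no-headers` output.
sample='k3s-server-0 Ready control-plane,etcd,master 5m v1.33.4+k3s1
k3s-server-1 Ready control-plane,etcd,master 3m v1.33.4+k3s1
k3s-server-2 Ready control-plane,etcd,master 2m v1.33.4+k3s1
k3s-agent-1 NotReady <none> 1m v1.33.4+k3s1'

# Count nodes whose STATUS column is exactly "Ready".
ready=$(printf '%s\n' "$sample" | awk '$2 == "Ready"' | wc -l)
echo "ready nodes: $ready"

# Live-cluster equivalent:
#   kubectl get nodes --no-headers | awk '$2 == "Ready"' | wc -l
```

Compare the count against the total number of nodes you joined before moving on.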
Step 9: Air-Gapped Deployments (If Applicable)
If deploying in an air-gapped environment, on each node:
mkdir -p /mnt/esb3027-extras
mount -o loop,ro esb3027-acd-manager-extras-X.Y.Z.iso /mnt/esb3027-extras
/mnt/esb3027-extras/load-images
Step 10: Create Configuration File
Create a Helm values file for your deployment. At minimum, configure the manager hostnames, Zitadel external domain, and at least one router:
# ~/values.yaml
global:
  hosts:
    manager:
      - host: manager.example.com
      - host: manager-backup.example.com
  routers:
    - name: director-1
      address: 192.0.2.1
    - name: director-2
      address: 192.0.2.2
zitadel:
  zitadel:
    ExternalDomain: manager.example.com
Tip: A complete default values.yaml file is available on the installation ISO at /mnt/esb3027/values.yaml. Copy this file to use as a starting point for your configuration.
Important: The zitadel.zitadel.ExternalDomain must match the first entry in global.hosts.manager or authentication will fail due to CORS policy violations.
Important: For multi-node deployments, Kafka replication is enabled by default with 3 replicas. Do not modify the kafka.replicaCount or kafka.controller.replicaCount settings unless you understand the implications for data durability.
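The ExternalDomain constraint lends itself to an automated sanity check before deployment. A sketch that verifies it against an inline copy of the sample values; in practice, point the awk commands at your real ~/values.yaml. The field-based matching assumes the simple layout shown above, and a YAML-aware tool such as yq is more robust if available:

```shell
# Write an inline copy of the sample values to a temp file for illustration.
values=$(mktemp)
cat > "$values" <<'EOF'
global:
  hosts:
    manager:
      - host: manager.example.com
      - host: manager-backup.example.com
zitadel:
  zitadel:
    ExternalDomain: manager.example.com
EOF

# First global.hosts.manager entry and the Zitadel ExternalDomain.
first_host=$(awk '$1 == "-" && $2 == "host:" {print $3; exit}' "$values")
ext_domain=$(awk '$1 == "ExternalDomain:" {print $2; exit}' "$values")

if [ "$first_host" = "$ext_domain" ]; then
  echo "OK: ExternalDomain matches first manager host ($first_host)"
else
  echo "Mismatch: '$first_host' vs ExternalDomain '$ext_domain'" >&2
fi
rm -f "$values"
```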
Step 11: Load MaxMind GeoIP Databases (Optional)
If you plan to use GeoIP-based routing or validation features, load the MaxMind GeoIP databases. The following databases are used by the manager:
- GeoIP2-City.mmdb - The City Database
- GeoLite2-ASN.mmdb - The ASN Database
- GeoIP2-Anonymous-IP.mmdb - The VPN and Anonymous IP Database
A helper utility is provided on the ISO to create the Kubernetes volume:
/mnt/esb3027/generate-maxmind-volume
The utility will prompt for the locations of the three database files and the name of the volume. After running this command, reference the volume in your configuration file:
manager:
  maxmindDbVolume: maxmind-db-volume
Replace maxmind-db-volume with the volume name you specified when running the utility.
Tip: When naming the volume, include a revision number or date (e.g., maxmind-db-volume-2026-04 or maxmind-db-volume-v2). This simplifies future updates: create a new volume with an updated name, update the values.yaml to reference the new volume, and delete the old volume after verification.
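A dated name of the kind the tip suggests can be generated at volume-creation time; a trivial sketch:

```shell
# Build a volume name carrying the current year and month, per the tip above.
volume_name="maxmind-db-volume-$(date +%Y-%m)"
echo "$volume_name"
# Supply this name when prompted by /mnt/esb3027/generate-maxmind-volume,
# and set manager.maxmindDbVolume to the same value in values.yaml.
```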
Step 12: Configure TLS Certificates (Optional)
For production deployments, configure a valid TLS certificate from a trusted Certificate Authority (CA). A self-signed certificate is deployed by default if no certificate is provided.
Method 1: Create TLS Secret Manually
Create a Kubernetes TLS secret with your certificate and key:
kubectl create secret tls acd-manager-tls --cert=tls.crt --key=tls.key
Method 2: Helm-Managed Secret
Add the certificate directly to your values.yaml:
ingress:
  secrets:
    acd-manager-tls: |
      -----BEGIN CERTIFICATE-----
      ...
      -----END CERTIFICATE-----
  tls:
    - hosts:
        - manager.example.com
      secretName: acd-manager-tls
Configuring All Ingress Controllers
All ingress controllers must be configured with the same certificate secret and hostname:
ingress:
  hostname: manager.example.com
  tls: true
  secretName: acd-manager-tls
zitadel:
  ingress:
    tls:
      - hosts:
          - manager.example.com
        secretName: acd-manager-tls
confd:
  ingress:
    hostname: manager.example.com
    tls: true
    secretName: acd-manager-tls
mib-frontend:
  ingress:
    hostname: manager.example.com
    tls: true
    secretName: acd-manager-tls
Important: The hostname must match the first entry in global.hosts.manager for Zitadel CORS compatibility. The secret name has a maximum length of 53 characters.
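Before creating the secret with either method, it is worth confirming that tls.crt and tls.key actually belong together, since a mismatched pair is a common cause of TLS handshake failures. A sketch using openssl; a throwaway self-signed pair is generated here for illustration, so point the comparison commands at your real files:

```shell
# Work in a scratch directory; the generated pair stands in for your
# real tls.crt / tls.key.
dir=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=manager.example.com" \
  -keyout "$dir/tls.key" -out "$dir/tls.crt" 2>/dev/null

# A certificate and key belong together iff their public keys are identical.
cert_pub=$(openssl x509 -in "$dir/tls.crt" -pubkey -noout)
key_pub=$(openssl pkey -in "$dir/tls.key" -pubout 2>/dev/null)
if [ "$cert_pub" = "$key_pub" ]; then
  echo "certificate and key match"
else
  echo "certificate and key DO NOT match" >&2
fi
rm -rf "$dir"
```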
Step 13: Deploy the Manager Helm Chart
Deploy the CDN Manager application:
helm install acd-manager /mnt/esb3027/helm/charts/acd-manager --values ~/values.yaml
Note: By default, helm install runs silently until completion. To see real-time output during deployment, add the --debug flag:
helm install acd-manager /mnt/esb3027/helm/charts/acd-manager --values ~/values.yaml --debug
Tip: For better organization, split your configuration into multiple files and specify them with repeated --values flags:
helm install acd-manager /mnt/esb3027/helm/charts/acd-manager \
--values ~/values-base.yaml \
--values ~/values-tls.yaml \
--values ~/values-autoscaling.yaml
Later files override earlier files, allowing you to maintain a base configuration with environment-specific overrides.
Monitor the deployment progress:
kubectl get pods
Wait for all pods to show Running status before proceeding.
Note: The default Helm timeout is 5 minutes. If the installation fails due to a rollout timeout, retry with a larger timeout value:
helm install acd-manager /mnt/esb3027/helm/charts/acd-manager --values ~/values.yaml --timeout 10m
If a previous installation attempt failed and you receive an error that the release name is already in use, uninstall the previous release before retrying:
helm uninstall acd-manager
helm install acd-manager /mnt/esb3027/helm/charts/acd-manager --values ~/values.yaml
Step 14: Verify Deployment
Verify all application pods are running:
kubectl get pods
Note: During the initial deployment, several pods may enter a CrashLoopBackOff state depending on the timing of other containers starting up. This is expected behavior, as some services wait for dependencies (such as databases or Kafka) to become available. The deployment should stabilize automatically after a few minutes.
Verify pods are distributed across nodes:
kubectl get pods -o wide
Expected output for a 3-node cluster (pod names will vary; the -o wide listing additionally shows IP and NODE columns, omitted here for brevity):
NAME READY STATUS RESTARTS AGE
acd-cluster-postgresql-1 1/1 Running 0 11m
acd-cluster-postgresql-2 1/1 Running 0 11m
acd-cluster-postgresql-3 1/1 Running 0 10m
acd-manager-5b98d569d9-2pbph 1/1 Running 0 3m
acd-manager-5b98d569d9-m54f9 1/1 Running 0 3m
acd-manager-5b98d569d9-pq26f 1/1 Running 0 3m
acd-manager-confd-6fb78548c4-xnrh4 1/1 Running 0 3m
acd-manager-gateway-8bc8446fc-chs26 1/1 Running 0 3m
acd-manager-gateway-8bc8446fc-wzrml 1/1 Running 0 3m
acd-manager-kafka-controller-0 2/2 Running 0 3m
acd-manager-kafka-controller-1 2/2 Running 0 3m
acd-manager-kafka-controller-2 2/2 Running 0 3m
acd-manager-metrics-aggregator-76d96c4964-lwdcj 1/1 Running 2 3m
acd-manager-mib-frontend-7bdb69684b-6qxn8 1/1 Running 0 3m
acd-manager-mib-frontend-7bdb69684b-pkjrw 1/1 Running 0 3m
acd-manager-redis-master-0 2/2 Running 0 3m
acd-manager-redis-replicas-0 2/2 Running 0 3m
acd-manager-selection-input-5fb694b857-qxt67 1/1 Running 2 3m
acd-manager-zitadel-8448b4c4fc-2pkd8 1/1 Running 0 3m
acd-manager-zitadel-8448b4c4fc-vchp9 1/1 Running 0 3m
acd-manager-zitadel-init-hh6j7 0/1 Completed 0 4m
acd-manager-zitadel-setup-nwp8k 0/2 Completed 0 4m
alertmanager-0 1/1 Running 0 3m
grafana-6d948cfdc6-77ggk 1/1 Running 0 3m
telegraf-54779f5f46-2jfj5 1/1 Running 0 3m
victoria-metrics-agent-dc87df588-tn8wv 1/1 Running 0 3m
victoria-metrics-alert-757c44c58f-kk9lp 1/1 Running 0 3m
victoria-metrics-longterm-server-0 1/1 Running 0 3m
victoria-metrics-server-0 1/1 Running 0 3m
Note: Init pods (such as zitadel-init and zitadel-setup) will show Completed status after successful initialization. This is expected behavior. Some pods may show restart counts as they wait for dependencies to become available.
Step 15: Configure DNS (Optional)
Add DNS records for the manager hostname. For high availability, configure multiple A records pointing to different server nodes:
manager.example.com. IN A <server-1-ip>
manager.example.com. IN A <server-2-ip>
manager.example.com. IN A <server-3-ip>
Alternatively, configure a load balancer to distribute traffic across nodes.
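The record set above can be generated mechanically from a list of node addresses. A trivial sketch using documentation-range example IPs; substitute your real server node addresses:

```shell
# Round-robin A records for three server nodes (example addresses).
host="manager.example.com."
records=$(for ip in 192.0.2.10 192.0.2.11 192.0.2.12; do
  printf '%s IN A %s\n' "$host" "$ip"
done)
echo "$records"
```

Note that plain round-robin DNS does not health-check nodes; a load balancer is preferable where failover time matters.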
Post-Installation
After installation completes, proceed to the Next Steps guide for:
- Initial user configuration
- Accessing the web interfaces
- Configuring authentication
- Setting up monitoring
Accessing the System
Refer to the Accessing the System section in the Getting Started guide for service URLs and default credentials.
Note: A self-signed SSL certificate is deployed by default. For production deployments, configure a valid SSL certificate before exposing the system to users.
High Availability Considerations
Pod Distribution
The Helm chart configures pod anti-affinity rules to ensure:
- Kafka controllers are scheduled on separate nodes
- PostgreSQL cluster members are distributed across nodes
- Application pods are spread across available nodes
Data Replication and Failure Tolerance
For detailed information on data replication strategies and failure scenario tolerance, refer to the Architecture Guide and System Requirements Guide.
Troubleshooting
If pods fail to start or nodes fail to join:
- Check node status: kubectl get nodes
- Describe problematic pods: kubectl describe pod <pod-name>
- Review logs: kubectl logs <pod-name>
- Check cluster events: kubectl get events --sort-by='.lastTimestamp'
See the Troubleshooting Guide for additional assistance.
Next Steps
After successful installation:
- Next Steps Guide - Post-installation configuration
- Configuration Guide - System configuration
- Operations Guide - Day-to-day operations