This is the fourteenth article in our series on running the ClickHouse® database on Kubernetes with the Altinity® Kubernetes Operator. We have built up every piece on its own. Now we assemble them into one production-grade cluster, so you can see how the parts fit and use it as a template for your own deployment.
This article is a capstone. Each block below was explained in its own earlier article, linked as we go, so treat this as the integrated reference.
What we are building
A fault-tolerant analytical cluster: two shards for capacity, two replicas per shard for safety, a three-node Keeper ensemble for coordination, durable and separated storage, credentials and certificates in Secrets, TLS everywhere, deliberate pod placement, sensible resources, and monitoring hooks. This is meant for a real multi-node cluster, not a single-node laptop, because placement rules need several nodes.
Prerequisites
Before applying the manifests, you need a few things in place: the operator installed (the introduction article), a StorageClass backed by fast disks, nodes spread across availability zones, a Kubernetes Secret holding your user credentials, and a TLS Secret holding your certificate and key. Create the namespace and both Secrets like this:
kubectl create namespace prod
kubectl create secret generic ch-credentials -n prod \
  --from-literal=analyst_hash='<sha256-of-your-password>'
kubectl create secret tls clickhouse-tls -n prod \
  --cert=server.crt --key=server.key
Step 1: The Keeper ensemble
Replication needs Keeper, and production needs three nodes so it survives a failure, as covered in the Keeper article. Save this as keeper-prod.yaml:
apiVersion: "clickhouse-keeper.altinity.com/v1"
kind: "ClickHouseKeeperInstallation"
metadata:
  name: keeper
spec:
  configuration:
    clusters:
      - name: keeper
        layout:
          replicasCount: 3
    settings:
      keeper_server/tcp_port: "2181"
  defaults:
    templates:
      podTemplate: keeper-pod
      volumeClaimTemplate: keeper-data
  templates:
    podTemplates:
      - name: keeper-pod
        podDistribution:
          - type: ClickHouseAntiAffinity
            scope: ClickHouseInstallation
        spec:
          containers:
            - name: clickhouse-keeper
              image: clickhouse/clickhouse-keeper:26.3
              resources:
                requests:
                  cpu: "1"
                  memory: 1Gi
                limits:
                  cpu: "2"
                  memory: 2Gi
    volumeClaimTemplates:
      - name: keeper-data
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 10Gi
Three Keeper pods, kept on separate nodes by anti-affinity, each with its own durable volume and bounded resources.
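The reason three nodes is the practical minimum comes down to majority arithmetic: Keeper's consensus needs more than half of the ensemble to be reachable. The following is a small illustrative sketch of that calculation. The function names are hypothetical helpers, not part of any ClickHouse or operator tooling.

```shell
#!/bin/sh
# Majority quorum for an ensemble of N voting nodes: floor(N/2) + 1.
keeper_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Nodes that can fail while a majority is still reachable: N - quorum.
keeper_tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

# Compare a few ensemble sizes side by side.
for n in 1 2 3 5; do
  echo "nodes=$n quorum=$(keeper_quorum "$n") tolerated_failures=$(keeper_tolerated_failures "$n")"
done
```

Two nodes tolerate no more failures than one, which is why ensembles are normally sized at three or five.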
Step 2: The ClickHouse cluster
This is the full CHI, combining storage, users, TLS, placement, and resources. Save it as clickhouse-prod.yaml:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "ch"
spec:
  defaults:
    templates:
      dataVolumeClaimTemplate: data-volume
      logVolumeClaimTemplate: log-volume
      podTemplate: clickhouse-pod
  configuration:
    # Coordination: reference the Keeper ensemble by name.
    zookeeper:
      keeper:
        name: keeper
    # Security: named user from a Secret, default user removed.
    users:
      analyst/password_sha256_hex:
        valueFrom:
          secretKeyRef:
            name: ch-credentials
            key: analyst_hash
      analyst/networks/ip:
        - 10.0.0.0/8
    files:
      users.d/remove_default.xml: |
        <clickhouse>
          <users>
            <default remove="1"/>
          </users>
        </clickhouse>
    # Topology: 2 shards x 2 replicas, TLS enabled.
    clusters:
      - name: "main"
        secure: "yes"
        security:
          clickhouse:
            tls:
              verify: Strict
              minVersion: "1.3"
        layout:
          shardsCount: 2
          replicasCount: 2
  templates:
    podTemplates:
      - name: clickhouse-pod
        metadata:
          annotations:
            prometheus.io/scrape: "true"
        podDistribution:
          - type: ClickHouseAntiAffinity
            scope: ClickHouseInstallation
        spec:
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:26.3
              resources:
                requests:
                  cpu: "2"
                  memory: 8Gi
                limits:
                  cpu: "4"
                  memory: 16Gi
    volumeClaimTemplates:
      - name: data-volume
        spec:
          storageClassName: fast-ssd # your production StorageClass
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 200Gi
      - name: log-volume
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 10Gi
Reading the manifest
Every block here maps to an article you have already read. The zookeeper.keeper reference wires the cluster to the three-node ensemble. The users block pulls the analyst's hashed password from a Secret and restricts the networks it can connect from, while the files block removes the empty-password default user; both are practices from the security article. The cluster's secure flag and tls policy encrypt traffic. The layout defines two shards with two replicas each, and the pod template's anti-affinity keeps the copies on separate nodes, as in the scaling article. Resource requests and limits size each pod, as in the resources article, and the data volume uses a fast StorageClass with logs on their own volume, as in the storage article. The Prometheus annotation ties into the monitoring you set up. Nothing here is new; it is the sum of the series.
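The analyst_hash value that the users block reads from the ch-credentials Secret is the hex SHA-256 digest of the password. One way to produce it is sketched below, assuming sha256sum from GNU coreutils is available. The helper name sha256_hex and the sample password are illustrative only.

```shell
#!/bin/sh
# Print the lowercase hex SHA-256 digest of the first argument.
# printf '%s' is used instead of echo so that no trailing newline is hashed;
# a stray newline would yield a different digest and the login would fail.
sha256_hex() {
  printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

# Usage (not run here; 'my-password' is a placeholder):
#   kubectl create secret generic ch-credentials -n prod \
#     --from-literal=analyst_hash="$(sha256_hex 'my-password')"
```

The digest is always 64 hexadecimal characters, which is a quick sanity check before storing it in the Secret.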
Step 3: Deploy and verify
Apply the Keeper manifest first, wait for it to become ready, then apply the cluster:
kubectl apply -n prod -f keeper-prod.yaml
kubectl get chk -n prod -w
kubectl apply -n prod -f clickhouse-prod.yaml
kubectl get chi -n prod -w
When the CHI reports Completed, run through a quick production checklist. Confirm all pods are running and spread across nodes:
kubectl get pods -n prod -o wide
You should see three Keeper pods and four ClickHouse pods on different nodes. Then create a replicated and distributed table as in the replication article, insert data, and confirm it appears across shards and replicas. Finally, check that monitoring is scraping the cluster and that your Grafana dashboard shows the new nodes.
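Rather than reading the node column by eye, the spread can be checked mechanically. The sketch below is one possible approach: a hypothetical helper, shared_nodes, reads pod and node name pairs and prints any node that hosts more than one of the listed pods.

```shell
#!/bin/sh
# Read "POD NODE" pairs on stdin and print every node that hosts more than
# one of the listed pods, followed by its pod count. Prints nothing when
# each pod has a node to itself.
shared_nodes() {
  awk 'NF >= 2 { count[$2]++ }
       END { for (n in count) if (count[n] > 1) print n, count[n] }'
}

# Usage against the live cluster (not run here):
#   kubectl get pods -n prod --no-headers \
#     -o custom-columns=POD:.metadata.name,NODE:.spec.nodeName | shared_nodes
```

Empty output means no node is shared. The anti-affinity rules in the two manifests are declared per installation, so if a Keeper pod and a ClickHouse pod land on the same node this check will report it; filter the pod list per installation if that arrangement is acceptable in your cluster.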
Production readiness checklist
Before calling a cluster production-ready, verify each of these, every one of which this manifest addresses: coordination on a three-node Keeper, at least two replicas per shard, durable storage on a fast class with a safe reclaim policy, the default user removed and real users sourced from Secrets, TLS enabled, anti-affinity spreading copies across nodes (and ideally zones), resource requests and limits set, a PodDisruptionBudget in place (the operator adds it), and monitoring and alerts wired up. When all of those are true, you have something you can run with confidence.
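Two of these items can be spot-checked from the command line. The sketch below assumes that "safe reclaim policy" is read as a StorageClass whose reclaimPolicy is Retain; that reading, and the helper names, are assumptions made for illustration and not something the manifests above enforce.

```shell
#!/bin/sh
# Assumption: "safe" means Retain, so the underlying volume survives
# deletion of its PersistentVolumeClaim.
is_safe_reclaim_policy() {
  [ "$1" = "Retain" ]
}

# Look up a StorageClass's reclaimPolicy and report whether it is Retain.
# Returns 0 if Retain, 1 otherwise, 2 if the lookup itself fails.
check_storage_class() {
  policy=$(kubectl get storageclass "$1" -o jsonpath='{.reclaimPolicy}') || return 2
  if is_safe_reclaim_policy "$policy"; then
    echo "$1: reclaimPolicy=$policy (ok)"
  else
    echo "$1: reclaimPolicy=$policy (volumes are not retained)"
    return 1
  fi
}

# Usage (not run here):
#   check_storage_class fast-ssd
#   kubectl get pdb -n prod    # lists the PodDisruptionBudgets in the namespace
```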
Clean up
kubectl delete namespace prod
What is next
You have a complete production cluster. The remaining articles cover advanced and operational topics that build on this foundation. In the next article we lower storage cost with tiered storage, keeping hot data on fast local disks and moving cold data to object storage like Amazon S3.



